
NYT v OpenAI: OpenAI In Trouble?

OpenAI, known for its groundbreaking AI technologies, is facing a series of lawsuits from figures like George R.R. Martin and Sarah Silverman, primarily over copyright issues. The latest to join this legal battle is The New York Times (NYT), targeting both Microsoft’s Copilot and OpenAI’s ChatGPT.

What is ChatGPT?

But there’s a common misconception here. ChatGPT isn’t exactly GPT-4, nor is Microsoft’s Copilot. Both are applications built on top of GPT-4, a Large Language Model (LLM). An LLM works by predicting the most likely next token (a short chunk of characters) given the text so far, and repeating that step to extend the input. For example, if you type “Grandma baked me some chocolate chip”, an LLM like GPT-4 might suggest “ cookies”, completing the thought. Notice the space in “ cookies”. That is no accident: a space is the most likely character to follow “… chip” in our example prompt, so the model supplies it.
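To make the idea of next-token prediction concrete, here is a minimal sketch in Python. It is not how GPT-4 works internally: it is a toy model that only counts which token follows which in a tiny, made-up corpus, and the names `CORPUS`, `tokenize`, `train`, and `predict_next` are invented for this illustration. It does show why the suggested completion starts with a space.

```python
import re
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only).
# A real LLM is trained on vastly more text with a neural network.
CORPUS = [
    "grandma baked me some chocolate chip cookies",
    "we shared the chocolate chip cookies after dinner",
    "he ordered a chocolate chip muffin",
]

def tokenize(text):
    """Split text into word tokens, keeping the leading space on each token."""
    return re.findall(r"\s?\S+", text.lower())

def train(corpus):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = tokenize(sentence)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prompt):
    """Return the token seen most often after the prompt's last token."""
    tokens = tokenize(prompt)
    if not tokens or tokens[-1] not in counts:
        return None  # the toy model has never seen this token
    return counts[tokens[-1]].most_common(1)[0][0]

MODEL = train(CORPUS)
completion = predict_next(MODEL, "Grandma baked me some chocolate chip")
print(repr(completion))
```

The predicted token carries its own leading space because the space is stored as part of the token. All this toy does is continue the text it was given.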

In contrast, AI chatbots like ChatGPT offer a more interactive experience, engaging in conversations and responding to a wide range of queries. This distinction is crucial in understanding the basis of the NYT’s lawsuit. They claim that these AI tools can replicate news stories, citing instances where GPT-4 produced outputs similar to actual NYT articles. However, this comparison is misleading, since the NYT had to jump through many hoops to obtain the responses they got. The following screencaps demonstrate the difference between an LLM and a chatbot.

Sample prompt and response from the open-source LLM BLOOM, which boasts a whopping 176 billion parameters. The prompt is in white, and the response is in blue.
Output from ChatGPT. Notice how it is engaging in conversation and not merely finishing your sentence.

The difference is clear. If asked for a specific quote, ChatGPT will give it to you, but it will not give you an entire article. It simply won’t. The best evidence the NYT can present is a case where ChatGPT reproduced an NYT story one paragraph at a time, which is what the NYT instructed it to do. This method of obtaining the news is neither efficient nor enjoyable.

The NYT’s case also hinges on specific examples where they prompted the LLM (not the chatbot) in a specific way. The nature of these prompts, which are crucial to the model’s responses, hasn’t been fully disclosed. And it is doubtful that the responses can be replicated, since the outputs of both ChatGPT and GPT-4 generally vary from run to run, even when presented with the same prompt.
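Run-to-run variation of this kind usually comes from sampling: instead of always taking the single most likely next token, the model draws one at random in proportion to its probability. The sketch below illustrates that mechanism under stated assumptions. The scores in `LOGITS` are invented, `softmax` and `sample_next` are hypothetical helper names, and the decoding settings ChatGPT actually uses are not known from this article.

```python
import math
import random

# Invented scores for three candidate next tokens (illustration only).
LOGITS = {" cookies": 4.0, " muffins": 3.0, " pancakes": 2.0}

def softmax(logits, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def sample_next(logits, temperature, rng):
    """Pick the next token: greedy at temperature 0, random draw otherwise."""
    if temperature == 0:
        return max(logits, key=logits.get)
    probs = softmax(logits, temperature)
    tokens = list(probs)
    weights = [probs[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random()  # unseeded, so each run of the script can differ
greedy_runs = {sample_next(LOGITS, 0, rng) for _ in range(20)}
sampled_runs = [sample_next(LOGITS, 1.0, rng) for _ in range(20)]

print("temperature 0:  ", greedy_runs)   # a single token every time
print("temperature 1.0:", sampled_runs)  # a mix that changes between runs
```

With the temperature set to 0 the same prompt always yields the same token. With any positive temperature, the same prompt can yield a different continuation each time, which is why reproducing a specific output exactly is hard.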

Follow the money

So why is the NYT pursuing this lawsuit? There are already many ways to bypass the NYT’s paywall, and the NYT does not seem too keen on stopping them. While any guess at their motives is speculative, it would not be unreasonable to suspect that, because AI is hot, the NYT just wants in. Even if they have to sue their way in.