planet money

The alleged theft at the heart of ChatGPT

When best-selling thriller writer Douglas Preston began playing around with OpenAI’s new chatbot, ChatGPT, he was, at first, impressed. But then he realized how much in-depth knowledge GPT had of the books he had written. When prompted, it supplied detailed plot summaries and descriptions of even minor characters. He was convinced it could only pull that off if it had read his books.

Large language models, the kind of artificial intelligence underlying programs like ChatGPT, do not come into the world fully formed. They first have to be trained on incredibly large amounts of text. Douglas Preston, and 16 other authors, including George R.R. Martin, Jodi Piccoult, and Jonathan Franzen, were convinced that their novels had been used to train GPT without their permission. So, in September, they sued OpenAI for copyright infringement.

This sort of thing seems to be happening a lot lately–one giant tech company or another “moves fast and breaks things,” exploring the edges of what might or might not be allowed without first asking permission. On today’s show, we try to make sense of what OpenAI allegedly did by training its AI on massive amounts of copyrighted material. Was that good? Was it bad? Was it legal?

Help support Planet Money and get bonus episodes by subscribing to Planet Money+ in Apple Podcasts or at plus.npr.org/planetmoney.