Could OpenAI be violating YouTube's terms of service?


Mira Murati, CTO of OpenAI (left) and Neal Mohan, CEO of YouTube (right).
Patrick T. Fallon/AFP; Mandel Ngan/AFP via Getty Images

  • Does OpenAI train its video generator Sora on YouTube content?
  • If so, it would be a violation of YouTube's terms of service, YouTube's CEO said.
  • But OpenAI's chief technology officer couldn't answer whether Sora scrapes YouTube content.

OpenAI should not use YouTube videos to train its artificial intelligence tools, YouTube's CEO says.

But is it?

Mira Murati, OpenAI's chief technology officer, said she didn't know.

In an interview with The Wall Street Journal last month, Murati was asked whether Sora, OpenAI's text-to-video generator, was trained on video content from YouTube.

“I’m actually not sure about that,” Murati told the Journal.

YouTube CEO Neal Mohan told Bloomberg on Thursday that he also didn't know whether OpenAI was using YouTube content to train its video generator.

If Sora actually uses YouTube content, it would be a “clear violation” of the platform’s terms of service, Mohan said.

“From a creator’s perspective, when a creator uploads their hard work to our platform, there are certain expectations,” Mohan told Bloomberg's Emily Chang. “One of those expectations is that the terms of service will be followed. Downloading things like transcripts or video excerpts is not permitted, and that is a clear violation of our Terms of Service. These are the rules of the road in terms of content on our platform.”

Mohan added that Google, which owns YouTube, uses some YouTube videos to train its own AI model, Gemini, but only if individual creators on the platform have agreed to this in their contracts.

In response to Business Insider's request for comment, a YouTube spokesperson confirmed that the company's terms “prohibit the unauthorized scraping or downloading of YouTube content.”

OpenAI did not respond to Business Insider's request for comment.

The debate over what types of content tech companies use to train their AI models has picked up speed as the artificial intelligence industry explodes. Many artists and other creatives are leading the charge, arguing that their copyrighted works should not be used without their permission.

OpenAI is no stranger to complaints about the data collection practices of its AI tools. Those who have sued the company for copyright infringement include comedian and author Sarah Silverman, whose lawsuit was partially dismissed; “Game of Thrones” author George R.R. Martin; and The New York Times.

In February, OpenAI asked the judge overseeing the Times' lawsuit to dismiss all or part of four of the six counts the media outlet brought against the company, alleging that the Times paid someone to hack OpenAI's products.

And last summer, more than 8,000 authors signed an open letter to AI leaders, including Sam Altman of OpenAI, demanding compensation for the use of their works to train AI tools without permission.

Axel Springer, the parent company of Business Insider, has signed a global deal that will allow OpenAI to train its models on the reporting of its media brands.