Why blocking China’s DeepSeek from using US AI may be difficult

By Stephen Nellis, Krystal Hu, Jeffrey Dastin, Anna Tong and Katie Paul

(Reuters) – Top White House advisers this week expressed alarm that China’s DeepSeek may have benefited from a method called “distillation” that allegedly piggybacks off the advances of U.S. rivals.

The technique, which involves one AI system learning from another AI system, may be difficult to stop, according to executive and investor sources in Silicon Valley.

DeepSeek this month rocked the technology sector with a new AI model that appeared to rival the capabilities of U.S. giants like OpenAI, but at much lower cost. And the China-based company gave away the code for free.

Some technologists believe that DeepSeek’s model may have learned from U.S. models to make some of its gains. The distillation technique involves having an older, more established and powerful AI model evaluate the quality of the answers coming out of a newer model, effectively transferring the older model’s learnings.

That means the newer model can reap the benefits of the massive investments of time and computing power that went into building the initial model without the associated costs.
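As a rough illustration of the feedback loop described above, the toy Python sketch below has an older model rate draft answers from a newer model and keeps only the best-rated draft for each prompt as training material. It is a simplified sketch under stated assumptions, not a description of how DeepSeek or any other company works: the names `select_training_pairs` and `toy_teacher_score`, the `min_score` cutoff and the sample prompts are all hypothetical, and the "older model" here is just a lookup table standing in for a real AI system.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: a real system would call actual AI models here.
Candidates = Dict[str, List[str]]     # prompt -> draft answers from the newer model
Scorer = Callable[[str, str], float]  # (prompt, answer) -> score from the older model


def select_training_pairs(candidates: Candidates,
                          teacher_score: Scorer,
                          min_score: float) -> List[Tuple[str, str]]:
    """For each prompt, keep the draft the older model rates highest,
    provided that rating reaches min_score."""
    kept: List[Tuple[str, str]] = []
    for prompt, answers in candidates.items():
        if not answers:
            continue
        best = max(answers, key=lambda a: teacher_score(prompt, a))
        if teacher_score(prompt, best) >= min_score:
            kept.append((prompt, best))
    return kept


def toy_teacher_score(prompt: str, answer: str) -> float:
    """Toy scorer (assumption): pretends the older model knows these facts."""
    known = {"capital of France?": "Paris", "2 + 2?": "4"}
    return 1.0 if known.get(prompt) == answer else 0.0


if __name__ == "__main__":
    drafts = {
        "capital of France?": ["Lyon", "Paris"],
        "2 + 2?": ["5", "22"],  # no good draft, so nothing is kept
    }
    print(select_training_pairs(drafts, toy_teacher_score, min_score=0.5))
```

In this toy run only the "Paris" draft survives. The newer model would then be trained on the surviving pairs, which is how the older model's judgment gets passed along without repeating the original training effort.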

This form of distillation, which is different from how most academic researchers previously used the word, is a common technique used in the AI field. However, it is a violation of the terms of service of some prominent models put out by U.S. tech companies in recent years, including those from OpenAI.

The ChatGPT maker said that it knows of groups in China actively working to replicate U.S. AI models via distillation and is reviewing whether or not DeepSeek may have distilled its models inappropriately, a spokesperson told Reuters.

Naveen Rao, vice president of AI at San Francisco-based Databricks, which does not use the technique when terms of service prohibit it, said that learning from rivals is “par for the course” in the AI industry. Rao likened this to how automakers will buy and then examine one another’s engines.

“To be completely fair, this happens in every scenario. Competition is a real thing, and when it’s extractable information, you’re going to extract it and try to get a win,” Rao said. “We all try to be good citizens, but we’re all competing at the same time.”

Howard Lutnick, President Donald Trump’s nominee for Secretary of Commerce, who would oversee future export controls on AI technology, told the U.S. Senate during a confirmation hearing on Wednesday that it appeared DeepSeek had misappropriated U.S. AI technology and vowed to impose restrictions.

“I do not believe that DeepSeek was done all above board. That’s nonsense,” Lutnick said. “I’m going to be rigorous in our pursuit of restrictions and enforcing those restrictions to keep us in the lead.”

David Sacks, the White House’s AI and crypto czar, also raised concerns about DeepSeek distillation in a Fox News interview on Tuesday.

DeepSeek did not immediately answer a request for comment on the allegations.

OpenAI added it will work with the U.S. government to protect U.S. technology, though it did not detail how.

“As the leading builder of AI, we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models,” the company said in a statement.

The most recent round of concern in Washington about China’s use of U.S. products to advance its tech sector is similar to previous concerns about the semiconductor industry, where the U.S. has imposed restrictions on what chips and manufacturing tools can be shipped to China and is examining restricting work on certain open technologies.

NEEDLE IN A HAYSTACK

Technologists said blocking distillation may be harder than it looks.

One of DeepSeek’s innovations was showing that a relatively small number of data samples – fewer than one million – from a larger, more capable model could drastically improve the capabilities of a smaller model.

When popular products like ChatGPT have hundreds of millions of users, such small amounts of traffic could be hard to detect – and some models, such as Meta Platforms’ Llama and French startup Mistral’s offerings, can be downloaded freely and used in private data centers, meaning violations of their terms of service may be hard to spot.

“It’s impossible to stop model distillation when you have open-source models like Mistral and Llama. They are available to everybody. They can also find OpenAI’s model somewhere through customers,” said Umesh Padval, managing director at Thomvest Ventures.

The license for Meta’s Llama model requires those using it for distillation to disclose that practice, a Meta spokesperson told Reuters.

DeepSeek in a paper did disclose using Llama for some distilled versions of the models it released this month, but did not address whether it had ever used Meta’s model earlier in the process. The Meta spokesperson declined to say whether the company believed DeepSeek had violated its terms of service.

One source familiar with the thinking at a major AI lab said the only way to stop firms like DeepSeek from distilling U.S. models would be stringent know-your-customer requirements similar to how financial companies identify with whom they do business.

But nothing like that is set in stone, the source said. The administration of former President Joe Biden had put forth such requirements, which President Donald Trump may not embrace.

The White House did not immediately respond to a request for comment.

Jonathan Ross, chief executive of Groq, an AI computing company that hosts AI models in its cloud, has blocked all Chinese IP addresses from accessing that cloud in an effort to keep Chinese firms from allegedly piggybacking off the AI models it hosts.

“That’s not sufficient, because people can find ways to get around it,” Ross said. “We have ideas that would allow us to prevent that, and it’s going to be a cat and mouse game … I don’t know what the solution is. If anyone comes up with it, let us know, and we’ll implement it.”

(Reporting by Stephen Nellis, Anna Tong and Jeffrey Dastin in San Francisco; Krystal Hu and Katie Paul in New York and David Shepardson in Washington, D.C.; editing by Kenneth Li and Nick Zieminski)