By Courtney Rozen and Jody Godoy
WASHINGTON, May 5 (Reuters) – The Trump administration on Tuesday announced it had expanded a program that gives U.S. government scientists access to unreleased artificial intelligence models for risk assessments, adding Google DeepMind, xAI and Microsoft.
ChatGPT maker OpenAI and Claude owner Anthropic had already been voluntarily working with the U.S. Center for AI Standards and Innovation, the team of U.S. government scientists, to test unreleased models for vulnerabilities, according to the companies.
Here is what we know about the reviews:
WHAT RISKS ARE THE U.S. FOCUSED ON?
U.S. government scientists are focused on “demonstrable risks,” such as the risk that advanced models can be used to launch cyberattacks on American infrastructure, according to the CAISI website. They want to limit opportunities for U.S. adversaries to use AI to develop chemical or biological weapons, or corrupt the data used to train American AI models.
WHAT WILL COMPANIES HAND OVER?
OpenAI is working with the group to test GPT-5.5-Cyber, said Chris Lehane, head of global affairs at OpenAI, in a LinkedIn post on Tuesday. GPT-5.5-Cyber is a variant of its latest model designed for defensive cybersecurity work.
Microsoft will work with the scientists to build shared datasets and workflows to assess advanced AI models, the company said in a statement. Microsoft did not specify which models.
Anthropic gave CAISI access to both publicly available and unreleased models, allowing researchers to probe for vulnerabilities in a process known as “red-teaming,” or simulating the behavior of malicious actors, the company said in September. The company also gave CAISI detailed documentation on known vulnerabilities and safety mechanisms.
Google DeepMind, Alphabet’s AI research arm, will provide access to its “proprietary models” and data, a spokesperson said.
xAI did not immediately respond to a request for comment from Reuters.
WHAT HAS THE U.S. FOUND SO FAR?
Anthropic’s work with CAISI revealed that tricks such as falsely claiming that human review had occurred, or substituting characters in prompts, could circumvent safety mechanisms, the company said, adding that it had since patched the vulnerabilities.
OpenAI said in September that it worked with CAISI to probe vulnerabilities in its ChatGPT Agent that could have allowed sophisticated actors to bypass OpenAI’s cybersecurity measures. The exploit would have allowed the attacker to “remotely control the computer systems the agent could access for that session and successfully impersonate the user for other websites they’d logged into,” the company said.
The companies, along with Meta, Amazon and Inflection AI, agreed in 2023 to allow independent experts to check their models for biosecurity and cybersecurity risks.
The U.S. government scientists, organized under a different name during former U.S. President Joe Biden’s tenure, also released voluntary guidelines to protect against the risk of AI models leaking private health information or producing incorrect answers.
The scientists are now working on guidelines for critical infrastructure providers, such as the communications and emergency services sectors, to test their AI systems, according to their website.
(Reporting by Courtney Rozen; Editing by Stephen Coates)
