When AI Gets Uncomfortable

Google, Microsoft, and Elon Musk's xAI have agreed to hand the US government early access to their latest AI models before public release. The body doing the reviewing is CAISI, the Center for AI Standards and Innovation, housed within the National Institute of Standards and Technology at the Commerce Department. OpenAI and Anthropic had already signed similar agreements. The party line from Washington is that this is voluntary cooperation. It isn't. Not really.

Here's the part that deserves a moment of silence: Donald Trump, shortly after taking office, revoked Biden's AI executive order — the one that required developers to share safety test results with the government. Too much red tape. Stifling innovation. Government overreach. You know the script. That was then. Now, according to the New York Times, the same administration is drafting a new executive order that would mandate pre-release reviews for the most powerful AI models. The wheel has come full circle, and nobody seems particularly embarrassed about it.

The Inconvenience of Consequences

What changed? Anthropic's upcoming model, internally referred to as "Mythos," apparently spooked some people in the cybersecurity community. The concern isn't theoretical: a model with sufficiently advanced coding capabilities could meaningfully lower the barrier to attacking critical digital infrastructure. When the threat stops being abstract — when someone in a briefing room has to put a name and a capability level on it — ideological purity tends to soften fast.

There's a telling detail buried in the reporting. For these government evaluations, developers are permitted — encouraged, even — to submit versions of their models with safety guardrails deliberately reduced or removed. The reasoning is straightforward: you want to see what the system is actually capable of, not what it's been trained to admit to. That logic is sound. But it's also a remarkable thing to normalize: the official process for understanding AI safety involves building a less safe version of the AI and handing it to a government agency.

Regulation was apparently fine all along — just not when it was someone else's idea.

40 Evaluations and Counting

CAISI has already completed more than 40 model assessments, including some on systems that haven't been released publicly. Whether any of those reviews led to meaningful changes in what got shipped is unclear. Whether this new framework will have real teeth, or whether it's a face-saving exercise that lets the administration say it's doing something while the labs continue more or less as before — that's also unclear.

What is clear is the shape of the logic at work. AI is a national security asset, a competitive weapon, an economic engine — until it becomes a liability. Then it needs oversight. The framing shifts depending on what's politically useful at the moment, and the companies involved are savvy enough to play along. Calling it "voluntary cooperation" costs nothing. It keeps the regulators friendly and the headlines manageable.

The Wrong Hands on the Wheel

Here's the thing: AI regulation is necessary. Not as a reluctant concession, but as a genuine requirement. Models capable of compromising critical infrastructure, generating disinformation at scale, or making consequential decisions about people's lives cannot be left entirely to the discretion of the companies building them. The question was never whether to regulate. It was always who should.

And that's where the current approach falls short — not in intent, but in architecture. When guidelines are set by an administration, they reflect that administration's priorities. They can be revoked on day one of the next presidency. They bend toward national competitiveness, electoral calculus, and whatever spooked a senator this week. That's not a framework for governing a technology with long-term, civilization-scale implications. It's crisis management with a letterhead.

What's actually needed is something more like what science has occasionally managed to build for itself: independent expert bodies, insulated from short-term political pressure, tasked with developing ethical guidelines and safety standards over time. Not elected, not appointed on partisan lines, not answerable to quarterly results. The model isn't perfect — the IPCC has its critics, central banks have their blind spots — but the principle is sound.

Crucially, "expert" here cannot mean only AI researchers and engineers. The questions that matter most about these systems are not purely technical ones. They are questions about values: what decisions should be delegated to machines, whose interests are weighted and how, what kinds of harm are acceptable trade-offs for what kinds of benefit. Those questions belong to ethicists, sociologists, legal scholars, and other social scientists as much as they belong to anyone who can read a model card. A governance framework staffed exclusively by technologists will optimize for what technologists know how to measure. That is precisely the problem, not the solution.

The issue isn't that governments want to look at these models. It's that the looking is being done by people whose primary incentive is not to get it right, but to look like they did.

Meanwhile, Across the Atlantic

Europe tried a different approach. The EU AI Act, which entered into force in August 2024, is a risk-based framework built on the idea that rules should be stable, comprehensive, and independent of whichever government happens to be in power. Rather than reactive executive orders, it establishes tiered obligations: the more consequential the system, the stricter the requirements. The bans on prohibited practices took effect in February 2025. Obligations for general-purpose AI models followed in August 2025. Full applicability was supposed to arrive in August 2026.

Germany moved quickly to implement it. In February 2026, the federal cabinet approved the KI-MIG — the national law putting the AI Act into German practice. The Bundesnetzagentur, Germany's Federal Network Agency, is designated as the central supervisory authority, supported by an independent internal chamber for AI market surveillance. It's not a perfect model, but structurally it's closer to the right idea: a standing, technically competent authority with a clear mandate, not a political appointee serving a four-year term.

And then the familiar forces showed up. Just last week — on May 7, 2026 — the EU Council and Parliament reached a political agreement on what's being called the "AI Omnibus," a package of amendments that delays key obligations for high-risk AI systems until December 2027 or even August 2028. The justification is streamlining and innovation. Around 46 major German companies had been lobbying for a two-year postponement. At Hannover Messe in April, Chancellor Merz echoed their concerns, calling for more regulatory flexibility for industrial AI applications. The language is different from Washington's, but the logic is the same: the rules are fine in principle, just not quite yet, and not quite this strictly.

Europe built a better architecture. It's discovering that architecture alone doesn't protect against political gravity.

CAISI reviewing models before release is better than nothing. The EU AI Act is better than executive orders. The Bundesnetzagentur is better than a White House memo. But "better" isn't the same as "sufficient." The underlying question — who gets to decide what a sufficiently dangerous AI looks like, and what happens if they decide wrong — remains unresolved on both sides of the Atlantic. Independent expert bodies with genuine authority, long mandates, and insulation from the lobbying cycle aren't a luxury. They're the only version of this that has any chance of outlasting the next election — or the next crisis.