AI prompt injection is no longer limited to screens or chat windows. New research shows that robots can be quietly redirected by something as ordinary as a printed sign placed in their environment. The twist is that no hacking is required, and the robot’s software remains untouched.
Researchers recently demonstrated that physical text, such as posters, labels, or warning signs, can interfere with how autonomous systems interpret their surroundings. A message that looks harmless to a human can end up overriding a robot’s original task once it passes through a vision language model.
Unlike traditional cyberattacks, this approach treats the real world as an input channel. The attacker does not need access to code, sensors, or internal controls. All it takes is placing readable text where a robot’s camera is likely to notice it.
In controlled simulations, the technique proved surprisingly effective. Autonomous driving systems were misdirected in more than 80 percent of test cases, while drone emergency landing scenarios failed at a rate of over 60 percent. Physical experiments told a similar story. In trials involving a small robotic vehicle, printed prompts interfered with navigation in nearly nine out of ten attempts, even when lighting and viewing angles changed.
When written words turn into actions
The method behind the research is called CHAI, a command hijacking attack that targets how vision language models plan actions. Before a robot moves, it typically generates an internal instruction describing what it believes should happen next. If that intermediate step is influenced by misleading text, the system can confidently execute the wrong behavior.
This makes the issue especially concerning. There is no malware involved and no system breach. The robot follows its own rules, but those rules are shaped by how it interprets what it sees.
The threat model assumes a very limited attacker. The person placing the sign has no insider access and no technical privileges. Their only capability is placing readable text within the camera’s field of view.
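To see why the planning step is the weak point, consider a minimal sketch of the kind of loop such a robot might run. The names here (Observation, query_vlm, plan_next_action) are illustrative rather than taken from the paper; the point is simply that everything the camera reads, signs included, lands in the same prompt as the mission.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    image_description: str  # what the camera reports, including any legible text
    mission: str            # the operator's original task


def query_vlm(prompt: str) -> str:
    """Placeholder for a call to a vision language model.

    In a real stack this would send the camera frame and the prompt to a
    model and return its proposed next step as free text.
    """
    raise NotImplementedError("wire this to an actual model")


def plan_next_action(obs: Observation) -> str:
    # The mission and the raw scene, text included, share one prompt, so a
    # printed sign that reads like an instruction competes directly with the
    # operator's task for the model's attention.
    prompt = (
        f"Mission: {obs.mission}\n"
        f"Camera observation: {obs.image_description}\n"
        "State the single next action the robot should take."
    )
    return query_vlm(prompt)
```

Nothing in this loop distinguishes between what the operator asked for and what happens to be written on a wall, which is exactly the gap the attack exploits.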
Designed to work across environments
CHAI is not limited to the wording of a message. The researchers also optimized visual details like font size, color, contrast, and placement. These factors influence how easily a vision language model can read and prioritize the text, even if nearby humans barely notice it.
The study also found that certain prompts remain effective across different scenes and models. Some so-called universal prompts succeeded on images the system had never seen before, with average success rates above 50 percent and peaks exceeding 70 percent in one GPT-based configuration. Multilingual prompts added another layer of risk, as messages written in Chinese, Spanish, or mixed languages were still effective while being easier for humans to overlook.
More details about the technical findings are available in the research paper on arXiv and in a summary published by TechXplore.
Rethinking safety for autonomous systems
The researchers outline several defensive strategies. One option is detection, where systems actively scan for suspicious text in images or within their own planning outputs. Another approach focuses on alignment, training models to be more skeptical of environmental writing that looks like an instruction. A third path involves deeper robustness research aimed at reducing how easily text can influence control decisions in the first place.
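To give a rough sense of the detection idea, a system could screen any text recovered from the scene for instruction-like phrasing before it ever reaches the planner. The patterns below are illustrative stand-ins; a deployed detector would more likely be learned than hand-written.

```python
import re

# Illustrative patterns only: text that reads like a command to the robot
# rather than ordinary signage.
INSTRUCTION_PATTERNS = [
    r"\bignore (all |any )?(previous|prior) (instructions|commands)\b",
    r"\b(new|updated) (mission|task|instructions?)\b",
    r"\b(turn|stop|land|proceed|reroute)\b.*\b(now|immediately|here)\b",
]


def looks_like_injected_command(scene_text: str) -> bool:
    """Flag recovered scene text that resembles an instruction."""
    lowered = scene_text.lower()
    return any(re.search(pattern, lowered) for pattern in INSTRUCTION_PATTERNS)
```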
A practical near-term safeguard is to treat all perceived text as untrusted input. Under this model, written information would need to pass explicit mission and safety checks before it could influence navigation or movement.
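In practice, that policy could look something like the sketch below, where text pulled from the camera feed is quarantined by default and only reaches the planner if it clears an explicit check. The function names and the reject-everything default are assumptions for illustration, not drawn from the paper.

```python
def text_is_trusted(scene_text: str, mission: str) -> bool:
    """Placeholder mission and safety check.

    A real gate might verify the text against expected signage for the route,
    cross-check it with map data, or require operator confirmation. Rejecting
    everything by default is the conservative starting point.
    """
    return False


def build_planner_prompt(mission: str, scene_summary: str, scene_text: str) -> str:
    # Scene text is quarantined by default: unless it clears the checks, the
    # planner only ever sees the mission and a text-free scene description.
    if scene_text and text_is_trusted(scene_text, mission):
        return f"Mission: {mission}\nScene: {scene_summary}\nSign text: {scene_text}"
    return f"Mission: {mission}\nScene: {scene_summary}"
```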
As robots increasingly rely on vision language models to understand the world, the assumption that signs always tell the truth becomes a liability. If a robot can read, it also needs to question what it reads. The research is scheduled for presentation at SaTML 2026, where these risks and proposed defenses are expected to receive wider scrutiny.