Hallucinations in AI Assisted Police Reports

Robert Vargas, Gabe Rojas, David Hackett & Dan Sutton
Illustrations by Vero Martinez

Across the United States, Police Departments are using AI to help write reports. These AI-assisted police reports are generated in numerous ways from simply prompting ChatGPT as one ICE Agent did in Chicago¹ to more complex tools that use body worn camera video footage, or transcripts of them, to generate an incident report draft. Although most attention on these tools centers on whether or not they actually help save officer time with filing paperwork, they also raise concerns given LLMs' tendency to hallucinate. One famous example already involved Axon's DraftOne turning an officer into a frog in its report. More seriously, inaccuracies in police reports can cost someone their freedom. With these motivations, we designed a research study.

scroll

Our Research Question: What do AI assisted police report tools hallucinate about?

Studying hallucinations in technologies used by police is extremely challenging. A recent report² by the Electronic Frontier Foundation found that Axon's DraftOne didn't save the original drafts written by AI making it impossible to know if a report was written by AI or not. Given police and the tech sector's aversion to independent audits, it's unlikely that researchers or oversight groups will gain access to properly evaluate these technologies before they become widely adopted across the country. To overcome these barriers, the UChicago Justice Project brought together software engineers, social scientists, and legal scholars at Stanford University to create our own AI assisted Police Report writing tool.

It's called PARIS, the Police AI Report and Incident Simulator.

PARIS simulates the most common versions of AI Police Report writing tools based on publicly available information on how companies are building these tools. These are not exact replicas of tools like Axon's Draft One. Instead, we configured PARIS as conservatively as possible, setting the LLM's temperature to zero (which limits its creativity), and prompting it using language from how county prosecutors instruct officers to write reports objectively.

The first test case we used was body camera footage from a 2023 incident in Tallahassee Florida where Calvin Riley Sr was pulled over and arrested for a DUI. This video generated national attention because the officer found an unsealed vodka bottle, opened and poured out its contents, then threw the bottle back into the car. While Riley claimed the officer planted evidence and the officer claimed she was following department procedures, both sides agreed that the vodka bottle was unsealed. How would PARIS draft a police report from this body camera footage?

The bottle was not open

scroll to continue ↓

We ran the Calvin Riley video through PARIS 100 times, and it claimed the bottle was open in 71% of the AI-Generated Reports.

Seal Not
Described

26

Open
Container

21

50

Bottle
Omitted

3

01020304050

Number of Reports

ChatGPT (Transcript)

Gemini (Video)

AI Auditing tools like PARIS provide a way to see what AI tools sold to governments are capable of.

Interested in using PARIS? Apply Here.