Synthetic media evidence workflow

How to preserve suspected manipulated media and its full distribution context, including provenance signals, detector outputs treated as signals rather than verdicts, and plainly stated uncertainty, without resting the record on detection claims.

The race condition

Synthetic-media incidents are a race between distribution and preservation. A manipulated clip can be reposted across platforms in minutes, while removal, whether through platform enforcement or rapid takedown processes, can erase the original before anyone has preserved it. Both sides of that race destroy evidence: spread scatters the trail, removal deletes the source. The workflow below exists to win the only part of the race you control: capturing the record before either happens.

Preserve the object and its path

Save the media itself where lawful and appropriate, but also preserve the route by which it appeared: source page, reposts, captions, thumbnails, comments, and visible account context. In many cases the distribution path matters as much as the object: it can show intent, coordination, reach, and the moment the material entered circulation.

Original URL or earliest observed URL, with capture timestamp
The media file itself, hashed at capture, where lawful to hold
Captions, alt text, and surrounding post text
Repost and mirror trail across platforms, each with its own capture
Account context for the original publisher and significant spreaders
Visible platform labels (e.g. manipulated-media notices) as they appeared
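
The capture items above could be held in one structured record per observed copy. The following is a minimal sketch, not a prescribed schema: the class name CaptureRecord, its field names, and the helper make_capture_record are all illustrative assumptions, and SHA-256 is chosen here only as a common hashing option.

```python
import hashlib
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class CaptureRecord:
    """One captured copy: the object plus the context it was found in (hypothetical schema)."""
    observed_url: str   # original URL or earliest observed URL
    captured_at: str    # ISO 8601 UTC timestamp of the capture
    sha256: str         # hash of the media bytes, computed at capture
    caption: str = ""   # caption, alt text, surrounding post text
    platform_labels: list = field(default_factory=list)  # labels as they appeared
    reposts: list = field(default_factory=list)          # mirror URLs, each captured separately


def make_capture_record(media_bytes, observed_url, caption="", platform_labels=None, now=None):
    """Hash the bytes and timestamp the capture in a single step."""
    now = now or datetime.now(timezone.utc)
    return CaptureRecord(
        observed_url=observed_url,
        captured_at=now.isoformat(),
        sha256=hashlib.sha256(media_bytes).hexdigest(),
        caption=caption,
        platform_labels=list(platform_labels or []),
    )


# Placeholder bytes and URL, for illustration only.
record = make_capture_record(
    b"example media bytes",
    "https://example.com/post/123",
    caption="Example caption",
    platform_labels=["Manipulated media"],
)
print(json.dumps(asdict(record), indent=2))
```

Hashing at the moment of capture means any later copy of the file can be checked against the recorded digest. Each repost or mirror would get its own record rather than being folded into the first one.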

Record provenance signals

Some media carries verifiable provenance metadata — content credentials, embedded capture information, or platform-applied markers. Record what is present and, just as importantly, what is absent. Absence of provenance data is not proof of manipulation, and presence is not proof of authenticity; both are signals to be documented for review.

Content credentials / C2PA data, if present, exported as found
File metadata (EXIF and equivalents) with the extraction method noted
Upload context: when and where the file first appeared as far as observable
Mismatches worth flagging: claimed date vs. metadata date, claimed source vs. first observed source
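
The mismatch checks in the last item can be sketched as a small function that only records discrepancies and draws no conclusion from them. The function name flag_mismatches and its parameters are hypothetical. The sketch assumes the dates have already been extracted and parsed by whatever method the record notes.

```python
from datetime import date


def flag_mismatches(claimed_date, metadata_date, claimed_source, first_observed_source):
    """Return human-readable flags for a reviewer; never a verdict."""
    flags = []
    if metadata_date is None:
        # Absence is documented as a signal, not treated as proof of manipulation.
        flags.append("metadata date absent")
    elif claimed_date is not None and claimed_date != metadata_date:
        flags.append(f"claimed date {claimed_date} differs from metadata date {metadata_date}")
    if claimed_source and first_observed_source and claimed_source != first_observed_source:
        flags.append(
            f"claimed source {claimed_source!r} differs from first observed source {first_observed_source!r}"
        )
    return flags


# Placeholder values, for illustration only.
flags = flag_mismatches(
    claimed_date=date(2024, 3, 1),
    metadata_date=date(2024, 2, 27),
    claimed_source="news-site.example",
    first_observed_source="forum.example",
)
for f in flags:
    print(f)
```

An empty list means only that these particular checks found nothing to flag. It says nothing about authenticity.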

Treat detector outputs as signals, not verdicts

Automated analysis can be documented as a signal: which tool, which version, which score, retrieved when. What the evidence workflow must never do is let a detector decide the question. Reports should say "indicators consistent with manipulation were flagged by X" — never "this is fake". The legal and factual conclusion belongs to qualified reviewers and counsel, working from a record that states its own uncertainty honestly.

Tool name, version, and date for every automated check
Raw output preserved alongside any summarized score
Conflicting outputs recorded, not resolved by deletion
Human review notes kept as a separate, attributed layer
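
One way to keep detector outputs in the "signal" role is to store each run as an immutable record and generate report wording from it mechanically. The sketch below is illustrative only: DetectorSignal, its fields, and report_lines are hypothetical names, and the flagged field is assumed to hold whatever the tool itself reported, not a threshold chosen by the workflow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DetectorSignal:
    """One automated check, preserved as run (hypothetical schema)."""
    tool: str
    version: str
    run_at: str          # date or timestamp of the check
    raw_output: str      # raw output kept alongside any summary
    flagged: bool        # as reported by the tool itself
    score: Optional[float] = None


def report_lines(signals):
    """Produce hedged report sentences; conflicting outputs are kept, not resolved."""
    lines = []
    for s in signals:
        if s.flagged:
            lines.append(
                f"Indicators consistent with manipulation were flagged by {s.tool} {s.version} (run {s.run_at})."
            )
        else:
            lines.append(f"No indicators were flagged by {s.tool} {s.version} (run {s.run_at}).")
    if len({s.flagged for s in signals}) > 1:
        lines.append("Automated outputs conflict; all are retained for reviewer assessment.")
    return lines


# Placeholder tools and outputs, for illustration only.
signals = [
    DetectorSignal("ToolA", "1.2", "2024-03-02", '{"score": 0.91}', True, 0.91),
    DetectorSignal("ToolB", "0.9", "2024-03-02", '{"score": 0.22}', False, 0.22),
]
for line in report_lines(signals):
    print(line)
```

Human review notes are deliberately left out of this structure, so they can be kept as a separate, attributed layer.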

Handle intimate-image material with discipline

Where suspected synthetic material is intimate or sexual in nature, handling rules come before workflow efficiency. Material of this category should be ingested and held only with proper authorization — through counsel or an authorized representative — with that authorization recorded in the custody log. Where rapid platform-removal processes are in play, the order of operations matters: the preservation record should exist before removal requests erase the source. The affected person should never have to maintain the capture record themselves.
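
The order of operations described above (authorization recorded, then preservation, then any removal request) could be enforced by the custody log itself. The following is a rough sketch under stated assumptions: CustodyLog and its method names are hypothetical, and whether the ordering should be a hard block, as here, or only a warning is a policy decision for counsel, not something this example settles.

```python
from datetime import datetime, timezone


class CustodyLog:
    """Hypothetical append-only custody log; names and fields are illustrative."""

    def __init__(self):
        self.entries = []

    def _add(self, item_id, action, **details):
        entry = {
            "item_id": item_id,
            "action": action,
            "at": datetime.now(timezone.utc).isoformat(),
            **details,
        }
        self.entries.append(entry)
        return entry

    def _has(self, item_id, action):
        return any(e["item_id"] == item_id and e["action"] == action for e in self.entries)

    def record_authorization(self, item_id, authorized_by, basis):
        return self._add(item_id, "authorized", authorized_by=authorized_by, basis=basis)

    def record_preservation(self, item_id, sha256):
        # Refuse to hold material without a recorded authorization.
        if not self._has(item_id, "authorized"):
            raise PermissionError("no recorded authorization for this item")
        return self._add(item_id, "preserved", sha256=sha256)

    def record_removal_request(self, item_id, sent_to):
        # The preservation record should exist before removal erases the source.
        if not self._has(item_id, "preserved"):
            raise RuntimeError("preservation record missing; preserve before requesting removal")
        return self._add(item_id, "removal_requested", sent_to=sent_to)


# Placeholder identifiers, for illustration only.
log = CustodyLog()
log.record_authorization("item-001", authorized_by="counsel", basis="engagement letter")
log.record_preservation("item-001", sha256="<hash computed at capture>")
log.record_removal_request("item-001", sent_to="platform")
print([e["action"] for e in log.entries])
```

The log is written by the authorized handler, so the affected person is never the one maintaining the capture record.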

Prepare review-ready context

The goal is a compact record that helps a qualified reviewer understand what was captured, what is uncertain, what spread where, and what related material may matter next. A strong synthetic-media file reads like a careful investigation summary with its receipts attached — not like an accusation with images.