Battle-first benchmark

Evaluate egocentric data the way LM Arena evaluated models.

EgoArena ranks dataset variants by pairwise preference. It starts from your internal industrial egocentric seed set, then layers in sampled open-source baselines like Egodex and EgoVerse as battle-ready cohorts.

Seed corpus

EgoArena Seed Industrial Manipulation

The seed corpus was imported from the three most recent ZIP archives in Downloads. It contains industrial egocentric manipulation footage with step-level JSON annotations.

Contributor auth

GitHub-backed sessions

Production mirrors the RunRobot login flow, while a separate local-dev auth path stays available for localhost QA.

Public previews

Redacted battle clips only

Originals remain private in S3. The public site only serves muted, preview-length clips and posters.
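
As a rough illustration of what "muted, preview-length" could mean in practice, the sketch below builds ffmpeg argument lists for a trimmed, audio-free clip and a single-frame poster. The function names, file paths, and the 8-second default are hypothetical and not taken from EgoArena; only the ffmpeg flags (`-t`, `-an`, `-ss`, `-frames:v`) are standard. It covers muting and trimming only, not any visual redaction.

```python
def preview_command(src, dst, seconds=8.0):
    """Build an ffmpeg command for a short, muted preview clip.

    The 8-second default is an assumed value, not an EgoArena setting.
    """
    return [
        "ffmpeg", "-y",        # overwrite the output file if it exists
        "-i", src,             # private original (e.g. fetched from S3)
        "-t", str(seconds),    # keep only the first `seconds` seconds
        "-an",                 # drop every audio stream
        dst,
    ]


def poster_command(src, dst, at_seconds=1.0):
    """Build an ffmpeg command that grabs one frame as a poster image."""
    return [
        "ffmpeg", "-y",
        "-ss", str(at_seconds),  # seek before decoding the input
        "-i", src,
        "-frames:v", "1",        # write exactly one video frame
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical paths, printed rather than executed.
    print(" ".join(preview_command("original.mp4", "preview.mp4")))
    print(" ".join(poster_command("original.mp4", "poster.jpg")))
```

Returning the argument list instead of running it keeps the sketch easy to inspect; the list can be passed to `subprocess.run` where ffmpeg is installed.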

1 dataset

19 videos

3K steps

Quality signals

What v0 scores today

Technical quality, annotation structure, and manipulation signal are visible diagnostics. The public rank still comes from anonymous pairwise battles between redacted clips, not from a handcrafted composite score.

Technical quality: 78.8
Annotation quality: 100.0
Hand signal: 36.8
Motion density: 81.1
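
To make the idea of ranking by pairwise battles concrete, here is a minimal sketch of an Elo-style update. The page does not say which rating model EgoArena uses, so Elo, the K-factor of 32, the starting rating of 1000, and the variant names are all assumptions chosen for illustration.

```python
from collections import defaultdict


def expected_score(r_a, r_b):
    """Elo-predicted probability that the variant rated r_a beats r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def rate_battles(battles, k=32.0, initial=1000.0):
    """Apply Elo updates over a sequence of (variant_a, variant_b, outcome).

    outcome is 1.0 if variant_a was preferred, 0.0 if variant_b was
    preferred, and 0.5 for a tie. k and initial are assumed values.
    """
    ratings = defaultdict(lambda: initial)
    for a, b, outcome in battles:
        e_a = expected_score(ratings[a], ratings[b])
        delta = k * (outcome - e_a)
        ratings[a] += delta   # winner gains what the loser gives up
        ratings[b] -= delta
    return dict(ratings)


if __name__ == "__main__":
    # Hypothetical variant names and votes.
    votes = [
        ("seed_industrial", "open_source_sample", 1.0),
        ("seed_industrial", "open_source_sample", 0.5),
        ("open_source_sample", "seed_industrial", 1.0),
    ]
    for name, rating in sorted(rate_battles(votes).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {rating:.1f}")
```

Sequential Elo depends on the order of votes; a batch fit such as a Bradley-Terry model is a common alternative when the full vote log is available. Either way, the diagnostic scores above would stay separate from the rank, as the page describes.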

Live arena

Variant leaderboard
