A multi-modal dataset of human reasoning — built from 7 years of educational video, enriched with transcripts, EEG signals, and facial analysis to create the richest thinking-data corpus ever assembled.
The Omega Quest dataset caught the attention of Andrej Karpathy — former Director of AI at Tesla, founding member of OpenAI, and one of the most influential voices in machine learning today. His recognition validated our thesis: that high-quality, multi-modal human reasoning data is the missing ingredient in the next generation of AI systems.
This moment marked a turning point for the project. Karpathy's endorsement brought visibility and confirmed that the direction we'd been pursuing — capturing real human thinking processes across multiple modalities — was exactly what the AI research community needed. The tweet sparked a wave of interest from researchers, builders, and data contributors worldwide.
The Uncertain Systems YouTube channel started over 7 years ago as a personal project — recording the process of working through quantum mechanics, physics, and mathematics problems on camera. Every video captures real, unscripted reasoning: hypotheses formed, mistakes made, intuitions tested. Hundreds of hours of genuine human thinking, preserved on video. This archive is the seed of the Omega Quest dataset.
Exploring intuition around the expectation value formula
Entanglement Simulator — Hypergraph to Wavefunction
Digging into expectation values... deeply
Exploring the position and momentum conversion — Basic flaw identified!
Each video in the archive goes through a multi-stage enrichment pipeline. Raw footage is transformed into a richly annotated, multi-modal data record that captures not just what was said, but how the person was thinking — their cognitive load, emotional state, attention patterns, and reasoning flow.
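The pipeline can be pictured as a chain of per-segment enrichment stages. Everything below is illustrative, not the actual implementation: the helper functions are toy stubs standing in for the real transcription, EEG, and facial-analysis stages, and the field names simply mirror the sample record shown further down this page.

```python
# Illustrative sketch of the enrichment pipeline (toy stubs, not the real stages).
# Each ~30-second segment of raw footage passes through the layers below,
# and the results merge into one annotated, multi-modal record.

def transcribe_segment(segment):
    # Stand-in for speech-to-text plus reasoning-marker detection.
    text = segment.get("audio_text", "")
    markers = [m for m in ("But wait", "Let me check") if m in text]
    return {"text": text, "reasoning_markers": markers,
            "self_correction": bool(markers)}

def extract_eeg_features(segment):
    # Stand-in for band-power analysis of the raw EEG channels.
    return {"cognitive_load": 0.0, "events": []}

def analyze_face(segment):
    # Stand-in for engagement / micro-expression analysis.
    return {"engagement": 0.0, "emotional_arc": []}

def enrich_segment(segment):
    """Merge all layers into one annotated training record."""
    return {
        "segment_id": segment["id"],
        "timestamp": {"start": segment["start"], "end": segment["end"]},
        "transcript": transcribe_segment(segment),
        "eeg": extract_eeg_features(segment),
        "facial": analyze_face(segment),
        "labels": {"reasoning_type": "unlabeled"},  # annotation layer goes here
    }

record = enrich_segment({"id": "demo_seg_001", "start": 765, "end": 795,
                         "audio_text": "But wait — that assumes..."})
print(sorted(record))  # the six top-level keys of a finished record
```

The point of the sketch is the shape of the output: one segment in, one record out, with every modality keyed to the same timestamp window.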
“Exploring intuition around the expectation value formula” — 40 minutes of a human working through quantum mechanics, thinking out loud.
Below is a real sample of what a single 30-second segment (timestamp 12:45–13:15) looks like after passing through all four enrichment layers.
All four layers merge into a single training record. This is what gets fed to the model — a holistic snapshot of a human reasoning moment, annotated across every measurable dimension.
{
  "segment_id": "us_qm_ev_042_seg_026",
  "timestamp": {
    "start": 765,
    "end": 795
  },
  "video": {
    "domain": "quantum_mechanics",
    "topic": "expectation_value",
    "difficulty": "advanced"
  },
  "transcript": {
    "text": "So now I'm looking at this integral and thinking... if ψ is normalized, the expectation value should just be this inner product. But wait — that assumes the operator is Hermitian, right? Let me check...",
    "reasoning_markers": [
      "thinking...",
      "But wait",
      "Let me check",
      "assumes"
    ],
    "self_correction": true
  },
  "eeg": {
    "cognitive_load": 0.84,
    "alpha_delta": -0.62,
    "beta_delta": 1.63,
    "theta_delta": 0.33,
    "event": "analytical_onset @ 12:52"
  },
  "facial": {
    "engagement": 0.91,
    "emotional_arc": [
      "focused",
      "surprised",
      "concentrated",
      "resolved"
    ],
    "micro_expression": {
      "type": "surprise",
      "at": "12:52",
      "duration_ms": 340
    },
    "brow_furrow": 0.72
  },
  "labels": {
    "reasoning_type": "verification",
    "cognitive_event": "assumption_check",
    "quality_score": 0.94
  }
}

Current AI training data is mostly text. Our dataset adds neural signals, facial expressions, and audio prosody — the full picture of human cognition.
Not scripted, not edited. Every data point comes from genuine problem-solving moments — mistakes, corrections, and breakthroughs included.
Every modality is time-synced to the millisecond. The model can learn that a beta-spike and a brow-furrow coincide with a verbal self-correction.
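As a minimal illustration of what that syncing enables, the sketch below checks whether an EEG event and a facial micro-expression fall in the same short window during a self-correcting segment. The record layout mirrors the sample above; `signals_coincide` and its two-second window are assumptions for illustration, not part of the actual toolchain.

```python
# Check whether neural, facial, and verbal signals co-occur in one segment.
# Timestamps are "MM:SS" strings, as in the sample record above.

def to_seconds(ts):
    minutes, seconds = ts.split(":")
    return int(minutes) * 60 + int(seconds)

def signals_coincide(record, window_s=2.0):
    """True if an EEG event and a micro-expression occur within window_s
    seconds of each other during a self-correcting segment."""
    if not record["transcript"]["self_correction"]:
        return False
    eeg_t = to_seconds(record["eeg"]["event"].split("@")[1].strip())
    face_t = to_seconds(record["facial"]["micro_expression"]["at"])
    return abs(eeg_t - face_t) <= window_s

sample = {
    "transcript": {"self_correction": True},
    "eeg": {"event": "analytical_onset @ 12:52"},
    "facial": {"micro_expression": {"type": "surprise", "at": "12:52"}},
}
print(signals_coincide(sample))  # True: beta-spike and surprise at 12:52
```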
Contribute your thinking data and help build the richest multi-modal reasoning corpus ever assembled.