Our Mission

UAP Research Has a Data Infrastructure Problem. We're Fixing It.

The information exists — scattered across dozens of siloed databases, government archives, civilian platforms, and sensor networks. No unified, machine-readable system connects them. Until now.

Based on findings from the 2025 AARO-Sponsored UAP Data Workshop

The Problem

A Fractured Data Landscape

UAP data exists across a patchwork of disconnected sources. None of them talk to each other. There is no common schema, no shared identifier system, no way to cross-reference data without manual research across five or more platforms.

NUFORC

80,000+ reports stored as flat HTML

MUFON

Paywall, proprietary formatting

AARO

Mixed classified / unclassified

FAA / ASRS

Not cross-referenced to UAP

Blue Book

Legacy archives, needs OCR

Social Media

Restricted APIs, unstructured

Sensor Data

Radar, ADS-B, satellite, seismic

Intl. Gov't

UK, France, Brazil, Chile, Australia

NO INTEROPERABILITY | NO COMMON SCHEMA | NO CROSS-REFERENCING

01

Data Fragmentation

Sighting data is scattered across incompatible databases with no shared identifiers. Cross-referencing a civilian sighting with concurrent radar data, flight paths, and weather requires manual research across 5+ platforms.

02

No Standardized Metadata

Location formats vary widely: city names, ZIP codes, latitude/longitude pairs, and free text. Timestamps are inconsistent or missing. There is no universal taxonomy for morphology, behavior, or sensor type. Without these, comparative analysis at scale is not feasible.

03

Accessibility Barriers

Classification locks away military data. Social platforms restrict API access. Historical records need OCR processing. Stigma suppresses reporting at the source. Critical records have been lost due to weak retention policies.

04

No Credibility Framework

There is no systematic method for assessing report quality. Human perception is fallible, sensor reliability varies, and there are no "gold standard exemplars" to calibrate against. High-signal cases are buried among noise and misidentifications.

05

AI Without Infrastructure

AI tools require structured, clean, well-labeled data to function. The current state — sparse, unstructured, inconsistent — means AI cannot be deployed effectively. "Garbage in, garbage out" is the default state.

Fragmented → Unified

The Solution

UAPAI: The Unified Open Infrastructure for UAP Data

A full-stack platform combining a public REST API, automated data ingestion, AI-powered analysis, and an interactive explorer — purpose-built to unify the global UAP data landscape.

Public REST API
Authenticated API with free and premium tiers. Standardized JSON responses. Filterable by date, location, source, credibility score, object type, and dozens of metadata fields.
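To make the filtering idea concrete, here is a minimal sketch of how a client might assemble a filtered query. The base URL, the endpoint path, and every parameter name (`date_from`, `source`, `min_credibility`, `object_type`) are assumptions made for illustration; they are not taken from any published UAPAI API reference.

```python
from urllib.parse import urlencode

# Hypothetical base URL and parameter names, used only for illustration.
BASE_URL = "https://api.example.invalid/v1/reports"


def build_report_query(**filters):
    """Return a query URL, dropping any filter whose value is None."""
    params = {key: value for key, value in sorted(filters.items()) if value is not None}
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


url = build_report_query(
    date_from="2020-01-01",
    source="NUFORC",
    min_credibility=0.7,
    object_type=None,  # unset filters are omitted from the URL
)
print(url)
```

Sorting the parameters keeps the generated URL stable, which helps with caching and with comparing queries in logs.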
Ingestion Engine
Automated pipelines that scrape, normalize, and ingest data from NUFORC, MUFON, AARO releases, FAA/ASRS, The Black Vault FOIA archives, and international sources. Every record mapped to a unified schema with full provenance tracking.
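The normalization step could look roughly like the sketch below. The input field names (`id`, `date`, `lat`, `lon`, `place`), the list of date formats, and the decision to treat source times as UTC are all assumptions for illustration, and the sample record is made up; real source records will need their own mappings.

```python
from datetime import datetime, timezone

# Hypothetical date formats; a real pipeline would maintain one list per source.
DATE_FORMATS = ("%Y-%m-%d %H:%M", "%m/%d/%Y %H:%M", "%m/%d/%Y")


def parse_timestamp(raw):
    """Try each known format; return an ISO 8601 string, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue
        # Assumption: source times are treated as UTC. Real data needs per-source handling.
        return parsed.replace(tzinfo=timezone.utc).isoformat()
    return None


def normalize_record(raw, source):
    """Map one raw source record onto a common shape and record where it came from."""
    lat, lon = raw.get("lat"), raw.get("lon")
    has_coords = lat is not None and lon is not None
    return {
        "observed_at": parse_timestamp(raw.get("date") or ""),
        "latitude": round(float(lat), 5) if has_coords else None,
        "longitude": round(float(lon), 5) if has_coords else None,
        "location_text": raw.get("place"),
        "provenance": [{"source": source, "original_id": raw.get("id")}],
    }


# Made-up sample record, not real data.
record = normalize_record(
    {"id": "abc-1", "date": "07/04/2019 21:30", "lat": "35.10", "lon": "-106.6",
     "place": "Albuquerque, NM"},
    source="NUFORC",
)
print(record)
```

Unparseable timestamps come back as `None` rather than a guess, so downstream scoring can treat missing timing as a data-quality signal.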
Unified Schema
Composable metadata standard: geospatial coordinates (normalized lat/long), ISO timestamps, morphology taxonomy, behavior classification, sensor type, witness background, provenance chain, and cross-reference identifiers.
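One way to picture such a schema is as a typed record. The sketch below is illustrative only: the class names, field names, and validation rules are assumptions chosen to mirror the elements listed above, not the actual UAPAI schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ProvenanceStep:
    """One hop in the chain from original report to database entry (hypothetical)."""
    source: str
    original_id: Optional[str] = None


@dataclass
class UnifiedReport:
    """Hypothetical record shape covering the schema elements named in the text."""
    report_id: str
    observed_at: Optional[str]  # ISO 8601 timestamp
    latitude: Optional[float]   # decimal degrees
    longitude: Optional[float]  # decimal degrees
    morphology: Optional[str] = None
    behavior: Optional[str] = None
    sensor_type: Optional[str] = None
    witness_background: Optional[str] = None
    provenance: List[ProvenanceStep] = field(default_factory=list)
    cross_references: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject coordinates outside the valid range instead of storing them silently.
        if self.latitude is not None and not -90 <= self.latitude <= 90:
            raise ValueError("latitude out of range")
        if self.longitude is not None and not -180 <= self.longitude <= 180:
            raise ValueError("longitude out of range")

    def to_json(self):
        """Serialize the record, including nested provenance steps, to JSON."""
        return json.dumps(asdict(self), sort_keys=True)


# Made-up example values.
report = UnifiedReport(
    report_id="r-0001",
    observed_at="2019-07-04T21:30:00+00:00",
    latitude=35.1,
    longitude=-106.6,
    morphology="sphere",
    provenance=[ProvenanceStep(source="NUFORC", original_id="abc-1")],
)
print(report.to_json())
```

Optional fields default to `None` so that sparse historical reports can still be stored without inventing values.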
AI Analysis
LLM-powered natural language queries, automated credibility scoring via multi-factor corroboration, semantic clustering, anomaly detection, and pattern recognition across historical and contemporary reports.
Explorer
Web-based frontend with map visualization, timeline scrubbing, advanced filtering, and drill-down capability. Designed for researchers, journalists, and the public to explore the dataset without writing code.

AARO Workshop Alignment

Direct Implementation of the 2025 AARO Workshop Recommendations

The workshop concluded with eight actionable recommendations. UAPAI addresses each one; four are highlighted below.

"Develop standardized metadata templates for UAP reports across all sources."
Unified schema with normalized geospatial coordinates, ISO timestamps, morphology taxonomy, sensor classification, and provenance tracking. Cross-walks between NUFORC, MUFON, and AARO schemas built into ingestion.
"Adopt a hybrid approach where qualitative expertise and human oversight complement AI."
AI handles triage, clustering, and credibility scoring. Every automated assessment includes confidence intervals and is designed for human review. No AI output is presented as definitive.
"Create systems to triage reports and assess credibility systematically."
Automated credibility scoring evaluates reports based on corroborating data streams, witness multiplicity, detail richness, internal consistency, and known misidentification patterns.
"Continue to preserve and digitize historical reports from legacy archives."
Ingestion pipelines include Project Blue Book, The Black Vault FOIA documents, and international government disclosures. OCR and AI-assisted extraction convert legacy records into the unified schema.

Credibility Framework

Multi-Factor Quality Assessment

Every report is scored, categorized, and preserved — never discarded. Low-credibility reports remain available for researchers who may find value in aggregate patterns.

Corroboration

Multiple witnesses, concurrent sensors, overlapping reports from different platforms

Detail Richness

Specificity of location, timing, morphology, behavioral observations

Consistency

Logical coherence of narrative, internal contradictions flagged

Misidentification Filtering

Cross-reference against satellite passes, aircraft routes, astronomical events

Provenance

Full chain of custody from original report to database entry
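The five factors above could be combined in many ways. As a minimal sketch, assuming each factor has already been reduced to a number between 0 and 1, a weighted mean is one simple option. The weights, the factor keys, and the `coverage` figure are all hypothetical and are not the scoring method described by the source; in particular, `coverage` is a crude stand-in for uncertainty, not a confidence interval.

```python
# Hypothetical weights for illustration; they sum to 1.0 but are not calibrated.
WEIGHTS = {
    "corroboration": 0.30,
    "detail_richness": 0.20,
    "consistency": 0.20,
    "misidentification": 0.20,  # 1.0 means no conventional explanation matched
    "provenance": 0.10,
}


def credibility_score(factors):
    """Return (score, coverage) for the factors supplied.

    score is the weighted mean of the supplied factors; coverage is the share
    of total weight that was actually available, so a score built from a
    single factor is visibly less well supported.
    """
    for name, value in factors.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown factor: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if not factors:
        return None, 0.0
    used_weight = sum(WEIGHTS[name] for name in factors)
    weighted_sum = sum(WEIGHTS[name] * value for name, value in factors.items())
    coverage = used_weight / sum(WEIGHTS.values())
    return round(weighted_sum / used_weight, 3), round(coverage, 3)


full = credibility_score({
    "corroboration": 0.8,
    "detail_richness": 0.6,
    "consistency": 0.9,
    "misidentification": 0.5,
    "provenance": 1.0,
})
partial = credibility_score({"corroboration": 0.5})
print(full, partial)
```

Returning coverage alongside the score keeps low-information reports in the dataset while making it clear how little evidence the number rests on.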

AI Integration

Responsible AI, Not Black-Box AI

The AARO workshop identified AI as both the greatest opportunity and greatest risk. Our approach maximizes the former while mitigating the latter.

What AI Does

  • Transcription and extraction from unstructured text, PDFs, and legacy documents
  • Triage and classification — flagging likely conventional explanations to surface genuine anomalies
  • Semantic search across the full corpus via natural language queries
  • Pattern detection: geographic clustering, temporal correlation, morphology grouping
  • Automated multi-factor credibility scoring with confidence intervals
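
As one illustration of the geographic clustering item in the list above, the sketch below groups sightings whose locations fall within a chosen distance of each other, using the haversine formula and a small union-find. The function names, the 25 km threshold, and the approximate sample coordinates are assumptions for illustration. This is a simple single-linkage approach, not the clustering method described by the source.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def cluster_points(points, max_km):
    """Group point indices so that any two points within max_km share a cluster."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison is O(n^2); acceptable for a sketch, not for a full corpus.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if haversine_km(points[i], points[j]) <= max_km:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Approximate, illustrative coordinates: two points near Phoenix, one in London.
sightings = [(33.45, -112.07), (33.49, -111.93), (51.51, -0.13)]
clusters = cluster_points(sightings, max_km=25)
print(clusters)
```

At corpus scale a spatial index or a library clustering routine would replace the pairwise loop; the sketch only shows the grouping logic.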

What AI Does Not Do

  • Make definitive identifications about the nature of any sighting
  • Operate without confidence intervals and uncertainty flagging
  • Replace human expert review for high-signal cases
  • Train on the UAP dataset in ways that amplify cultural biases

Why Now

Converging Forces

Several developments make this the critical moment for UAP data infrastructure.

Legislative Momentum

The proposed UAP Disclosure Act and congressional transparency mandates are pushing the government toward greater openness. Data infrastructure must exist to receive and organize whatever is disclosed.

Scientific Legitimacy

The AARO workshop, NASA's UAP study, and Harvard's Galileo Project represent a shift toward institutional scientific engagement. These efforts need data infrastructure to produce reproducible results.

Public Demand

UAP-related content generates billions of impressions annually. There is massive public interest but no authoritative, structured data source. UAPAI fills that vacuum.

AI Readiness

Large language models and multimodal AI are now capable enough to process UAP data at scale — but only if the underlying data is structured, clean, and accessible. UAPAI creates that foundation.

The Data Exists. The Methods Exist. The Infrastructure Is Here.

UAPAI is the connective tissue that transforms UAP research from a fragmented collection of incompatible databases into a unified, queryable platform.

Get API Access