Data Sources

Thesis AI integrates two primary external data providers: the Massive API for market and company data, and the FRED API for macroeconomic indicators. All agent analysis is grounded in data from these sources; no figures are invented or interpolated.

Massive API

Massive is the primary market data vendor. It provides real-time and historical equity data across quotes, price bars, company fundamentals, news, and technical indicators. The Fundamentals Agent, News Agent, and Price & Trend Agent all draw from Massive.

| Data Type | Description |
| --- | --- |
| Quotes | Real-time bid/ask, last price, volume, and change |
| OHLC Bars | Intraday and daily candlestick data |
| Fundamentals | P/E, P/B, ROE, debt ratios, operating margins, EPS |
| News | Headline feed with relevance scoring by symbol |
| Technicals | RSI, MACD, moving averages, momentum indicators |
| Market Movers | Top gainers, losers, and volume leaders |
| Dividends & Splits | Corporate actions history and upcoming events |

Coverage

Massive API coverage is currently focused on US equities (NYSE, NASDAQ). Coverage of ETFs, international markets, and options is planned for future phases.

FRED API

The Federal Reserve Bank of St. Louis provides the FRED (Federal Reserve Economic Data) API, which is the authoritative source for US macroeconomic indicators. The Macro Agent uses a curated set of FRED series to construct its macro snapshot.

| Series ID | Description |
| --- | --- |
| FEDFUNDS | Effective federal funds rate |
| CPIAUCSL | Consumer Price Index for All Urban Consumers, all items |
| UNRATE | US civilian unemployment rate |
| GDP | Gross domestic product (nominal; quarterly, seasonally adjusted annual rate). Real GDP is published under a separate series, GDPC1. |
| DGS2 | 2-Year Treasury constant maturity rate |
| DGS10 | 10-Year Treasury constant maturity rate |
| T10Y2Y | 10-Year minus 2-Year Treasury spread (yield curve) |

The macro snapshot is assembled each time the Macro Agent runs. A Celery background task refreshes the cached snapshot hourly during market hours to reduce per-request latency.
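
As a rough illustration of how such a snapshot could be assembled, the sketch below builds request URLs for the public FRED `series/observations` endpoint and picks the newest usable value per series. The function names (`observations_url`, `latest_observation`, `build_macro_snapshot`) and the snapshot shape are assumptions made for this example, not the Macro Agent's actual code. FRED reports missing observations as the string ".", so those are skipped.

```python
from urllib.parse import urlencode

# Series IDs from the table above.
MACRO_SERIES = ["FEDFUNDS", "CPIAUCSL", "UNRATE", "GDP", "DGS2", "DGS10", "T10Y2Y"]

FRED_OBSERVATIONS_URL = "https://api.stlouisfed.org/fred/series/observations"


def observations_url(series_id: str, api_key: str, limit: int = 5) -> str:
    """Build a request URL for the most recent observations of one series."""
    params = {
        "series_id": series_id,
        "api_key": api_key,
        "file_type": "json",
        "sort_order": "desc",  # newest observation first
        "limit": limit,
    }
    return f"{FRED_OBSERVATIONS_URL}?{urlencode(params)}"


def latest_observation(payload: dict):
    """Return (date, value) for the newest non-missing observation, or None.

    Assumes the payload was requested with sort_order=desc.
    """
    for obs in payload.get("observations", []):
        if obs["value"] != ".":  # "." marks a missing value in FRED responses
            return obs["date"], float(obs["value"])
    return None


def build_macro_snapshot(payloads: dict) -> dict:
    """Hypothetical snapshot shape: series ID -> (date, value), or None if unavailable."""
    return {sid: latest_observation(payloads.get(sid, {})) for sid in MACRO_SERIES}
```

Keeping `None` for a series with no usable observation, rather than substituting a default, lets downstream code report the gap instead of hiding it.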

Caching Strategy

To minimize vendor API costs and ensure fast response times, Thesis uses a two-tier caching strategy backed by Redis:

| Data Type | Cache TTL | Notes |
| --- | --- | --- |
| Real-time quotes | 15 seconds | Refreshed on each request if stale |
| OHLC bars (intraday) | 1 minute | Stale bars served during off-market hours |
| News headlines | 5 minutes | Longer TTL acceptable for narrative analysis |
| Fundamentals snapshot | 24 hours | Fundamental ratios change infrequently |
| Macro snapshot (FRED) | 1 hour | Refreshed by Celery beat scheduler |
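
The TTLs above map naturally onto a cache-aside lookup: return the cached value while it is fresh, otherwise call the vendor and store the result with the matching expiry. The sketch below uses an in-memory dictionary as a stand-in for Redis so that it runs on its own. The `TTLCache` class, the `CACHE_TTL_SECONDS` key names, and the `data_type:key` key format are illustrative assumptions, not Thesis internals.

```python
import time
from typing import Any, Callable, Dict, Tuple

# TTLs in seconds, taken from the table above. The key names are illustrative.
CACHE_TTL_SECONDS = {
    "quote": 15,
    "ohlc_intraday": 60,
    "news": 5 * 60,
    "fundamentals": 24 * 60 * 60,
    "macro": 60 * 60,
}


class TTLCache:
    """In-memory stand-in for Redis: each entry expires after its TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        # cache key -> (expiry time, cached value)
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, data_type: str, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value if still fresh; otherwise fetch and cache it."""
        now = self._clock()
        cache_key = f"{data_type}:{key}"
        entry = self._store.get(cache_key)
        if entry is not None and now < entry[0]:
            return entry[1]
        value = fetch()  # only reached on a miss or an expired entry
        self._store[cache_key] = (now + CACHE_TTL_SECONDS[data_type], value)
        return value
```

With a real Redis client, the expiry bookkeeping would be delegated to the server, for example via redis-py's `set(key, value, ex=ttl)`, and the class would only need the lookup and fetch logic.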

Data Integrity

All agent prompts explicitly constrain agents to use only the provided data snapshot. If a required data field is missing or stale beyond acceptable bounds, the agent notes the limitation in its output rather than extrapolating or inventing figures. This ensures every piece of analysis in a thesis is traceable to a real data point.
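
One way to picture the missing-or-stale check is a small validation pass that runs before the snapshot reaches an agent and produces limitation notes for the agent to surface. This is a minimal sketch under stated assumptions: the `Field` type, the `find_limitations` function, and the single `max_age` threshold are hypothetical, and the real acceptable bounds are not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Field:
    """One data point in a snapshot (hypothetical shape)."""
    value: Optional[float]
    as_of: Optional[datetime]  # when the vendor data point was observed


def find_limitations(
    snapshot: Dict[str, Field],
    required: List[str],
    max_age: timedelta,
    now: datetime,
) -> List[str]:
    """List missing or stale required fields instead of filling them in."""
    notes = []
    for name in required:
        field = snapshot.get(name)
        if field is None or field.value is None:
            notes.append(f"{name}: missing from data snapshot")
        elif field.as_of is None or now - field.as_of > max_age:
            notes.append(f"{name}: stale beyond acceptable bounds")
    return notes
```

The returned notes could be appended to the agent's context so that the limitation appears in its output, while the missing values themselves are never replaced with estimates.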