Sharp Retriever
Transparency
What the model is, what data it uses, and how the record is kept.
Model overview
The model is an XGBoost classifier trained on Retrosheet game logs from 2015–2024. Separate models handle moneyline (winner prediction) and totals (over/under run totals). The models are retrained when meaningful new signal is added, and the training history is kept in version control.
The market line is not an input feature. The model is built on baseball signal only. This is a deliberate design choice: we believe durable performance comes from knowing something the market underweights, not from comparing prices.
Data sources
What we do NOT use
We do not use betting market prices, opening lines, line movement, public betting percentages, sharp/square money data, or any information derived from sportsbook markets as a model input. The model's output is independent of the market.
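One way to enforce this rule in code is a guard that rejects market-derived columns before training or scoring. The sketch below is illustrative only: the prefixes and feature names are hypothetical, since this page does not list the actual feature names.

```python
# Hypothetical column-name prefixes for market-derived data; the real
# feature names are not given on this page.
MARKET_PREFIXES = ("odds_", "line_", "public_pct_", "sharp_")


def assert_no_market_features(feature_names):
    """Return the feature list unchanged, or raise if any name looks market-derived."""
    leaked = [name for name in feature_names if name.startswith(MARKET_PREFIXES)]
    if leaked:
        raise ValueError(f"market-derived features found: {leaked}")
    return list(feature_names)


# Example with made-up baseball-only feature names: passes through unchanged.
features = assert_no_market_features(["starter_era", "bullpen_rest_days"])
```

A check like this would run at the start of both training and scoring, so a market column cannot slip in through a later feature addition.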
How picks are published
The pipeline runs six times per day between 11:30 AM and 10:00 PM UTC. Each run assembles the latest game-day features and scores today's games. Model picks are locked and published. Once published, a pick is never altered or removed — the side, the market, and the game date are immutable. Market odds shown on the card are the open-market price at publication time, not a projected line.
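The immutability rule can be pictured as a record type that cannot be modified after creation. This is a minimal sketch, assuming a Python representation; the class name, field names, and example values are hypothetical and not taken from the actual pipeline.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import date


@dataclass(frozen=True)
class PublishedPick:
    """A pick as published; frozen=True blocks any later attribute change."""
    game_date: date
    market: str               # "moneyline" or "totals"
    side: str                 # e.g. "home", "away", "over", "under"
    odds_at_publication: int  # American odds captured at publication time


# Made-up example values, for illustration only.
pick = PublishedPick(date(2025, 6, 1), "moneyline", "home", -120)

try:
    pick.side = "away"        # any attempt to alter a published pick fails
except FrozenInstanceError:
    print("published picks cannot be altered")
```

In a real system the same guarantee would also need to hold in storage, for example through an append-only table, which is outside the scope of this sketch.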
How the record is kept
Results are graded after game completion. Win / Loss / Push is determined by the published side against the final score. No retroactive exclusions, no sample trimming, no “all-time” numbers that omit difficult windows.
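The grading rule can be sketched as two small functions, one per market. This is an illustration under stated assumptions: the function names and side labels are hypothetical, and the totals version assumes the published total line is stored with the pick.

```python
def grade_moneyline(side, home_runs, away_runs):
    """Grade a moneyline pick ("home" or "away") against the final score."""
    if home_runs == away_runs:
        return "Push"  # no winner recorded
    winner = "home" if home_runs > away_runs else "away"
    return "Win" if side == winner else "Loss"


def grade_total(side, home_runs, away_runs, total_line):
    """Grade a totals pick ("over" or "under") against the combined runs."""
    total = home_runs + away_runs
    if total == total_line:
        return "Push"  # combined runs landed exactly on the line
    outcome = "over" if total > total_line else "under"
    return "Win" if side == outcome else "Loss"
```

Because both functions take only the published side and the final score, grading does not depend on anything decided after publication.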
What we will not publish
No win-rate or ROI claims until we have a meaningful forward sample (minimum 200 graded picks from the current model version). No backdated backtests presented as live results. No simulated performance. If the model underperforms, the record will show it — positive or negative, the forward ledger is public.
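The 200-pick threshold could be enforced with a simple gate that returns nothing until the forward sample is large enough. The sketch below makes two assumptions not stated on this page: pushes count toward the 200 graded picks, and pushes are excluded from the win-rate denominator. The function name is hypothetical.

```python
MIN_GRADED_PICKS = 200  # threshold stated in the text above


def publishable_win_rate(results):
    """Return the win rate, or None while the forward sample is too small.

    `results` is a list of "Win" / "Loss" / "Push" strings for graded picks
    from the current model version.
    """
    if len(results) < MIN_GRADED_PICKS:
        return None
    decided = [r for r in results if r != "Push"]  # assumption: pushes excluded
    if not decided:
        return None
    return sum(r == "Win" for r in decided) / len(decided)
```

With 199 graded picks the function returns `None` regardless of the record, so no number can be shown early.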