AlphaForge
Challenge
Systematic equity strategies promise data-driven alpha, but the gap between a backtested signal and a trustworthy trading system is vast. Most quantitative backtests suffer from overfitting, survivorship bias, and unrealistic execution cost assumptions — producing impressive historical returns that evaporate in live trading.
Solution
Designed and built a production-grade Python system from a formal specification, implementing a 9-stage pipeline: universe selection across 5 global equity regions (~900 stocks), point-in-time data ingestion, feature engineering with Harvey-Liu-Zhu selection gates (t > 3.5), target construction, 5 ML models stacked via a Ridge meta-learner, CVXPY portfolio optimization with Black-Litterman views and Ledoit-Wolf covariance shrinkage, multi-layered risk management with 6 concurrent monitors, and walk-forward validation with quarterly expanding windows and 2-quarter purge gaps. Four anti-overfitting gates (Deflated Sharpe Ratio, Probability of Backtest Overfitting via CSCV, adversarial validation, and complexity budgeting) ensure that reported performance reflects a genuine statistical edge.
Result
Walk-forward backtesting across quarterly expanding windows produced a Sharpe ratio of 1.54 with a maximum drawdown of -2.3% through the 2020 COVID-19 sell-off. The high Sharpe ratio reflects extremely low portfolio volatility from factor-neutral construction and institutional risk management. All quality gates (Deflated Sharpe Ratio, adversarial validation, factor exposure, sub-period stability) passed. Every data access is strictly point-in-time with publication-lag registries per region, eliminating look-ahead and survivorship bias by construction. The system covers US, EU, UK, Hong Kong, and Swiss markets with realistic Almgren-Chriss transaction cost modeling, conformal prediction position sizing, and Half-Kelly risk budgeting.
Key Highlights
Spec-First Engineering at Scale
A complete formal specification defined every pipeline stage, risk constraint, and validation gate before a single line of code was written, so every architectural decision traces back to a formal requirement.
5-Model ML Ensemble
XGBoost, LightGBM, CatBoost, LightGBM-LambdaRank, and Ridge base models, stacked via a Ridge meta-learner that is retrained quarterly on expanding out-of-sample predictions.
Anti-Overfitting Gates
Deflated Sharpe Ratio, Probability of Backtest Overfitting via CSCV, adversarial validation (AUC < 0.60), and complexity budget (incremental IC ≥ 0.005 on locked holdout).
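As an illustration of the first gate, the sketch below computes a Deflated Sharpe Ratio using the formula published by Bailey and López de Prado. It is not taken from the AlphaForge codebase: the function names, the per-period (non-annualized) Sharpe convention, and the example inputs are assumptions made for this sketch.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
_STD_NORMAL = NormalDist()        # standard normal, mean 0 and sigma 1


def expected_max_sharpe(n_trials: int, sr_variance: float) -> float:
    """Expected maximum Sharpe ratio across n_trials strategies whose
    true Sharpe is zero, given the variance of the trial Sharpe ratios."""
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    a = _STD_NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    b = _STD_NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sr_variance) * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe_ratio(sr: float, n_obs: int, skew: float, kurt: float,
                          n_trials: int, sr_variance: float) -> float:
    """Probability that the observed per-period Sharpe `sr` exceeds the
    benchmark implied by multiple testing.

    `kurt` is the raw (non-excess) kurtosis, so a normal distribution is 3.
    """
    sr0 = expected_max_sharpe(n_trials, sr_variance)
    denom = math.sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr ** 2)
    z = (sr - sr0) * math.sqrt(n_obs - 1) / denom
    return _STD_NORMAL.cdf(z)


# Hypothetical inputs: daily Sharpe of 0.1 over 1000 days, 50 trials.
dsr = deflated_sharpe_ratio(sr=0.1, n_obs=1000, skew=0.0, kurt=3.0,
                            n_trials=50, sr_variance=0.001)
print(f"DSR = {dsr:.3f}")
```

The result is a probability: values close to 1 suggest the observed Sharpe ratio is unlikely to be explained by picking the best of n_trials attempts, and the value falls as the number of trials grows.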
Institutional Portfolio Construction
CVXPY optimization (Clarabel → SCS fallback), Black-Litterman views, hybrid covariance (FF5 + PCA + IEWMA), Ledoit-Wolf shrinkage, Half-Kelly sizing with conformal prediction intervals.
Global Multi-Region Coverage
5 regions (US, EU, UK, Hong Kong, Switzerland/SIX) with ~900 stocks. Per-region publication-lag registries, exchange calendars, and regulatory constraints.
Real-Time Risk Monitoring
6 concurrent monitors: turbulence, absorption ratio, crowding, Swiss National Bank (SNB) intervention, correlation regime, and short-selling. Transaction costs use an Almgren-Chriss square-root impact model.
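The square-root impact idea can be sketched as follows. This is a minimal illustration rather than the project's cost model: the function name, the default impact coefficient of 1.0, and the optional half-spread term are hypothetical choices, and the formula is the commonly used form cost ≈ c · σ · sqrt(Q / ADV).

```python
import math


def sqrt_impact_cost_bps(order_shares: float, adv_shares: float,
                         daily_vol: float, impact_coeff: float = 1.0,
                         half_spread_bps: float = 0.0) -> float:
    """Estimated one-way trading cost in basis points.

    order_shares    : signed order size in shares (sign is ignored)
    adv_shares      : average daily volume in shares
    daily_vol       : daily return volatility as a fraction (0.02 = 2%)
    impact_coeff    : dimensionless scaling constant (assumed, not calibrated)
    half_spread_bps : fixed spread cost added on top of impact
    """
    if adv_shares <= 0:
        raise ValueError("adv_shares must be positive")
    participation = abs(order_shares) / adv_shares
    impact = impact_coeff * daily_vol * math.sqrt(participation)
    return half_spread_bps + impact * 1e4  # convert fraction to bps


# Hypothetical order: 1% of ADV in a stock with 2% daily volatility.
cost = sqrt_impact_cost_bps(10_000, 1_000_000, 0.02, half_spread_bps=2.0)
print(f"estimated cost: {cost:.1f} bps")
```

Because impact grows with the square root of participation, quadrupling the order size only doubles the impact term, which is why such a model penalizes large trades less severely than a linear one.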
Walk-Forward Validation
Quarterly expanding windows, 2-quarter purge gap, Harvey-Liu-Zhu (HLZ) multiple-testing threshold (t > 3.5), triple data split (40/45/15), bootstrap stability ≥ 0.90.
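The expanding-window scheme with a purge gap can be sketched in a few lines. This is an illustrative sketch, not the project's validation code: the Fold type, the helper names, and the minimum training length of 8 quarters are assumptions, while the 2-quarter purge matches the figure stated above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Fold:
    train: tuple[str, ...]   # quarters used for fitting
    purged: tuple[str, ...]  # quarters dropped between train and test
    test: str                # single out-of-sample quarter


def quarter_labels(start_year: int, start_q: int, n: int) -> list[str]:
    """Generate n consecutive quarter labels such as '2015Q1'."""
    labels = []
    year, q = start_year, start_q
    for _ in range(n):
        labels.append(f"{year}Q{q}")
        q += 1
        if q > 4:
            year, q = year + 1, 1
    return labels


def walk_forward_folds(quarters: list[str], min_train: int = 8,
                       purge: int = 2) -> list[Fold]:
    """Expanding-window folds: train on everything up to the purge gap,
    skip `purge` quarters, then test on the next quarter."""
    folds = []
    for test_idx in range(min_train + purge, len(quarters)):
        train_end = test_idx - purge
        folds.append(Fold(
            train=tuple(quarters[:train_end]),
            purged=tuple(quarters[train_end:test_idx]),
            test=quarters[test_idx],
        ))
    return folds


quarters = quarter_labels(2015, 1, 14)  # 2015Q1 through 2018Q2
for fold in walk_forward_folds(quarters):
    print(f"train {fold.train[0]}..{fold.train[-1]} | "
          f"purge {', '.join(fold.purged)} | test {fold.test}")
```

The purge gap keeps labels that overlap the test quarter out of the training set; each later fold adds one more quarter of training data, so the window expands rather than rolls.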
