Formula 1 Race Prediction & AI Analytics Platform

Client / Project: Formula 1 Prediction System
Year: 2026
Industry: Motorsport Analytics / Artificial Intelligence
Tech Stack: Python, XGBoost, LightGBM, CatBoost, QLoRA, Hugging Face, Ollama, Pandas, BeautifulSoup

Key Features

Historical Race Data Pipeline

Custom web scraping infrastructure collects and maintains 30 years of Formula 1 race data with intelligent HTML caching to minimize redundant requests.
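
As a rough illustration of the caching idea, the sketch below stores each downloaded page on disk under a hash of its URL and only touches the network on a cache miss. The function names (fetch_url, cached_html) and the cache layout are assumptions made for this example, not the project's actual code.

```python
import hashlib
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def fetch_url(url: str) -> str:
    """Download a page over HTTP (the only function that touches the network)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def cached_html(url: str, cache_dir: Path,
                fetch: Callable[[str], str] = fetch_url) -> str:
    """Return the HTML for `url`, downloading it only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so any address maps to a safe, fixed-length file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.html"
    if path.exists():
        return path.read_text(encoding="utf-8")
    html = fetch(url)
    path.write_text(html, encoding="utf-8")
    return html
```

Passing the fetcher in as a parameter keeps the cache logic testable without a network connection. A real scraper would likely also need an expiry rule for pages that change, such as the current season's standings.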

Advanced Feature Engineering

Generates 44 predictive features per driver per race, including rolling form metrics, championship standings, and circuit-specific performance statistics.
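
One of the feature families mentioned above, rolling form, could be computed along the following lines. This is a minimal sketch in plain Python: the field names (driver, position) and the three-race window are illustrative assumptions, and the real pipeline may well use Pandas for the same calculation.

```python
from collections import defaultdict, deque


def add_rolling_form(results, window=3):
    """Attach each driver's mean finishing position over their previous
    `window` races. `results` must be sorted chronologically."""
    history = defaultdict(lambda: deque(maxlen=window))
    rows = []
    for race in results:
        past = history[race["driver"]]
        # Compute the feature before recording the current result, so the
        # value only reflects races that were already finished.
        form = sum(past) / len(past) if past else None
        rows.append({**race, "rolling_form": form})
        past.append(race["position"])
    return rows
```

A driver's first race has no history, so the feature is None there. How such gaps are filled is a modelling choice this sketch leaves open.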

Multi-Model Prediction Engine

Combines six machine-learning approaches to predict race winners, podium finishers, and final race positions.
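
The page does not say how the six approaches are combined. One common option, shown here purely as an assumption, is to average each model's per-driver win probability and rank drivers by the result. The function name and input shape are hypothetical.

```python
def average_win_probabilities(model_outputs):
    """Average per-driver win probabilities across models and rank drivers.

    `model_outputs` is a list of dicts, one per model, mapping a driver
    name to that model's predicted win probability.
    """
    drivers = set().union(*(m.keys() for m in model_outputs))
    averaged = {
        d: sum(m.get(d, 0.0) for m in model_outputs) / len(model_outputs)
        for d in drivers
    }
    # Highest average probability first.
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)
```

Taking the first entry of the ranking gives a predicted winner, and the first three give a predicted podium.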

Data Leakage Prevention

Strict historical look-back methodology ensures models only use information available before each race weekend.
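
A look-back rule of this kind can be sketched as a walk-forward split: for every race, the training set contains only rows from races held strictly earlier. The race_date field and the function name are assumptions for illustration.

```python
from datetime import date


def walk_forward_splits(rows):
    """Yield (race_date, train_rows, test_rows) for each race in date order.

    Training rows always come from races held strictly before the race
    being predicted, so no future information leaks into the model.
    """
    dates = sorted({r["race_date"] for r in rows})
    for race_date in dates[1:]:  # the first race has no history to train on
        train = [r for r in rows if r["race_date"] < race_date]
        test = [r for r in rows if r["race_date"] == race_date]
        yield race_date, train, test
```

The same rule has to hold inside each feature as well: a rolling average for a given race must not include that race's own result.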

Local LLM Race Analysis

Integrates local large language models for race prediction and comparative AI reasoning without cloud dependencies.
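
A minimal sketch of querying a locally running model through Ollama's HTTP generate endpoint, using only the standard library, is shown below. The prompt wording, the helper names, and the model name in the usage note are assumptions. The endpoint address is Ollama's default and would differ on a customised install.

```python
import json
from urllib.request import Request, urlopen

# Default address of a local Ollama server (assumed unchanged here).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, race, grid):
    """Build a non-streaming generate request for a race prediction prompt."""
    lines = [f"P{i}: {name}" for i, name in enumerate(grid, start=1)]
    prompt = (
        f"Race: {race}\nStarting grid:\n" + "\n".join(lines)
        + "\nPredict the podium and briefly explain your reasoning."
    )
    return {"model": model, "prompt": prompt, "stream": False}


def ask_local_model(payload, url=OLLAMA_URL):
    """Send the payload to the local server and return the generated text."""
    request = Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

With a model already pulled locally, ask_local_model(build_payload("llama3", "Example Grand Prix", ["Driver A", "Driver B"])) would return the model's answer as a string. Sending the same payload to several models is one way to compare their reasoning side by side.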

Fine-Tuned Motorsport AI

A custom QLoRA-trained language model specializes in Formula 1 race analysis and prediction tasks.
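
To give a sense of why QLoRA fits on modest hardware, the arithmetic below compares the trainable parameters of a low-rank adapter with a full weight matrix, and the memory taken by weights at different bit widths. The layer size and rank in the usage note are hypothetical values chosen for illustration, not the project's configuration.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def weight_memory_gib(n_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB.

    Ignores activations, optimizer state, and quantization metadata.
    """
    return n_params * bits_per_param / 8 / 2**30
```

For a hypothetical 4096 x 4096 projection with rank 16, the adapter trains 131,072 parameters instead of 16,777,216, under 1% of the full matrix. Storing the frozen base weights in 4 bits rather than 16 cuts their memory footprint to a quarter.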

Interactive Data Visualization

Comprehensive analytics dashboards reveal long-term trends, driver performance, and circuit characteristics.

Live Race Prediction Workflow

End-to-end pipeline generates predictions for upcoming race weekends using freshly collected data.

Business Impact

High Prediction Accuracy

LightGBM achieved 87.5% top-three prediction accuracy during the 2024 season evaluation.
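
The page does not define how top-three accuracy is measured. One plausible definition, assumed here, is the share of predicted podium slots filled by drivers who actually finished in the top three. The function name and data layout are illustrative.

```python
def top_three_accuracy(predicted, actual):
    """Fraction of predicted podium drivers who actually finished top three.

    `predicted` maps a race name to the three predicted podium drivers;
    `actual` maps the same race name to the finishing order.
    """
    hits = total = 0
    for race, predicted_podium in predicted.items():
        actual_podium = set(actual[race][:3])
        # Order within the podium is ignored; only membership counts.
        hits += len(set(predicted_podium[:3]) & actual_podium)
        total += 3
    return hits / total if total else 0.0
```

Other definitions are possible, such as counting a race as correct only when all three podium drivers match, and they would give different figures for the same predictions.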

Reduced Infrastructure Costs

QLoRA fine-tuning produced a smaller, specialized model that performs close to larger models while requiring significantly less hardware.

Scalable Data Collection

Disk-based caching avoids re-downloading pages that have already been fetched, which cuts scraping time and unnecessary network requests.

Explainable Predictions

Feature engineering and visualization tools provide transparency into prediction outcomes.

Rapid Model Experimentation

Multiple model architectures enable continuous evaluation and optimization.

Real-Time Race Forecasting

The platform supports live prediction generation for future race weekends using newly collected data.

Efficient Local Deployment

The entire AI pipeline runs on consumer-grade hardware with a single RTX 4050 GPU.

Future Cloud Expansion

The architecture is prepared for GPU cloud deployment and API-based prediction services.
