Vishnu Datta Jayanti.

CS @ Purdue (Honors), building ML systems with real-world applications.

// 01 . log

education

B.S. Computer Science, Purdue University, John Martinson Honors College. Expected May 2030.

focus

ML research, applied ML, and quant backtesting. Focused on building predictive models and systematic strategies grounded in real-world data.

approach

I test things on real data before I trust them. Six projects so far, each one backtested, benchmarked, or peer reviewed in some form.

now

Working on volatility-based trading strategies and looking for research or internship opportunities in ML, quant, and SWE.

// 02 . research

Publications

May 2026 Wharton Sports Analytics Journal

Beyond the Expert: An Algorithmic Approach to Correcting Expert Bias in Fantasy Football Projections

Original research evaluating whether machine learning can mitigate cognitive bias in fantasy football projections. Position-specific XGBoost models were trained on ten years of NFL data with over 1,900 engineered features, improving ranking accuracy, especially for quarterbacks and tight ends, by reducing narrative-driven bias.

XGBoost · 10yr NFL data · 1,900+ features · bias correction
view paper ↗

// 03 . builds

Projects

2026 Python · XGBoost · scikit-learn · Pandas

NFL Expert Bias Mitigation

The functional implementation of my published research: a tool that generates fantasy football rankings from position-specific XGBoost models and compares them directly against ESPN's expert projections to surface undervalued and overvalued players. Includes a full evaluation pipeline that scores both against final season results to see who was actually right.
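A minimal sketch of the evaluation idea described above: rank players by model and by expert projections, score each ranking against final results, and flag where the two disagree. Spearman rank correlation is an assumed metric here, and the player names, point totals, and function names are hypothetical; the repo's actual pipeline may score things differently.

```python
def ranks(scores):
    """Map each player to a rank (1 = highest score). Assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {player: i + 1 for i, player in enumerate(ordered)}


def spearman(pred, actual):
    """Spearman rank correlation between two score dicts over the same players."""
    rp, ra = ranks(pred), ranks(actual)
    n = len(rp)
    d2 = sum((rp[p] - ra[p]) ** 2 for p in rp)
    return 1 - 6 * d2 / (n * (n * n - 1))


def rank_gap(model, expert):
    """Expert rank minus model rank. Positive = the model likes the player more."""
    rm, re_ = ranks(model), ranks(expert)
    return {p: re_[p] - rm[p] for p in rm}


# Hypothetical projected and final season points for four players.
model = {"A": 310.0, "B": 280.0, "C": 250.0, "D": 240.0}
expert = {"A": 300.0, "B": 230.0, "C": 290.0, "D": 260.0}
final = {"A": 320.0, "B": 270.0, "C": 255.0, "D": 200.0}

print(spearman(model, final))   # 1.0: the model ordering matches the final ordering
print(spearman(expert, final))  # 0.4, up to float rounding
print(rank_gap(model, expert))  # B has a gap of 2: undervalued by the expert ranking
```

A positive gap marks a player the expert ranking places lower than the model does, which is the "undervalued" case; a negative gap is the "overvalued" case.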

4 models: QB, RB, WR, TE
2013–2023 historical dataset
1,900+ features utilized
view repo ↗

Apr – May 2026 Python · Pandas · NumPy

Volatility-Adjusted Momentum Engine

A point-in-time backtesting framework for U.S. equities. A dual-filter strategy identifies price breakouts via 50/200-day moving average spreads, then normalizes signal strength using Garman-Klass volatility estimators and allocates capital inversely to risk. Built as six pipeline stages, from raw ticker partitioning through a daily-rebalanced execution simulator with liquidity guardrails.
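The three ingredients named above can be sketched in plain Python: a 50/200-day moving average spread, a Garman-Klass volatility estimate, and inverse-volatility weights. The per-bar Garman-Klass variance used here is the standard form, 0.5·ln(H/L)² − (2·ln 2 − 1)·ln(C/O)². Function names, the relative-spread definition, and the sample numbers are assumptions for illustration, not the repo's code.

```python
import math


def ma_spread(closes, fast=50, slow=200):
    """Relative gap between the fast and slow simple moving averages.

    Uses only closes up to and including the last element, so a value computed
    for day t never sees data from day t+1 (the point-in-time property).
    """
    if len(closes) < slow:
        raise ValueError("need at least `slow` closing prices")
    fast_ma = sum(closes[-fast:]) / fast
    slow_ma = sum(closes[-slow:]) / slow
    return (fast_ma - slow_ma) / slow_ma


def garman_klass_vol(bars):
    """Per-bar Garman-Klass volatility from (open, high, low, close) tuples."""
    total = 0.0
    for o, h, l, c in bars:
        hl = math.log(h / l)
        co = math.log(c / o)
        total += 0.5 * hl ** 2 - (2 * math.log(2) - 1) * co ** 2
    return math.sqrt(total / len(bars))


def inverse_vol_weights(vols):
    """Weights proportional to 1 / volatility, normalized to sum to 1."""
    inv = {ticker: 1.0 / v for ticker, v in vols.items()}
    total = sum(inv.values())
    return {ticker: x / total for ticker, x in inv.items()}


# Hypothetical inputs: 150 flat days followed by a 50-day step up.
closes = [1.0] * 150 + [2.0] * 50
print(ma_spread(closes))                               # 0.6, up to float rounding
print(inverse_vol_weights({"AAA": 0.01, "BBB": 0.03})) # AAA gets ~0.75, BBB ~0.25
```

Dividing `ma_spread` by `garman_klass_vol` for the same ticker is one plausible way to normalize signal strength by risk; whether the engine does exactly that is an assumption.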

$70K → $212K simulated portfolio, 12yr backtest
2012–2024 historical coverage
point-in-time: no look-ahead bias
view repo ↗

Oct 2024 – Aug 2025 Python · Flask · XGBoost · LangChain · NBA API

The Basketball Oracle

A deployed NBA analytics platform with live scores, standings, rosters, and box scores, plus a machine-learning core that predicts player performance for the upcoming season. A toggle switches between fast stat lookups and an AI search mode: a LangChain and Ollama RAG pipeline that answers natural-language basketball questions over a Wikipedia-derived knowledge base.
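To illustrate the retrieve-then-prompt shape of a RAG pipeline, here is a toy sketch using only the standard library. It swaps in bag-of-words cosine similarity for the embedding model and vector store, and stops at building the prompt rather than calling a model, so it is not the LangChain and Ollama setup itself. The passages, function names, and prompt wording are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word and number tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    return sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )[:k]


def build_prompt(question, passages, k=2):
    """Assemble the retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Stand-in knowledge base; the real one is described as Wikipedia-derived.
passages = [
    "A triple-double means double digits in three statistical categories.",
    "The shot clock in the NBA is 24 seconds.",
    "A regulation NBA game has four 12-minute quarters.",
]

print(build_prompt("How long is the NBA shot clock?", passages, k=1))
```

The string returned by `build_prompt` is what would be handed to the language model in the final generation step.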

0.71 R² avg across 5 NBA metrics
52,000+ indexed articles (RAG)
KMeans + XGBoost prediction core
view repo ↗

Jun – Aug 2025 Python · NLTK · WordNet

Semantic Frame Identifier

An NLP system that extracts three semantic frames (capital stock, commercial transaction, and business) from financial text. Named entities are corrected with a custom gazetteer, frames are matched against lexical units using Wu-Palmer and Leacock-Chodorow WordNet similarity scoring, and frame elements like buyer, seller, and monetary amount are pulled from surrounding context.
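A small sketch of how Wu-Palmer similarity can drive frame matching, written against a hand-made toy taxonomy rather than WordNet so it runs without any corpus download. The score is 2·depth(LCS) / (depth(a) + depth(b)), with the root at depth 1. The taxonomy, frame names, lexical units, and the 0.6 threshold are all hypothetical.

```python
# Toy is-a hierarchy: child -> parent. Not WordNet data.
PARENT = {
    "entity": None,
    "transaction": "entity",
    "purchase": "transaction",
    "acquisition": "purchase",
    "sale": "transaction",
    "asset": "entity",
    "stock": "asset",
}

# Hypothetical lexical units per frame.
FRAMES = {
    "commercial_transaction": ["purchase", "sale"],
    "capital_stock": ["stock"],
}


def ancestors(node):
    """Path from the node up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Number of nodes on the path to the root (root has depth 1)."""
    return len(ancestors(node))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    shared = set(ancestors(b))
    # Walking upward from a, the first shared node is the deepest common ancestor.
    lcs = next(n for n in ancestors(a) if n in shared)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def best_frame(word, frames, threshold=0.6):
    """Return (frame, score) for the best-matching lexical unit, or None."""
    score, frame = max(
        (wu_palmer(word, unit), name)
        for name, units in frames.items()
        for unit in units
    )
    return (frame, score) if score >= threshold else None


print(best_frame("acquisition", FRAMES))  # matches commercial_transaction via "purchase"
```

With NLTK installed and the WordNet corpus available, the same scores come from `Synset.wup_similarity` and `Synset.lch_similarity`, which is presumably what the project relies on.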

3 frames: stock, transaction, business
WordNet similarity scoring
custom gazetteer + NER correction
view repo ↗

2025 Flask · Python · JavaScript · XGBoost

Differential Diagnosis

A full-stack web app that predicts likely diseases from user-reported symptoms, built with a small team. I led the project, trained the XGBoost classifier behind the prediction engine, and built the Flask backend along with a health-history tracker that lets logged-in users review past diagnoses over time, all behind secure, hashed authentication.
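The description does not say which hashing scheme the app uses, so the following is only a generic sketch of salted password hashing with the standard library's PBKDF2. The iteration count, salt length, and function names are illustrative assumptions.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # illustrative work factor, not taken from the project


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```

Only the salt and digest are stored; the plaintext password is never written to the database.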

XGBoost symptom → disease classifier
3-person team, led by me
deployed live on Render

Feb – Apr 2026 C

VaultCLI

A command-line password manager written in C from the ground up. Master-password authentication gates a vault encrypted with a repeating-key XOR cipher, stored in custom binary structs via raw file I/O. Includes duplicate detection, strength scoring across length and character variety, and randomized 20-character password generation.
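The project itself is written in C; the sketch below restates three of its pieces in Python for brevity: the repeating-key XOR transform, a strength score over length and character variety, and 20-character generation. The scoring thresholds, character set, and function names are assumptions rather than VaultCLI's actual rules.

```python
import secrets
import string


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR. Applying it twice with the same key restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def strength_score(password: str) -> int:
    """Score 0-5: one point for length >= 12, one per character class present."""
    checks = [
        len(password) >= 12,
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ]
    return sum(checks)


def generate_password(length: int = 20) -> str:
    """Random password drawn from letters, digits, and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


secret = b"hunter2"
encrypted = xor_cipher(secret, b"key")
print(xor_cipher(encrypted, b"key") == secret)  # True: XOR is its own inverse
print(strength_score("Abcdefgh1234!"))          # 5
print(len(generate_password()))                 # 20
```

Because encryption and decryption are the same operation, a single function covers both directions, which keeps the vault read and write paths symmetric.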

custom binary storage format
XOR cipher built from scratch
zero external crypto libraries
view repo ↗

// 04 . stack

Languages / Tools

languages

  • Python
  • C
  • HTML
  • CSS

ml & data

  • XGBoost
  • scikit-learn
  • Pandas
  • NumPy
  • NLTK

infra & tools

  • Flask
  • LangChain
  • Git / GitHub

// 05 . contact

Reach Out.

Open to research collaborations, internships, and interesting problems.