The Cluster at a glance
Sources → ingestion → storage → Spark compute → research → observability. See how the pieces stay deterministic and debuggable.
I design and operate a trading data platform
Tackling big data challenges with modern tools
How It Started
In 2021 I became curious about the stock market. Over a weekend I hacked together a small Ruby on Rails app, scraped historical prices from Yahoo! Finance, and stored them in a local Postgres database. I started with a simple question: which S&P 500 stocks tend to move together, and which tend to move in opposite directions? I ran an analysis comparing returns across a universe of 50 tickers. It took three days to finish.
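As an illustration only, that question boils down to a pairwise return-correlation pass. The `correlated_pairs` helper, the column layout, and the synthetic tickers below are all assumptions for the sketch, not the code the Rails app actually ran:

```python
import numpy as np
import pandas as pd

def correlated_pairs(prices: pd.DataFrame, top_n: int = 5):
    """prices: one column of closing prices per ticker, one row per date."""
    returns = prices.pct_change().dropna()   # daily simple returns
    corr = returns.corr()                    # ticker x ticker correlation matrix
    # Keep each unordered pair once: strictly upper triangle of the matrix.
    upper = ~np.tril(np.ones(corr.shape, dtype=bool))
    pairs = corr.where(upper).stack().sort_values()
    # Most positive pairs move together; most negative move in opposite directions.
    return pairs.tail(top_n), pairs.head(top_n)

# Tiny synthetic universe: A and B share identical returns, C mirrors them.
prices = pd.DataFrame({
    "A": [1.00, 1.10, 1.32, 1.716],
    "B": [2.00, 2.20, 2.64, 3.432],
    "C": [1.00, 1.30, 1.56, 1.716],
})
together, opposite = correlated_pairs(prices, top_n=1)
```

With 50 tickers this is cheap; the trouble starts when the pair count grows quadratically with the universe.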
The results were interesting, but the turnaround was too slow to be useful. I wanted to refresh the analysis daily and include the entire market. Scaling to 10,000 tickers implied a naïve work-factor increase of over 120,000×, and my single-threaded Rails job running on a spare laptop wasn't up to the task.
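One way the "over 120,000×" figure pencils out, assuming the comparison work grows quadratically with universe size and the three-day run has to compress into a daily refresh:

```python
# Back-of-envelope only: assumes all-pairs comparisons (quadratic growth)
# and a refresh window shrinking from three days to one.
small_universe, full_market = 50, 10_000
compute_growth = (full_market / small_universe) ** 2  # 200^2 = 40,000x the pairs
deadline_squeeze = 3                                  # 3-day job must finish in 1 day
work_factor = compute_growth * deadline_squeeze
print(f"{work_factor:,.0f}x")                         # prints "120,000x"
```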
How It's Going
Since then I've built out a research and trading platform, from bare-metal hardware to streaming Spark pipelines powering live trading algorithms. Modern open-source software and consumer hardware make it possible to build surprisingly powerful systems.
The Right Tool for the Job
I started on a single laptop, scraping public data and running a single-threaded Rails job. Rails was the framework I knew best from 10+ years of full-stack web development. Unfortunately, it wasn't going to help me work with market data at scale.
I love Rails; it changed my life. The ORM and MVC architecture made sense to me, and I was able to use them to solve problems for myself and others. I had the same experience a second time when I spent a year learning Flutter: suddenly mobile development was in my skill set.
Fortunately, lightning struck a third time. Learning Spark and gaining the power to solve problems I had thought were out of my league was thrilling. Big data engineering is in my toolbelt now, and I'm excited to see where it takes me next.
Employment
I've built this platform alongside full‑time software engineering roles, delivering features and supporting production systems while deepening my data engineering skills.
I enjoy partnering with data scientists and other engineers to stabilize and optimize existing stacks. In my next role, I’m looking to focus that experience on data engineering problems that matter to your business.