Blog
The Problem With Labels Is Everything Before the Labels
Four problems stacked on top of each other: label design, subjectivity, resolution, and redundancy. What survived was 16 labels - and a clearer understanding of what unstructured data can actually add.
315,000 Properties Lost
Three bugs stacked on top of each other, each one visible only after the previous was fixed. Here is what happened and what bulletproof checkpointing actually looks like.
What Happens When You Run an LLM on 1.5 Million Texts
Chinese hallucinations, 20+ artifact patterns, and the six-layer fallback system I built to get 1.4M clean translations.
VLMs Are Confident Liars
The model invented features that weren't there, miscounted visible elements, and described adjacent objects as part of the subject. The fix wasn't a better model - it was a better prompt.
The Role Without a Name
What happens when AI writes all the code and the hard skill becomes knowing what to build.
Our 68-Hour Estimate Took 7 Days
Why notebook benchmarks lie about production performance, and where the bottleneck actually was.