Standalone flagship build · AI Systems Builder
Eval-Driven RAG for Technical Documents
Built an evaluation-first RAG system for dense technical documents, using a labeled benchmark to measure retrieval quality separately from grounded-answer quality before optimizing generation.
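
A minimal sketch of what scoring retrieval and grounded answers separately could look like, not the project's actual harness. The `Example` schema, the `recall_at_k` and `evaluate` helpers, and the choice of recall@k as the retrieval metric are all assumptions made for illustration. The groundedness label is assumed to come from a human or judge pass over each generated answer.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One labeled benchmark item (hypothetical schema)."""
    question: str
    gold_chunk_ids: set[str]   # chunks labeled as containing the answer
    retrieved_ids: list[str]   # ranked retriever output, best first
    answer_is_grounded: bool   # label: answer is supported by retrieved text


def recall_at_k(ex: Example, k: int) -> float:
    """Fraction of gold chunks that appear in the top-k retrieved chunks."""
    if not ex.gold_chunk_ids:
        return 0.0
    hits = ex.gold_chunk_ids & set(ex.retrieved_ids[:k])
    return len(hits) / len(ex.gold_chunk_ids)


def evaluate(examples: list[Example], k: int = 5) -> dict[str, float]:
    """Report retrieval quality and grounded-answer quality as separate numbers."""
    if not examples:
        raise ValueError("benchmark is empty")
    n = len(examples)
    mean_recall = sum(recall_at_k(ex, k) for ex in examples) / n

    # Score generation only on items where retrieval surfaced at least one
    # gold chunk, so retrieval misses are not blamed on the generator.
    hits = [ex for ex in examples if recall_at_k(ex, k) > 0]
    grounded = sum(ex.answer_is_grounded for ex in hits) / len(hits) if hits else 0.0

    return {
        "recall_at_k": mean_recall,
        "retrieval_hit_rate": len(hits) / n,
        "grounded_given_hit": grounded,
    }
```

Reporting `grounded_given_hit` apart from `recall_at_k` is the design point: a low first number with a high second one suggests the generator is the bottleneck, while the reverse points at retrieval.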