Answers that cite their sources

A language model will happily tell you anything. That confidence is exactly the problem when the subject matter is something people care about getting right.

For our conversational projects, we set a hard rule early: every answer must be traceable to a trusted source, or the assistant says it does not know. No exceptions, no graceful-sounding guesses.

The shape of the problem

Retrieval-augmented generation is the usual answer: fetch relevant passages, hand them to the model, and ask it to answer from those passages. It works, until the question falls outside the corpus. Then the model quietly falls back on whatever it absorbed in training and starts improvising.

That failure is invisible to the user. The answer looks just as fluent as a grounded one. Fluency is not truth.

Out-of-corpus fallback

Our approach adds an explicit gate. Before answering, a lightweight pass decides whether the corpus actually covers the question:

In corpus → answer strictly from retrieved passages, with citations.
Out of corpus → say so plainly, and offer what is covered nearby.

question → retrieve → coverage check
                         ├─ covered    → grounded answer + sources
                         └─ not covered → honest "outside what I know"

The coverage check is the interesting part. It is cheap, it runs before the expensive generation, and it turns a silent failure into an honest one.
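
To make the shape of the gate concrete, here is a minimal sketch, not our production code. It assumes the coverage check can be approximated by thresholding the best retrieval score. The real check may well be a trained classifier. Every name here (Passage, Answer, retrieve, is_covered, answer, the 0.5 threshold) is hypothetical, and the word-overlap retriever is a toy stand-in for a real search index.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str
    text: str


@dataclass
class Answer:
    covered: bool
    text: str
    sources: list


def _tokens(text):
    # Lowercased words with surrounding punctuation stripped.
    return {w.strip(".,?!\"'").lower() for w in text.split()} - {""}


def retrieve(question, corpus, k=3):
    """Toy retriever: score each passage by the share of question words it contains."""
    q = _tokens(question)
    scored = []
    for p in corpus:
        score = len(q & _tokens(p.text)) / len(q) if q else 0.0
        scored.append((score, p))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def is_covered(scored, threshold=0.5):
    # Assumption: "covered" means the best passage clears a fixed score threshold.
    return bool(scored) and scored[0][0] >= threshold


def answer(question, corpus, generate, threshold=0.5):
    scored = retrieve(question, corpus)
    if not is_covered(scored, threshold):
        # The expensive generation step is never called on this branch.
        return Answer(False, "That is outside what I know.", [])
    passages = [p for score, p in scored if score >= threshold]
    text = generate(question, passages)  # the model sees only these passages
    return Answer(True, text, [p.source for p in passages])
```

The generate argument stands in for the model call. Passing it in as a plain function keeps the gate testable without a model, and makes it easy to confirm that an out-of-corpus question never reaches generation.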

Why “I don’t know” is a feature

Users trust a system that admits its limits far more than one that has been confidently wrong even once. Every honest refusal is a small deposit in a trust account you cannot refill after a single fabrication.

That is the whole philosophy in miniature: we would rather ship something that says less and means it.

Where this is going

This architecture underpins more than one project in the studio. As it stabilizes we will write about the pieces in detail: the coverage classifier, the citation format, and the way we keep latency low with two model passes instead of one.

More soon.