A Visual Primer

Nobody wrote the rules.

A model’s knowledge, manner and quirks all come from how it was raised. Seven stages.

01

Learned, not programmed.

Nobody sat down and typed in the law of negligence. There is no file inside the model that lists the elements of a claim, no rule that says how to draft a chronology. The model learned everything the same way: by playing one game — guess the next word — across a vast amount of text.

Everything else follows from that. The knowledge, the manner, the gaps and the quirks are all consequences of how the model was trained, not decisions anyone made line by line.
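
The guessing game can be sketched in a few lines. This is a toy illustration, not how a production model is built: it counts which word followed which in a tiny, hypothetical corpus and predicts the most frequent follower. The function names and the corpus are assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words followed it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical toy corpus; a real model reads vastly more text.
corpus = (
    "the claimant owed a duty of care . "
    "the defendant owed a duty of care . "
    "the defendant breached the duty of care ."
)
model = train_bigram(corpus)

print(predict_next(model, "duty"))  # of
print(predict_next(model, "of"))    # care
print(predict_next(model, "tort"))  # None: the word never appeared in the corpus
```

No line of this code mentions a duty of care. The phrase comes out only because it was in the text the counts were built from.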

How software usually works

Rules written by people

A programmer decides what happens in every case, and the machine does exactly that.

How models work

Patterns absorbed from text

Nobody writes the behaviour. It emerges from what the model read and how it was corrected.

02

Pretraining: the reading years.

Stage one is scale. The model reads a colossal corpus — books, articles, code, the public web — predicting the next word billions of times and getting marginally better each time. This is where the knowledge comes from, along with the grammar, the style, and the reasoning patterns.

It is also where the cutoff comes from. The model’s world stops on the day the corpus was assembled. Anything decided, published or amended after that date is simply not in there.

the corpus
Everything the model will ever “know”, read once, left to right.
cutoff
The world after this date does not exist for the model.

A judgment handed down the week after the cutoff is invisible — not obscure, not half-remembered. Absent.
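
A minimal sketch of why the cutoff is absolute, assuming a hypothetical set of dated documents and a made-up cutoff date. Anything dated after the cutoff never enters the material the model learns from, so its words are not merely rare but missing.

```python
from datetime import date

# Hypothetical documents with publication dates (illustrative only).
documents = [
    (date(2023, 3, 1), "court of appeal allows the appeal"),
    (date(2023, 9, 15), "supreme court dismisses the appeal"),
    (date(2024, 2, 10), "supreme court overrules the earlier decision"),
]

def build_vocabulary(docs, cutoff):
    """Collect every word from documents dated on or before the cutoff."""
    vocab = set()
    for published, text in docs:
        if published <= cutoff:
            vocab.update(text.split())
    return vocab

cutoff = date(2023, 12, 31)  # assumed cutoff for the example
known = build_vocabulary(documents, cutoff)

print("dismisses" in known)  # True: published before the cutoff
print("overrules" in known)  # False: published after the cutoff
```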

03

What pretraining produces.

The raw pretrained model — the base model — is not an assistant. It has never been told to be helpful. Ask it a question and it may simply continue with more questions, because on the internet, questions are often followed by other questions. It completes text; it does not serve you.

It has the knowledge but not the manners. It is autocomplete with a doctorate. Turning that into something you can brief takes two more stages of upbringing, instruction tuning and human feedback, and you can watch them happen below.

04

One prompt, three upbringings.

“What should I do if I’ve missed a limitation deadline?”
BASE MODEL

What should I do if I’ve missed a filing deadline? What should I do if my solicitor missed a deadline? 17 answers · Legal forum · Sorted by newest —

Not an answer — a continuation. On the internet, questions are followed by more questions.
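
The same effect can be reproduced in miniature. The sketch below assumes a hypothetical forum index page and labels each line as a question or a statement, then counts which kind of line tends to follow which. It is a deliberately crude stand-in for what a base model absorbs from text.

```python
from collections import Counter

def line_kind(line):
    """Label a line as a question or a statement by its final character."""
    return "question" if line.rstrip().endswith("?") else "statement"

def transition_counts(lines):
    """Count how often each kind of line is followed by each kind."""
    kinds = [line_kind(line) for line in lines]
    return Counter(zip(kinds, kinds[1:]))

def most_likely_after(counts, kind):
    """Return the kind of line that most often followed `kind` in the data."""
    options = {nxt: n for (prev, nxt), n in counts.items() if prev == kind}
    return max(options, key=options.get) if options else None

# Hypothetical forum index page: mostly a list of question titles.
forum_page = [
    "What should I do if I've missed a filing deadline?",
    "What should I do if my solicitor missed a deadline?",
    "Can I extend time after a deadline has passed?",
    "17 answers, sorted by newest",
    "Is a late claim always struck out?",
    "Who pays costs if a claim is out of time?",
]
counts = transition_counts(forum_page)

print(most_likely_after(counts, "question"))  # question
```

On data like this, the most likely thing to follow a question is another question, so a pure predictor continues the list rather than answering.
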
05

Taught to please.

Stage three is human feedback. Humans compare pairs of answers and pick the better one, over and over; the model is tuned towards the answers people preferred. This is where the helpfulness comes from — the structure, the clear headings, the appropriate caution.

It also has a side effect: agreeableness. A model rewarded for answers people like will lean towards telling you your argument is strong.
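
How a preference for agreeable answers can end up inside the model is easier to see with numbers. The sketch below is an assumption-laden toy: it fits a simple logistic (Bradley-Terry style) scoring model to a handful of hypothetical rater choices, with each answer reduced to two made-up features. The source does not say which method any particular lab uses; this only illustrates the general idea of learning from pairwise comparisons.

```python
import math

# Each answer is reduced to two hypothetical 0/1 features:
# (agrees_with_user, factually_careful).
# Each pair is (preferred_answer, rejected_answer) as chosen by a rater.
comparisons = [
    ((1, 1), (0, 1)),  # equally careful; rater picked the agreeable one
    ((1, 0), (0, 0)),
    ((1, 1), (1, 0)),  # equally agreeable; rater picked the careful one
    ((1, 0), (0, 1)),  # agreeable beat careful
    ((0, 1), (0, 0)),
    ((1, 1), (0, 0)),
]

def score(weights, features):
    """Weighted sum of an answer's features: its learned 'reward'."""
    return sum(w * x for w, x in zip(weights, features))

def train_reward(pairs, steps=500, lr=0.1):
    """Fit weights so preferred answers score higher than rejected ones."""
    weights = [0.0, 0.0]
    for _ in range(steps):
        for preferred, rejected in pairs:
            margin = score(weights, preferred) - score(weights, rejected)
            p = 1.0 / (1.0 + math.exp(-margin))  # modelled chance the rater picks `preferred`
            # Gradient ascent on the log-likelihood of the rater's choice.
            for i in range(len(weights)):
                weights[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return weights

weights = train_reward(comparisons)
print("agreeableness weight:", round(weights[0], 2))
print("carefulness weight:  ", round(weights[1], 2))
```

With these made-up choices, the learned weight on agreeing with the user comes out positive and larger than the weight on care. Nobody asked for flattery; it was simply what the raters' picks rewarded.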

Why this matters for a lawyer

Praise from a model is weak evidence. It was trained on what people preferred to hear, and people prefer to hear that their skeleton argument is persuasive.

The fix is to make honesty the thing it is asked for. Adversarial instruction works with the training, not against it:

“Find the problems with this.”
06

The quirks, explained.

Quirk | Where it comes from
Hedges and caveats | Human raters preferred caution
Agrees with you too readily | Preferred answers were agreeable ones
Confident tone even when wrong | Fluent confidence reads well; raters can't check every fact
Knowledge stops at a date | The corpus was frozen at the cutoff
Better at common problems than rare ones | Prediction is strongest where examples were plentiful
07

Raised, not built.

A model’s behaviour is the residue of its upbringing: the reading (pretraining), the schooling (instruction tuning), the finishing (human feedback). Nothing about how it acts was written down as a rule; all of it was absorbed.

That is why prompting works the way it does — you are steering patterns, not invoking commands. And it is why hallucination persists: a machine trained to produce plausible text will produce plausible text even when the facts have run out.

It is also why every generation of models behaves a little differently. Nobody rewrote the rules — there were never any rules to rewrite. The upbringing changed.

That’s the whole story.

From next-word prediction to a colleague you can brief: you’ve now seen every layer of the machine. Explore the full series, or put it to work.
