The Databoard
An open method
An AI-powered verbal
visualization of data.
"The limits of my language are the limits of my world."
— Ludwig Wittgenstein, 1922
AI changed the limits of our world.
The limits of our language are now the limits of a new world.
Step 1

This is how we show
Titanic survival data.

A bar chart. Authoritative. Clean. In textbooks for decades.

Survival Rate by Passenger Class (RMS Titanic · 1912)
1st Class: 63% survived
2nd Class: 47% survived
3rd Class: 24% survived
Crew: 24% survived
What the graph says
Wealth determined survival. 1st class survived at roughly 2.6× the rate of 3rd class (63% vs. 24%). The conclusion is clear. The graph is closed.
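The rate comparison in that reading is simple arithmetic on the chart's percentages. A minimal sketch in Python, using only the four figures shown in the chart; the variable and function names are illustrative and not part of the Databoard method:

```python
# Percent survived per group, copied from the bar chart above.
survival_rate = {"1st Class": 63, "2nd Class": 47, "3rd Class": 24, "Crew": 24}


def rate_ratio(rates, a, b):
    """Return how many times higher group a's survival rate is than group b's."""
    return rates[a] / rates[b]


ratio_1st_vs_3rd = rate_ratio(survival_rate, "1st Class", "3rd Class")
print(f"1st vs. 3rd class: {ratio_1st_vs_3rd:.1f}x")  # 63 / 24 = 2.625
```

The ratio describes the chart only. It says nothing about why the rates differ, which is the question the next step takes up.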
But did wealth cause survival — or was it a proxy for something else?
Step 2

What if we used
human vocabulary instead?

A group describes what they know about what actually happened on that ship — not column names, but words from domain knowledge and lived reasoning.

None of these words appear in the original dataset. They come from what humans know: proximity to exits, whether passengers understood the crew, whether they knew the ship was sinking.
upper deck · below deck · near exit · far from exit · informed early · informed late · unaware · understood crew · language barrier · with family · alone · equal access · chose to stay · time to board
Which of these survive scrutiny?
Step 3

The AI evaluates
each word.

Not generating answers — filtering human proposals. Which words are grounded in observable fact? Which carry assumptions? Which project conclusions the data cannot support?

Legend: grounded · assumed · interpretive · rejected
upper deck · below deck · near exit · far from exit · informed early · informed late · unaware · understood crew · language barrier · with family · alone · equal access · chose to stay · time to board
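One way to picture the filtering step is as a verdict attached to each proposed word. The sketch below is an assumption about how that could be represented, not the Databoard's actual implementation: `Verdict`, `Tile`, and `keep_for_board` are hypothetical names, and the only verdict taken from this page is the rejection of "equal access". The other two verdicts are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    # The four categories from the legend above.
    GROUNDED = "grounded"
    ASSUMED = "assumed"
    INTERPRETIVE = "interpretive"
    REJECTED = "rejected"


@dataclass(frozen=True)
class Tile:
    word: str
    verdict: Verdict


def keep_for_board(tiles):
    """Drop rejected words; keep the rest with their verdict attached."""
    return [t for t in tiles if t.verdict is not Verdict.REJECTED]


tiles = [
    Tile("equal access", Verdict.REJECTED),   # stated on this page
    Tile("near exit", Verdict.GROUNDED),      # placeholder verdict (assumption)
    Tile("informed early", Verdict.ASSUMED),  # placeholder verdict (assumption)
]

kept = keep_for_board(tiles)
print([t.word for t in kept])
```

Keeping the verdict on each surviving tile, rather than discarding it, preserves the distinction between words that are observable and words that carry assumptions.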
"Equal access" didn't survive. That rejection is itself an insight.
Step 4

The vocabulary splits
by joker words.

Green = dominant in this group. Yellow = present but rare. Red = edge case.

▲  Who Survived
upper deck · near exit · below deck · informed early · informed late · unaware · understood crew · language barrier · with family · alone · time to board
vs
▼  Who Died
upper deck · near exit · below deck · informed early · informed late · unaware · understood crew · language barrier · with family · alone · time to board
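The colour rule can be sketched as a threshold on how much of a group a word describes. The page does not say where the cut-offs lie, so the thresholds below (50% for dominant, 10% for rare) and all counts are assumptions chosen for illustration; `tile_color` and `build_board` are hypothetical names.

```python
def tile_color(share, dominant=0.5, rare=0.1):
    """Map the share of a group that a word describes to a board colour.

    The thresholds are illustrative assumptions, not values from the method.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    if share >= dominant:
        return "green"   # dominant in this group
    if share >= rare:
        return "yellow"  # present but rare
    return "red"         # edge case


def build_board(word_counts, group_size):
    """Colour every word on one board from its count within that group."""
    return {word: tile_color(n / group_size) for word, n in word_counts.items()}


# Made-up counts per 100 people in each group, for illustration only.
died_board = build_board({"below deck": 70, "with family": 20, "upper deck": 4}, 100)
survived_board = build_board({"below deck": 5, "upper deck": 60}, 100)

print(died_board)
print(survived_board)
```

The same word is coloured independently on each board, which is what allows a tile such as "below deck" to be green on one side and red on the other.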
Step 5

What the vocabulary
found that the graph missed.

3rd class is yellow on the death board.
"Below deck" is green.
Yellow means present. Green means dominant. Class appears on both boards, so it is not the differentiating variable. "Below deck," "unaware," and "language barrier" are green on death and red on survival. Class was traveling with the real variables. It was not the real variable. An analysis of class alone would find a strong correlation. The vocabulary shows what that correlation was proxying for.
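The proxy argument can be made concrete with a toy stratified table. The counts below are synthetic and chosen only to show the pattern; they are not Titanic data. In this sketch, class and location are strongly associated, survival depends only on location, and class still shows a large gap when viewed alone.

```python
# SYNTHETIC counts for illustration only; NOT Titanic data.
# (passenger class, location) -> (survived, total)
counts = {
    ("1st", "upper deck"): (63, 90),
    ("1st", "below deck"): (2, 10),
    ("3rd", "upper deck"): (7, 10),
    ("3rd", "below deck"): (18, 90),
}


def rate_by(index):
    """Survival rate grouped by one key position (0 = class, 1 = location)."""
    survived, total = {}, {}
    for key, (s, t) in counts.items():
        group = key[index]
        survived[group] = survived.get(group, 0) + s
        total[group] = total.get(group, 0) + t
    return {group: survived[group] / total[group] for group in total}


by_class = rate_by(0)      # class alone: a large gap appears
by_location = rate_by(1)   # location alone: an even larger gap
within = {key: s / t for key, (s, t) in counts.items()}  # class gap inside each location

print(by_class)
print(by_location)
print(within)
```

With these numbers, 1st class survives at 65% and 3rd class at 25%, yet inside each location both classes survive at the same rate (70% on the upper deck, 20% below deck). The class gap comes entirely from where each class was located, which is the sense in which class can stand in for another variable.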
"Understood crew" is green on the survival board.
But "language barrier" is green on the death board.
Same dimension. Perfectly inverted. Most 3rd class passengers were emigrants — with only a handful of translators on board, many couldn't understand the crew's instructions. The graph had no column for this. The vocabulary found it.
"Language barrier" and "unaware" are both green on the death board.
Two separate tiles — but together they tell one story. Many 3rd class passengers couldn't understand the crew AND didn't know the ship was sinking. Neither tile alone explains the death rate. Together they do. That combination is invisible in any single graph.
The words that mattered most —
distance, language barrier, informed early
were not in the data.
What if building the vocabulary
was part of the analysis?
→   See the demo
→   Read the method
→   Case study