What AI Needs Before It Can Answer a Business Question
Last week, I spent a morning at Snowflake's Data for Breakfast event, which is exactly what it sounds like: an early session where practitioners walk through what's new in the platform and where customers are actually getting value.
The through-line across most of the presentations wasn't a specific feature. It was a concept that kept surfacing as a prerequisite for making AI work reliably against real business data. They called it the semantic layer.
The framing was direct: without it, AI tools querying your warehouse will produce answers that are fast, fluent, and inconsistently grounded in what your business actually measures. With it, you give the AI something reliable to reason against.
That framing stuck with me — partly because it applies well beyond Snowflake, and partly because it names something that data teams have wrestled with long before AI entered the picture.
What a Semantic Layer Actually Is
The term sounds more technical than it needs to be. At its core, a semantic layer is a shared definition of what your data means — codified in a way that systems (and people) can use consistently.
It sits between your raw data and whatever is consuming it, whether that's a BI dashboard, an AI assistant, or a direct query from an analyst. Its job is to answer questions that your data schema can't: What is "revenue" here? Which customers count as "active"? How do we measure "churn"?
Those definitions exist in every organization. They're just rarely written down in one place. Instead, they live in a spreadsheet someone built three years ago, in a thread that's been archived, in the institutional knowledge of two senior analysts who remember why a particular field gets filtered a certain way.
The semantic layer is what happens when you decide that those definitions are too important to leave scattered.
Why This Matters More Now
This problem isn't new. The semantic layer has always existed in every data-driven organization. It just hasn't always been written down.
It lived in your analysts. The person who knew that revenue figures need to exclude intercompany transactions. The one who remembered why the customer count in the marketing dashboard is always slightly different from the one in finance. The analyst who, when handed a vague question, knew which version of "active user" the executive actually wanted.
That institutional knowledge was the semantic layer. It was just stored in people instead of systems.
AI tools change that equation. When you ask Copilot or Claude or a Snowflake Cortex assistant a business question, you're essentially asking it to do what that analyst did — interpret ambiguous language, apply organizational context, and return something meaningful. The difference is that the AI has no institutional memory to draw on. It can't ask a clarifying question the way a good analyst would, and it won't flag uncertainty the way a careful one would.
If you want consistent, trustworthy answers from AI tools, you have to externalize what your analysts have always carried internally. That's what building a semantic layer actually means in practice: taking the knowledge that lived in people and codifying it somewhere a system can find it.
What Goes Into One
A semantic layer isn't a single file or a single tool. It's a collection of decisions that get made explicit. The easiest way to see what that means is to start with something familiar.
Say you want Claude or Copilot to help you write blog posts consistently. A semantic layer for that use case is just a document the AI references before it starts. It defines your audience precisely enough that the AI doesn't have to guess. It captures what to avoid — "don't open with a broad generalization" is more actionable than "be specific." It establishes structural conventions: how you open, whether you use headers, how you close.
That's it. Not a style guide. Not a template. A concise set of definitions that replace the assumptions the AI would otherwise make on its own.
Scale that idea to a data warehouse and the components change — revenue, churn, time conventions, which records to exclude — but the principle is identical. Take the knowledge that currently lives in people, and put it somewhere a system can find it.
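To make that concrete, here is a minimal sketch of what "put it somewhere a system can find it" can look like. The metric names, rules, and structure are illustrative examples, not a real schema or any vendor's format:

```python
# A minimal, illustrative semantic layer: shared metric definitions
# codified as data instead of tribal knowledge. The metrics and
# rules below are hypothetical examples.
SEMANTIC_LAYER = {
    "revenue": {
        "definition": "Sum of invoiced amounts, excluding intercompany transactions.",
        "sql_hint": "SUM(amount) WHERE is_intercompany = FALSE",
    },
    "active_user": {
        "definition": "A user with at least one login in the trailing 30 days.",
        "sql_hint": "COUNT(DISTINCT user_id) WHERE last_login >= CURRENT_DATE - 30",
    },
    "churn": {
        "definition": "A customer whose subscription lapsed this month and was not renewed.",
        "sql_hint": "status = 'lapsed' AND renewed = FALSE",
    },
}

def lookup(metric: str) -> str:
    """Return the agreed definition, or flag the gap instead of guessing."""
    entry = SEMANTIC_LAYER.get(metric.lower())
    if entry is None:
        return f"UNDEFINED: '{metric}' has no agreed definition yet."
    return entry["definition"]
```

The important design choice isn't the format (a dict, a YAML file, a Snowflake semantic model all work); it's that undefined metrics produce an explicit flag rather than a silent guess.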
How to Actually Build One
Start by identifying where the AI is getting it wrong. Those gaps are your scope. For a writing assistant, it might be tone drift or openings that keep generalizing. For a data tool, it might be headcount figures that include contractors, or revenue numbers that don't match finance.
Write each definition in plain language first. If you can't write it clearly, the definition isn't settled yet.
Then give it to the system. For a chat-based tool, a system prompt is the most direct path:
"Before drafting, apply the following: audience is engineering managers and staff engineers with 10+ years experience. Tone is direct and practitioner-grounded — no hype, no absolutes. Never open with a broad generalization. Use short paragraphs and active voice. Close with synthesis, not a call to action."
For something more reference-heavy — a full voice guide, a list of metric definitions — attach it as a document and instruct the model to consult it:
"The attached document defines how I measure revenue, customer activity, and churn. Use these definitions for any analysis. If a question touches a metric not defined here, flag it rather than assuming."
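In code, "attach it and instruct the model to consult it" usually amounts to prepending the definitions to the system prompt. A sketch of that composition step, with placeholder definitions and prompt wording (not any vendor's API):

```python
def build_system_prompt(definitions: dict[str, str]) -> str:
    """Compose a system prompt that pins the model to agreed definitions."""
    lines = [f"- {name}: {text}" for name, text in sorted(definitions.items())]
    return (
        "The following definitions govern all analysis:\n"
        + "\n".join(lines)
        + "\nIf a question touches a metric not defined here, "
        "flag it rather than assuming."
    )

# Hypothetical definitions, kept deliberately short.
DEFINITIONS = {
    "revenue": "Invoiced amounts, excluding intercompany transactions.",
    "active user": "At least one login in the trailing 30 days.",
}

prompt = build_system_prompt(DEFINITIONS)
```

Whether the definitions travel as a system prompt, an attached document, or a retrieval index matters less than the closing instruction: the model is told what to do when it hits a gap.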
Assign one person to own it. A semantic layer without an owner drifts — definitions go stale, edge cases accumulate. Ownership doesn't have to be formal, but it has to exist.
When the AI produces something wrong, ask whether it's a model failure or a definition gap. More often than you'd expect, it's the latter. Treat bad outputs as diagnostic signals and update the layer accordingly.
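That triage can itself be partly mechanical: scan an answer for metric terms and check which ones have settled definitions. A rough sketch, where both the term list and the defined set are illustrative stand-ins for your own layer:

```python
import re

# Hypothetical state: what the semantic layer covers, and the broader
# vocabulary of metric terms your organization uses.
DEFINED = {"revenue", "churn", "active user"}
KNOWN_METRIC_TERMS = {"revenue", "churn", "active user", "headcount", "margin"}

def triage(answer: str) -> dict[str, list[str]]:
    """Split metric terms found in an AI answer into covered vs definition gaps."""
    text = answer.lower()
    found = [t for t in KNOWN_METRIC_TERMS if re.search(rf"\b{re.escape(t)}\b", text)]
    return {
        "covered": sorted(t for t in found if t in DEFINED),
        "definition_gaps": sorted(t for t in found if t not in DEFINED),
    }
```

If a wrong answer leans on a term in `definition_gaps`, the fix belongs in the layer, not in the prompt.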
The Bottom Line
AI tools are good at reasoning. They're not good at knowing what your business means by "revenue" unless you tell them.
The semantic layer is how you tell them. It's not glamorous work. There's no demo that makes it look impressive. But it's the difference between an AI assistant that gives fast, fluent answers and one that gives fast, fluent, accurate answers.
That's a distinction worth investing in.
