May 28, 2026

Funnels, retention and paths from one filter model

Most analytics vendors will happily sell you three products. Funnels are one tier. Retention cohorts are another, usually a pricier one. Journey paths are a third, often bolted on with a "talk to sales" button. Each comes with its own dashboard, its own pricing line, and — if you're unlucky — its own instrumentation requirements. The pitch is that these are fundamentally different capabilities that happen to live in the same suite.

They aren't. Funnels, retention, and paths are three readings of exactly the same underlying object: an ordered stream of events, per session or per visitor, sitting in a table somewhere. Once you have that stream and a decent way to query it, all three fall out of the same machinery. TrackWhy builds them from one filter model. No re-instrumentation, no separate tools, no data team to wire it together.

This post is about that filter model — what it is, and how funnels, retention, and paths are just different ways of pointing it at your data.

One typed filter algebra

Everything starts with the same primitive: a filter over events and their properties.

An event has a name (pageview, signup, add_to_cart) and a bag of properties: the path, the referrer, the country, the UTM tags, and whatever custom props you've attached. A filter is a predicate over those. The simplest ones are leaf conditions:

event = "signup"
props.plan in ["pro", "team"]
referrer contains "twitter"
country = "DE"

Leaves compose with boolean operators into expressions:

event = "signup"
  AND props.plan in ["pro", "team"]
  AND country = "DE"
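
To make "a filter is a predicate" concrete, here is a minimal Python sketch of leaves and boolean composition. It is an illustration, not TrackWhy's implementation: the helper names (field, eq, is_in, contains, all_of, any_of, negate) and the dict-shaped events are assumptions made for the example.

```python
from typing import Any, Callable

Event = dict[str, Any]            # assumed shape: {"event": ..., "country": ..., "props": {...}}
Filter = Callable[[Event], bool]  # a filter is just a predicate over one event


def field(event: Event, path: str) -> Any:
    """Resolve a dotted name like 'props.plan'; missing keys yield None."""
    value: Any = event
    for key in path.split("."):
        value = value.get(key) if isinstance(value, dict) else None
    return value


# Leaf conditions: each returns a predicate over a single event.
def eq(path: str, expected: Any) -> Filter:
    return lambda e: field(e, path) == expected


def is_in(path: str, options: list) -> Filter:
    return lambda e: field(e, path) in options


def contains(path: str, needle: str) -> Filter:
    def check(e: Event) -> bool:
        value = field(e, path)
        return isinstance(value, str) and needle in value
    return check


# Boolean combinators: filters in, filter out.
def all_of(*filters: Filter) -> Filter:
    return lambda e: all(f(e) for f in filters)


def any_of(*filters: Filter) -> Filter:
    return lambda e: any(f(e) for f in filters)


def negate(f: Filter) -> Filter:
    return lambda e: not f(e)


german_paid_signup = all_of(
    eq("event", "signup"),
    is_in("props.plan", ["pro", "team"]),
    eq("country", "DE"),
)

events = [
    {"event": "signup", "country": "DE", "props": {"plan": "pro"}},
    {"event": "signup", "country": "FR", "props": {"plan": "team"}},
    {"event": "pageview", "country": "DE", "props": {"path": "/pricing"}},
]
matches = [e for e in events if german_paid_signup(e)]
```

Because every combinator returns another Filter, german_paid_signup can itself be passed to all_of, any_of, or negate without any special casing.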

The important word is typed. Every field has a type, and the UI knows it. event is an enum drawn from the event names you actually send, so it offers =, in, and not in — not >. country is an enum of ISO codes. props.revenue is a number, so it offers >, <, and between. props.path is a string, so it offers contains, starts with, matches. referrer is a string too, but a structured one, so you can filter on its host.

Typing isn't cosmetic. It's what lets the filter builder stay honest: you never get offered country > 5, and you never have to guess whether a value is "true" the string or true the boolean. It also means the same expression can be compiled straight to SQL against the event store without a pile of runtime coercion. The filter algebra is closed under AND/OR/NOT, so any expression you can build in one place is a first-class value you can pass anywhere a filter is expected. Hold onto that — it's the whole trick.
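
A rough sketch of what typing buys you, under stated assumptions: the SCHEMA table, the OPERATORS table, the Leaf/And/Or/Not classes, and the to_sql function below are all hypothetical names invented for illustration, and the generated SQL uses generic ? placeholders rather than any particular engine's dialect. The point is only that an ill-typed leaf is rejected when it is built, and a well-typed expression compiles to a parameterized WHERE clause.

```python
from dataclasses import dataclass

# Hypothetical schema: field name -> type tag.
SCHEMA = {
    "event": "enum",
    "country": "enum",
    "props.plan": "enum",
    "props.revenue": "number",
    "props.path": "string",
    "referrer": "string",
}

# Operators offered per type, following the lists in the text.
OPERATORS = {
    "enum": {"=", "in", "not in"},
    "number": {">", "<", "between"},
    "string": {"=", "contains", "starts with", "matches"},
}


@dataclass(frozen=True)
class Leaf:
    field: str
    op: str
    value: object

    def __post_init__(self):
        kind = SCHEMA.get(self.field)
        if kind is None:
            raise ValueError(f"unknown field: {self.field}")
        if self.op not in OPERATORS[kind]:
            raise TypeError(f"{self.field} is {kind}; {self.op!r} is not offered")


@dataclass(frozen=True)
class And:
    parts: tuple


@dataclass(frozen=True)
class Or:
    parts: tuple


@dataclass(frozen=True)
class Not:
    inner: object


def to_sql(node):
    """Return (sql_fragment, params). Values never get spliced into the SQL text."""
    if isinstance(node, Leaf):
        col = f'"{node.field}"'
        if node.op in ("in", "not in"):
            marks = ", ".join("?" for _ in node.value)
            return f"{col} {node.op.upper()} ({marks})", list(node.value)
        if node.op == "between":
            low, high = node.value
            return f"{col} BETWEEN ? AND ?", [low, high]
        if node.op == "contains":      # note: % and _ in the value are not escaped here
            return f"{col} LIKE ?", [f"%{node.value}%"]
        if node.op == "starts with":
            return f"{col} LIKE ?", [f"{node.value}%"]
        if node.op in ("=", ">", "<"):
            return f"{col} {node.op} ?", [node.value]
        raise NotImplementedError(node.op)  # e.g. "matches" is left out of this sketch
    if isinstance(node, Not):
        sql, params = to_sql(node.inner)
        return f"NOT ({sql})", params
    joiner = " AND " if isinstance(node, And) else " OR "
    pieces = [to_sql(part) for part in node.parts]
    sql = "(" + joiner.join(s for s, _ in pieces) + ")"
    return sql, [p for _, ps in pieces for p in ps]


expr = And((
    Leaf("event", "=", "signup"),
    Leaf("props.plan", "in", ["pro", "team"]),
    Leaf("country", "=", "DE"),
))
sql, params = to_sql(expr)
```

Here Leaf("country", ">", 5) raises TypeError at construction, which is the code-level version of the builder never offering that operator in the first place.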

Funnels: the filter as a step

A funnel is an ordered sequence of filter steps.

Define step 1 as a filter, step 2 as another filter, and so on. Conversion at step N is the count of sessions (or visitors) that matched step N after already matching steps 1 through N−1, in that order, within a conversion window. That's the entire definition.

So a signup funnel might be:

Step 1:  event = "pageview" AND props.path = "/"
Step 2:  event = "pageview" AND props.path = "/pricing"
Step 3:  event = "signup"

Three things matter here, and they're the things people get wrong:

Ordering is part of the definition. Someone who hit /pricing before the homepage and then signed up did not convert through this funnel in the way you drew it. A funnel is a sequence, not a set. If you only care about set membership, that's a different (and weaker) question.

The conversion window matters. "Signed up after seeing pricing" means nothing without "within how long?" A 30-minute window and a 30-day window measure different products. The window is a parameter of the funnel, not an afterthought, because it decides which step-2-then-step-3 transitions count as conversions at all.

Each step is just a filter from the same algebra. Step 2 isn't a special "funnel step" type. It's event = "pageview" AND props.path = "/pricing" — the identical expression you could drop into a segment, a path anchor, or a retention qualifier. Steps are filters that happen to be arranged in a list.

What you get out of this is step-to-step drop-off: the percentage that survives each transition. The biggest cliff is usually the most interesting thing on the page. And because each step is a filter, drill-down is free: click a step and you can pull exactly the sessions that reached it, or the ones that dropped right after it. "Show me the people who got to pricing and didn't sign up" is just the set of sessions that reached step 2 minus the set that went on to reach step 3, evaluated over the same stream.
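
The funnel definition above can be sketched in a few lines of Python. This is an in-memory illustration, not how the engine evaluates funnels: steps_reached and funnel_counts are hypothetical names, timestamps are plain seconds, and the window is assumed to be measured from the event that matched step 1.

```python
def steps_reached(events, steps, window):
    """events: list of (timestamp, event) sorted by timestamp.
    Returns how many steps were matched in order, within `window`
    seconds of a step-1 match (best result over all step-1 matches)."""
    best = 0
    for i, (start_ts, ev) in enumerate(events):
        if not steps[0](ev):
            continue
        depth = 1
        for ts, later in events[i + 1:]:
            if ts - start_ts > window:
                break                      # outside the conversion window
            if depth < len(steps) and steps[depth](later):
                depth += 1                 # matched the next step, in order
        best = max(best, depth)
    return best


def funnel_counts(sessions, steps, window):
    """counts[k] = number of sessions that reached step k+1."""
    counts = [0] * len(steps)
    for events in sessions.values():
        for k in range(steps_reached(events, steps, window)):
            counts[k] += 1
    return counts


def page(path):
    return lambda e: e["event"] == "pageview" and e.get("props", {}).get("path") == path


steps = [page("/"), page("/pricing"), lambda e: e["event"] == "signup"]


def pv(ts, path):
    return (ts, {"event": "pageview", "props": {"path": path}})


def signup(ts):
    return (ts, {"event": "signup"})


sessions = {
    "s1": [pv(0, "/"), pv(60, "/pricing"), signup(120)],    # converts through all three steps
    "s2": [pv(0, "/pricing"), pv(30, "/"), signup(90)],     # wrong order: stops at step 1
    "s3": [pv(0, "/"), pv(100, "/pricing"), signup(5000)],  # signup falls outside the window
    "s4": [pv(0, "/pricing")],                              # never matches step 1
}

counts = funnel_counts(sessions, steps, window=1800)        # [3, 2, 1]
survival = [after / before for before, after in zip(counts, counts[1:]) if before]
```

Sessions s2 and s3 are the two failure modes from the list above: s2 has the right events in the wrong order, and s3 has the right order but a signup that arrives after the 30-minute window has closed.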

Retention: the filter across time buckets

Retention asks a different question of the same data: of the people who showed up in some period, how many keep coming back and doing something that matters?

Cohort retention groups users by their first-seen period — the cohort. Everyone first seen in week 0 is one cohort, week 1 another. Then for each subsequent period you measure the share of that cohort who returned and matched a qualifying filter in period N. Render that as a grid — cohorts down the rows, periods across the columns — and you get the classic triangle, each cell a retention percentage that decays as you move right.

The qualifying action is, once again, a filter. "Retained" might mean any activity (event in [...]), or it might mean something with teeth like event = "report_run" or props.path starts with "/app". The engine evaluates one question per cell: did this user match filter F in week N? Same predicate, evaluated across time buckets instead of arranged in a sequence. That's the only structural difference from a funnel.
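
As a sketch of the "one question per cell" idea, the grid can be computed like this. The assumptions are loud ones: retention_grid is a hypothetical name, rows are already reduced to (user_id, week_index, event) tuples, and user_id is a stable identifier, which in practice means an account id for logged-in traffic.

```python
from collections import defaultdict


def retention_grid(rows, qualifies):
    """rows: iterable of (user_id, week_index, event).
    Returns {cohort_week: [share retained at offset 0, 1, 2, ...]}."""
    rows = list(rows)

    # Cohort = the first week in which the user was seen at all.
    first_seen = {}
    for user, week, _ in rows:
        first_seen[user] = min(week, first_seen.get(user, week))

    cohort_users = defaultdict(set)
    for user, week in first_seen.items():
        cohort_users[week].add(user)

    # One question per cell: did this user match the filter in week N?
    retained = defaultdict(set)            # (cohort, offset) -> users who qualified
    for user, week, event in rows:
        if qualifies(event):
            cohort = first_seen[user]
            retained[(cohort, week - cohort)].add(user)

    last_week = max(week for _, week, _ in rows)
    return {
        cohort: [
            len(retained[(cohort, offset)]) / len(users)
            for offset in range(last_week - cohort + 1)
        ]
        for cohort, users in sorted(cohort_users.items())
    }


def ran_report(e):
    return e["event"] == "report_run"


run = {"event": "report_run"}
view = {"event": "pageview"}

rows = [
    ("a", 0, run), ("a", 1, run), ("a", 2, run),
    ("b", 0, run), ("b", 1, run),
    ("c", 0, run),
    ("d", 1, run), ("d", 2, view),
]

grid = retention_grid(rows, ran_report)
# Cohort 0 is {a, b, c}: all three qualify in week 0, two in week 1, one in week 2.
# Cohort 1 is {d}: qualifies in its first week, then only views pages.
```

Later cohorts get shorter rows because fewer weeks have elapsed for them, which is where the triangle shape comes from.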

The cookieless caveat, stated honestly

Retention leans on stable identity across days, and this is where a privacy-first, cookieless product has to be straight with you. Without persistent cross-day cookies, durable visitor identity over long windows is genuinely limited. We're not going to pretend a device fingerprint is a user.

So frame retention where the identity is real:

Logged-in / account-based retention. When you pass us a stable account or user id for authenticated traffic, cohorts are exact, and this is where retention earns its keep — it's a product-engagement metric, and product engagement happens behind a login.
Session-model retention. For anonymous traffic, retention is best read at the session and short-window level, where the identity model is sound, rather than stretched into a 12-week claim it can't support.

Anyone selling you pixel-perfect anonymous 90-day retention without cookies is selling you a hallucination. We'd rather give you a number you can stand behind.

Paths: the filter as an anchor

Paths are the literal ordered sequences visitors traverse — the actual / → /pricing → /signup → /app walks through your site, aggregated so you can see the common routes and where they branch.

Two directions:

Forward paths start from an anchor filter and show what happens next. "Everyone who landed on /pricing — where did they go?"
Backward paths end at an anchor filter and show what led there. "Everyone who hit /signup — how did they get here?"

The anchor is a filter. The body of the path is the same ordered event stream you funneled and bucketed, just traversed instead. At each hop you aggregate the next (or previous) events across everyone still on the path, rank them, and branch. The most-trodden routes float to the top; the long tail of one-off walks collapses into "other." No new data, no new tracking — the difference between a path and a funnel is that a funnel checks a sequence you specified, and a path discovers the sequences that actually occurred.
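
A simplified sketch of path aggregation, assuming sessions are already ordered lists of pageview events. The function name common_paths and its hops, top, and backward parameters are illustrative choices, and it ranks whole walks rather than building a branching tree hop by hop.

```python
from collections import Counter


def common_paths(sessions, anchor, hops=2, top=3, backward=False):
    """Count the walks that follow (or, if backward, precede) the anchor.
    Forward uses the first anchor match in a session, backward the last.
    Walks beyond the `top` most common are collapsed into "other"."""
    walks = Counter()
    for events in sessions:
        ordered = events[::-1] if backward else events
        for i, ev in enumerate(ordered):
            if anchor(ev):
                walk = [e["path"] for e in ordered[i + 1 : i + 1 + hops]]
                # Backward walks are flipped back into chronological order.
                # A session that ends on the anchor contributes the empty walk ().
                walks[tuple(walk[::-1] if backward else walk)] += 1
                break
    ranked = walks.most_common()
    result = dict(ranked[:top])
    tail = sum(n for _, n in ranked[top:])
    if tail:
        result["other"] = tail
    return result


def at(path):
    return lambda e: e["path"] == path


def visit(*paths):
    return [{"event": "pageview", "path": p} for p in paths]


sessions = [
    visit("/", "/pricing", "/signup", "/app"),
    visit("/pricing", "/signup", "/app"),
    visit("/blog", "/pricing", "/"),
    visit("/pricing", "/docs"),
    visit("/", "/blog"),                      # never hits the anchor
]

forward = common_paths(sessions, at("/pricing"), hops=2, top=1)
backward = common_paths(sessions, at("/signup"), hops=2, backward=True)
```

Here forward comes out as {("/signup", "/app"): 2, "other": 2} and backward as {("/", "/pricing"): 1, ("/pricing",): 1}. The anchor is an ordinary predicate, so the same at("/pricing") could equally serve as a funnel step.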

Why one model beats three tools

Here's the payoff of keeping a single filter algebra under all of it: a filter is portable across every view.

Build a filter for funnel step 2 — event = "pageview" AND props.path = "/pricing". That same expression is a valid retention qualifier (did they come back and view pricing?) and a valid path anchor (what do pricing-viewers do next?). You author the predicate once and reuse it everywhere a filter is accepted, because the algebra is closed and the types line up. No translation layer between "funnel language" and "cohort language," because there's only one language.

And it makes drill-down real instead of decorative. Every table in the product — top pages, referrers, funnel steps, cohort cells, path hops — is a list of rows, and every row corresponds to a filter. Click it and that filter ANDs into the dashboard's active filter set. This is the honest answer to the perennial "what's past rank 10?" complaint: you can search, sort, and paginate the full distribution, then click any row to filter the entire dashboard down to it and re-ask all three questions in that narrower slice. Drill into "Germany," and your funnel, your cohorts, and your paths are now all German. One click, three views, same model.
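
The click-to-filter mechanic is small enough to sketch. The Dashboard class below and its drill, visible, and table methods are hypothetical, invented to show the shape of the idea: every row is a "field = value" filter, and clicking it ANDs that filter into the active set that every table reads from.

```python
from collections import Counter


class Dashboard:
    def __init__(self, events):
        self.events = events
        self.active = []                  # active filter set, ANDed together

    def drill(self, field, value):
        """Clicking a row adds 'field = value' to the active filters."""
        self.active.append(lambda e: e.get(field) == value)

    def visible(self):
        return [e for e in self.events if all(f(e) for f in self.active)]

    def table(self, field):
        """Full ranked distribution for a field, not only the top rows."""
        return Counter(e[field] for e in self.visible()).most_common()


events = [
    {"country": "DE", "path": "/pricing"},
    {"country": "DE", "path": "/"},
    {"country": "DE", "path": "/pricing"},
    {"country": "FR", "path": "/"},
    {"country": "FR", "path": "/blog"},
]

dash = Dashboard(events)
countries = dash.table("country")         # [("DE", 3), ("FR", 2)]
dash.drill("country", "DE")               # the user clicks the "DE" row
german_pages = dash.table("path")         # [("/pricing", 2), ("/", 1)]
```

Any view that reads from visible() is narrowed by the same click, which is the sense in which one drill-down re-asks every question in the smaller slice.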

None of this is cheap to compute naively, but it isn't computed naively. It runs over a columnar event store, ClickHouse-style, where scanning one column across hundreds of millions of events is what the engine is built to do. Funnel windows, cohort grids, and path aggregation are all column scans with grouping, which is the case columnar stores were designed for. So "fast at scale" isn't an aspiration; it's just using the right storage for an append-only event stream.

You don't need three tools and a data team

Funnels, retention, and paths look like three products because three products are easier to invoice for. Underneath, they're one ordered event stream and one typed filter algebra, read three ways: arranged in a sequence, bucketed across time, or traversed hop by hop.

Build the filter once. Use it everywhere. Click any row to go deeper. That's the whole feature — and it's why you don't need three SaaS subscriptions and a data team to find out where your users drop off, whether they come back, and how they got there.