How AI reads customer conversations to catch risk early

The clearest churn signals live in language, not telemetry. Here is how modern AI turns thousands of messages, calls, and threads into structured, account-level risk you can act on.

Eli Brandt

Founding Engineer

For most of the history of Customer Success software, the qualitative half of the customer relationship was invisible to the system. Product analytics could tell you what customers clicked, and billing could tell you what they paid. But the actual texture of the relationship (the tone of a support thread, the hesitation on a renewal call, the offhand "we might revisit this next year") lived only in human memory. The richest churn signals were the ones no tool could read.

That constraint is what modern language models have lifted. AI can now read customer conversations at the scale and consistency a human team never could, and turn unstructured language into structured, account-level risk. Here is how that actually works, without the hand-waving.

The problem with reading conversations by hand

A CSM with eighty accounts cannot read every Slack message, every email, every meeting transcript across their book, every week. Even if they could, they could not hold all of it in mind at once to spot that three accounts independently mentioned a competitor this month, or that sentiment in a key account has been drifting downward since February. The bottleneck was never that the signal was missing. It was that comprehensively reading and correlating it exceeded human bandwidth.

What the AI is actually doing

Reading conversations for risk is not a single trick. It is a pipeline of distinct steps, each grounded in evidence so the output can be trusted and traced.

1. Ingest and attribute

Every message, email, call transcript, and note is pulled from the source tools and attributed to the right account and the right people. This sounds mundane, but it is foundational: a signal you cannot tie to a specific account and renewal is just noise. Knowing that a particular Slack thread belongs to a particular at-risk customer is half the value.
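To make attribution concrete, here is a minimal sketch that ties each message to an account by the sender's email domain. Everything in it is an assumption for illustration: the `Message` shape, the `DOMAIN_TO_ACCOUNT` lookup, and the example addresses are hypothetical, and a real system would likely resolve identities from the CRM and handle many more cases (shared domains, personal addresses, Slack user IDs).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup from email domain to account ID. In practice this
# would come from the CRM rather than a hard-coded dict.
DOMAIN_TO_ACCOUNT = {
    "acme.example": "acct_acme",
    "globex.example": "acct_globex",
}


@dataclass
class Message:
    source: str          # e.g. "slack", "email", "call_transcript"
    sender_email: str
    text: str
    account_id: Optional[str] = None


def attribute(message: Message) -> Message:
    """Attach an account ID based on the sender's email domain."""
    domain = message.sender_email.rsplit("@", 1)[-1].lower()
    message.account_id = DOMAIN_TO_ACCOUNT.get(domain)
    return message


def split_attributed(messages):
    """Separate messages we could attribute from those needing review."""
    attributed, unattributed = [], []
    for m in map(attribute, messages):
        (attributed if m.account_id else unattributed).append(m)
    return attributed, unattributed


inbox = [
    Message("email", "dana@acme.example", "Can we talk about renewal terms?"),
    Message("slack", "sam@unknown.example", "Quick question about exports."),
]
attributed, unattributed = split_attributed(inbox)
```

Keeping the unattributed messages in their own list, rather than dropping them, reflects the point above: a signal with no account is noise until someone or something resolves where it belongs.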

2. Extract meaning, not keywords

Older tools scanned for trigger words like "cancel" or "unhappy" and drowned in false positives, because language does not work that way. A customer saying "we are definitely not going to cancel" trips a keyword filter for exactly the wrong reason. Language models read for intent and sentiment in context: frustration expressed politely, enthusiasm that has cooled, a buying signal phrased as a casual aside. The unit of analysis is meaning, not string match.
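The failure mode of keyword matching is easy to reproduce. The sketch below is a deliberately naive trigger-word filter, not any particular tool's implementation; the word list and sample sentences are taken from the examples in this article. It flags the reassuring message and misses the hesitant one.

```python
import re

# Naive trigger-word filter of the kind described above.
TRIGGER_WORDS = re.compile(r"\b(cancel|unhappy)\b", re.IGNORECASE)


def keyword_flag(text: str) -> bool:
    """Return True if the text contains any trigger word."""
    return bool(TRIGGER_WORDS.search(text))


samples = [
    "We are definitely not going to cancel.",      # reassuring, yet flagged
    "Honestly, we might revisit this next year.",  # hesitant, yet missed
]
flags = [keyword_flag(s) for s in samples]

for text, flagged in zip(samples, flags):
    print(f"{'FLAG' if flagged else 'ok  '}  {text}")
```

Both results are wrong in the way that matters: the first is a false positive, the second a missed risk signal. Reading for intent in context is what closes that gap.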

3. Correlate across channels and time

A single cool email is a weak signal. That same email, alongside a support escalation last week and a 20% usage drop this month, is a pattern. The real power is in correlation: tying the qualitative signal from conversations to the quantitative signal from product and billing data, and watching how it all moves over time for a given account. Risk is a convergence of signals, and convergence is what the system is built to detect.
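One simple way to express convergence is to count how many distinct kinds of signal appear for an account inside a time window. The sketch below assumes a hypothetical `Signal` record, a 30-day window, and a threshold of two kinds; the dates and accounts are made up for illustration, and a production system would presumably weight signals rather than just count them.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Signal:
    account_id: str
    kind: str        # "conversation", "support", or "usage"
    observed: date
    detail: str


def converging_kinds(signals, account_id, as_of, window_days=30):
    """Distinct signal kinds seen for one account inside the window."""
    cutoff = as_of - timedelta(days=window_days)
    return {
        s.kind
        for s in signals
        if s.account_id == account_id and cutoff <= s.observed <= as_of
    }


def is_pattern(signals, account_id, as_of, min_kinds=2):
    """A pattern needs several different kinds of signal, not one repeated."""
    return len(converging_kinds(signals, account_id, as_of)) >= min_kinds


today = date(2024, 6, 30)  # illustrative date
signals = [
    Signal("acct_acme", "conversation", date(2024, 6, 27), "cool reply on renewal email"),
    Signal("acct_acme", "support", date(2024, 6, 21), "escalation opened"),
    Signal("acct_acme", "usage", date(2024, 6, 30), "usage down 20% this month"),
    Signal("acct_globex", "conversation", date(2024, 6, 28), "cool reply on renewal email"),
]

print(is_pattern(signals, "acct_acme", today))    # three kinds converge
print(is_pattern(signals, "acct_globex", today))  # one cool email only
```

Counting kinds rather than raw events is the design choice worth noting: ten cool emails are still one kind of evidence, while an email, an escalation, and a usage drop together are three.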

4. Surface with evidence attached

The output is not a black-box number. For every account flagged at risk, the system points to the specific messages, calls, and events that drove the assessment. A CSM should be able to click into a red account and read the two escalations and the cooling thread that explain it. Without that, you have replaced one form of guesswork with another.
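In data terms, "evidence attached" means the assessment carries its sources with it instead of collapsing them into a number. Here is a minimal sketch under stated assumptions: the `Evidence` and `RiskAssessment` types, the reference strings, and the red/yellow rule are all hypothetical, chosen only to show the shape of a traceable result.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    source: str      # e.g. "support", "slack"
    reference: str   # hypothetical pointer back to the original item
    excerpt: str


@dataclass
class RiskAssessment:
    account_id: str
    level: str
    evidence: list[Evidence] = field(default_factory=list)

    def explain(self) -> str:
        """Render the assessment with every item that drove it."""
        lines = [f"{self.account_id}: {self.level}"]
        lines += [
            f"  [{e.source}] {e.excerpt} ({e.reference})" for e in self.evidence
        ]
        return "\n".join(lines)


def assess(account_id: str, evidence: list[Evidence]) -> RiskAssessment:
    """Refuse to flag an account unless there is evidence to show for it."""
    if not evidence:
        raise ValueError("refusing to flag an account without evidence")
    # Illustrative rule only: three or more items is "red", fewer is "yellow".
    level = "red" if len(evidence) >= 3 else "yellow"
    return RiskAssessment(account_id, level, list(evidence))


assessment = assess(
    "acct_acme",
    [
        Evidence("support", "ticket-1", "Escalated: export job failing again"),
        Evidence("support", "ticket-2", "Escalated: SSO outage"),
        Evidence("slack", "thread-1", "We might revisit this next year."),
    ],
)
print(assessment.explain())
```

The guard in `assess` encodes the principle directly: a flag with nothing behind it never gets created, so a CSM clicking into a red account always has something to read.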

The goal is not to replace the CSM's judgment. It is to make sure their judgment is applied to the right ten accounts, with the relevant evidence already in front of them, instead of being spread thin across eighty.

Why evidence and traceability are non-negotiable

A risk score nobody trusts is a risk score nobody acts on. The single biggest determinant of whether a team adopts an AI health signal is whether they can see why it fired. This is why grounding every assessment in the underlying conversations matters more than squeezing out another point of model accuracy. Trust is the product. A slightly less precise score with visible evidence will be used; a marginally better one that asks for blind faith will be ignored.

How Merrily does it

Merrily runs exactly this pipeline across the sources you already use: Slack, Gmail, Granola and other meeting notes, your CRM, support tools, and product analytics. It reads every customer conversation, attributes it, extracts intent and sentiment, correlates it with usage and billing, and rolls it into one live health score per account, with the source messages always one click away. The reading that no Customer Success team has the hours to do, done continuously, with the receipts attached. For where these conversational signals sit in a complete health model, read what a customer health score should measure.

See this on your own accounts

Connect your stack and watch your first health scores land within the hour.