Guide to Customer Churn Prediction
Learn the different ways to predict churn, which models to use and how AI is making churn prediction and prevention more powerful.
What is customer churn prediction?
Customer churn prediction is the process of using data from a variety of sources to identify which customers are likely to stop using or paying for a product or service. The goal is to spot customers who are likely to leave before they actually do and take appropriate action to ensure they stay.
Why is it impactful?
Two key reasons: revenue impact and effectiveness in retaining customers.
Nothing hurts business growth faster than a churn problem. New customers are expensive to acquire, and a large portion of revenue growth comes from existing customers, which makes dealing with churn proactively critical.
Churn prediction lets you deal with churn proactively, which is far more likely to succeed than reacting once the customer has already made up their mind. Imagine you are not having a great experience with a new service you are trying. If someone reached out to help, you would be much more likely to stay and find value in it. Now imagine that they only reached out after you cancelled. The effort required to change your mind and bring you back to the product is much larger, and the same is true for your customers.
How do you actually predict customer churn?
Definitions: What counts as churn
This is the first and most important step in predicting churn: deciding what counts as churn. It sounds like a simple question, but it is harder than you might think.
Let’s start with the easy pieces. Is churn:
- An account that stopped paying?
- An account that is not using the product after some period?
- An account that has not made a new purchase within some period after their previous one?
Next, how do you define “an account”? Is it a user, a team or a company? This matters a lot for scenarios where you might have multiple teams or users from the same company paying separately. Is the company churned just because a single user has? Is the company even converted at that point?
If you are a subscription business, you will want to think about the moment of churn. Assuming churn is the user no longer paying, does this happen:
- When a paid subscription ends?
- When they click cancel or make the decision not to renew?
Having clear answers to what churn is, when you count it and which unit (user, team or company) you are predicting it for is critical to being able to predict it.
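One way to pin these answers down is to write the definition as code, so that everyone counts churn the same way. The sketch below is illustrative only: the `Account` fields, the three `count_at` options and the 30-day inactivity window are assumptions, not something the definitions above prescribe.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Account:
    # Hypothetical fields; map them to whatever your billing and analytics tools expose.
    account_id: str
    paid_until: Optional[date]          # end of the last paid period
    cancel_clicked_on: Optional[date]   # day the customer chose not to renew
    last_active_on: Optional[date]      # last day anyone on the account used the product


def churn_date(account: Account, count_at: str = "period_end",
               inactivity_days: int = 30) -> Optional[date]:
    """Return the day churn is counted on, or None if no churn is recorded."""
    if count_at == "cancel_click":
        return account.cancel_clicked_on
    if count_at == "period_end":
        # Assumption: the paid period only ends in churn if the customer cancelled.
        return account.paid_until if account.cancel_clicked_on else None
    if count_at == "inactive":
        if account.last_active_on is None:
            return None
        return account.last_active_on + timedelta(days=inactivity_days)
    raise ValueError(f"unknown churn moment: {count_at}")


def is_churned(account: Account, as_of: date, count_at: str = "period_end") -> bool:
    churned_on = churn_date(account, count_at)
    return churned_on is not None and churned_on <= as_of


acme = Account("acme", paid_until=date(2024, 3, 31),
               cancel_clicked_on=date(2024, 3, 10), last_active_on=date(2024, 3, 5))
print(is_churned(acme, date(2024, 3, 15), "cancel_click"))  # True
print(is_churned(acme, date(2024, 3, 15), "period_end"))    # False
```

The same account is churned or not on the same day depending on which moment you count, which is why the definition needs to be fixed before any modelling starts.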
Features: Signals that predict churn
Now that we have decided what churn is, we can start collecting signals for predicting it. These signals will eventually be turned into what the machine learning world calls ‘Features’. Features are pieces of information you feed into a predictive model to both identify patterns and make predictions.
Start by thinking about how you, your customer success team or even the customer themselves would predict whether an account would churn and what information you would want to make that prediction.
For instance, you may believe that if a customer doesn’t do a certain action during onboarding they might churn or that certain industries or roles convert more than others.
Once you have a sense of what signals you want to use, you will need to obtain the data for them. The data might come from billing providers (e.g. Stripe), analytics tools (e.g. Mixpanel, Amplitude), CRMs (e.g. HubSpot, Salesforce) or other sources for things like demographics or firmographics.
Each of these signals needs to be turned into features that a machine learning model can understand. For example, the fact that a user did a specific action can become a yes/no flag, a count of how often they did it, or the number of days since they last did it.
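As a rough illustration of turning an action into features, the sketch below converts raw events into numbers a model can use. The event shape, the `invited_teammate` action and the `-1` sentinel for "never happened" are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical raw events, shaped like an export from an analytics tool.
events = [
    {"user_id": "u1", "name": "invited_teammate", "at": datetime(2024, 1, 3)},
    {"user_id": "u1", "name": "invited_teammate", "at": datetime(2024, 1, 9)},
    {"user_id": "u2", "name": "viewed_dashboard", "at": datetime(2024, 1, 4)},
]


def action_features(events, user_id, action, as_of):
    """Turn "did the user do this action?" into numeric features as of a given time."""
    times = [
        e["at"] for e in events
        if e["user_id"] == user_id and e["name"] == action and e["at"] <= as_of
    ]
    return {
        f"did_{action}": int(bool(times)),
        f"{action}_count": len(times),
        # -1 is an assumed sentinel meaning the action never happened.
        f"days_since_{action}": (as_of - max(times)).days if times else -1,
    }


print(action_features(events, "u1", "invited_teammate", datetime(2024, 1, 15)))
# {'did_invited_teammate': 1, 'invited_teammate_count': 2, 'days_since_invited_teammate': 6}
```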
You can read more about feature engineering here.
Now that you have a good set of signals and the data to support it, you will need to turn that into specific features and training data.
Training data: Examples of churn and happy customers
Training data can be thought of as a big spreadsheet of every feature from the signals discussed above and the outcome of whether the user did or didn’t churn.
Each row should describe a user as they were at a particular point in time, with the outcome taken from the period you want to predict over. For example, to predict 28 days ahead, each row holds the user’s features as of a given date and whether they churned in the 28 days after it.
You will also want to make sure that none of your training data leaks the outcome, otherwise the machine learning model will ‘cheat’. For instance, if one of your features is the number of active subscriptions, a churned user’s training row should hold the value from before they cancelled, not the zero that your production data shows today.
If you do leak the outcome in your training data, the model will learn to rely on that leakage rather than the actual signals you will have when running the model.
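A minimal sketch of building leakage-free rows, assuming a hypothetical snapshot date, user records with a `churned_on` field and events with an `on` date: features only look backwards from the snapshot, and the label only looks forwards.

```python
from datetime import date, timedelta

HORIZON = timedelta(days=28)  # how far ahead we want to predict


def training_row(user, events, snapshot):
    """Build one training row: features as of `snapshot`, label from the 28 days after."""
    churned_on = user["churned_on"]
    if churned_on is not None and churned_on <= snapshot:
        return None  # already churned at the snapshot, nothing left to predict

    # Features may only use what was known on the snapshot date.
    past = [e for e in events
            if e["user_id"] == user["user_id"] and e["on"] <= snapshot]
    features = {
        "plan": user["plan"],
        "events_last_28d": sum(1 for e in past if e["on"] > snapshot - HORIZON),
    }
    # The label looks forward from the snapshot and is never used as a feature.
    label = int(churned_on is not None and churned_on <= snapshot + HORIZON)
    return {**features, "churned_within_28d": label}


# Hypothetical example data.
users = [
    {"user_id": "u1", "plan": "pro", "churned_on": date(2024, 2, 10)},
    {"user_id": "u2", "plan": "free", "churned_on": None},
]
events = [
    {"user_id": "u1", "on": date(2023, 12, 1)},
    {"user_id": "u1", "on": date(2024, 1, 20)},
    {"user_id": "u1", "on": date(2024, 2, 5)},   # after the snapshot, so ignored
    {"user_id": "u2", "on": date(2024, 1, 30)},
]
rows = [training_row(u, events, date(2024, 2, 1)) for u in users]
print(rows)
```

Taking several snapshot dates per user is one way to grow the data set, as long as every row follows the same backwards/forwards split.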
For model training you will want a decent number of samples (more than 10,000 is a good place to start) spread across your user base: people who churned and people who didn’t, people on different plans, from different locations, and so on.
Once you have a training data set, you’re ready to train a model.
Models: Machine learning to actually make predictions
This post will not go into the full detail of training machine learning models. However, we will talk briefly about some of the different model types you can use and some of the systems you could use to train your own model.
The table below outlines some common ways you can achieve good results when creating a churn prediction model. However, using a tool like AutoML, which finds the appropriate model for you, will make for a much simpler process at a slightly increased training cost.
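To make the training step concrete, here is a deliberately tiny sketch that fits a logistic regression with plain gradient descent. Logistic regression is used only as a simple stand-in that needs no libraries; in practice you would reach for a library model or AutoML. The two features, the six training rows and the learning rate are made up for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and a bias with full-batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def churn_score(weights, bias, x):
    """Score between 0 and 1; higher means more likely to churn."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical features: [completed_onboarding, share_of_days_active_last_week]
rows = [[1, 0.9], [1, 0.7], [1, 0.5], [0, 0.2], [0, 0.1], [0, 0.0]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = churned

weights, bias = train_logistic(rows, labels)
print(churn_score(weights, bias, [0, 0.1]))  # above 0.5: looks like the churned users
print(churn_score(weights, bias, [1, 0.8]))  # below 0.5: looks like the retained users
```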
Explainability: Going beyond a likelihood and knowing why
Models deliver a score between 0 and 1 with 0 being extremely unlikely to churn and 1 being extremely likely to churn.
That is super useful, but the inevitable question is, “Ok, why?” Which things make the model believe this person is more or less likely to churn?
Feature explainability is a way of doing this. For tree-based models like gradient boosting and random forests, extracting feature importance is straightforward and most libraries support it.
For more complex models, more effort is required, and that is outside the scope of this article.
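One model-agnostic way to see which features matter is permutation importance, a technique not named above and used here only as an illustration: shuffle one feature column at a time and measure how much accuracy drops. The `toy_model` below is a hypothetical stand-in for a trained model, and the data is made up.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows where a score of 0.5 or more matches a churn label of 1."""
    hits = sum(1 for x, y in zip(rows, labels) if (predict(x) >= 0.5) == bool(y))
    return hits / len(rows)


def permutation_importance(predict, rows, labels, seed=0):
    """Accuracy lost when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        column = [x[j] for x in rows]
        rng.shuffle(column)
        shuffled = [x[:j] + [value] + x[j + 1:] for x, value in zip(rows, column)]
        importances.append(baseline - accuracy(predict, shuffled, labels))
    return importances


def toy_model(x):
    # Hypothetical stand-in for a trained model: it only looks at feature 0.
    return 0.9 if x[0] == 0 else 0.1


# Feature 0 drives churn in this made-up data; feature 1 is noise.
rows = [[i % 2, (i * 7) % 5] for i in range(20)]
labels = [1 - x[0] for x in rows]
importances = permutation_importance(toy_model, rows, labels)
print(importances)  # feature 0 shows a drop in accuracy, feature 1 shows none
```

This gives a global view of what the model relies on. Explaining a single customer's score needs per-prediction methods, which are not covered here.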
Using predictions in the real world
How to get the information to the right people
Now that you have a model that returns a prediction and a way to explain the why behind that prediction, you are probably going to want to do something with this model.
Ask yourself: how are these predictions going to be used? Is a sales team going to want to see them in Salesforce? Is a customer success team going to want to have alerts in Slack based on high churn scores? Does the product team want them to trigger different experiences in the product, or does marketing want to send personalized emails based on a high score?
In all of these cases you are going to need to:
- Get access to live data to make predictions
- Make predictions either in a batch or in real time
- Send the output somewhere, whether that be a backend, a file or another system like a CRM
Google Cloud, AWS and open source frameworks have guides on how to take a model and serve it or run batch predictions against a set of data.
In most cases you will want to not only continuously run these predictions as real world data changes, but also retrain your model often to account for drift in the underlying data, seasonality, new pricing plans and other changes.
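The three steps listed above can be sketched as a small batch job. Everything named here is hypothetical: `score_account` stands in for a call to your trained model, the account fields are invented, and the output is a CSV that another system could pick up.

```python
import csv
import io


def score_account(account):
    # Hypothetical stand-in for a trained model's prediction call.
    return 0.8 if account["logins_last_28d"] == 0 else 0.1


def run_batch(accounts, score_fn, out):
    """Score every account and write one CSV row per account to `out`."""
    writer = csv.DictWriter(out, fieldnames=["account_id", "churn_score"])
    writer.writeheader()
    for account in accounts:
        writer.writerow({
            "account_id": account["account_id"],
            "churn_score": round(score_fn(account), 3),
        })


# In a real job this would be a fresh pull of live data.
live_accounts = [
    {"account_id": "a1", "logins_last_28d": 0},
    {"account_id": "a2", "logins_last_28d": 12},
]
buffer = io.StringIO()  # swap for open("scores.csv", "w", newline="") to write a file
run_batch(live_accounts, score_account, buffer)
print(buffer.getvalue())
```

Running a job like this on a schedule covers the batch case. Real-time predictions would instead call the scoring function from a service as events arrive.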
What plays you can run with churn predictions
You have a churn score which you can constantly update. Now, how can you use it to actually reduce churn?
There are many more things you can do with churn predictions but here are a few examples:
- Customer success reaches out to high-value customers who have a high or increasing likelihood to churn
- Send automated “How to get started” emails to newly signed up customers with high churn scores
- Market upgrading to a higher plan to customers with low churn scores
- Inform customers who are highly likely to churn about the option to pause their subscription
- Send offers if users are highly likely to churn before their renewal
- Reach out for feedback to customers who have a high likelihood of churn
- Look at what is working for customers with a low likelihood of churn and use that to target new customers or get existing customers into the same state
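Several of the plays above can be expressed as simple routing rules on top of the score. This is only a sketch: the thresholds (0.7, 0.2, a rise of 0.15), the revenue cut-off and the play names are assumptions you would tune for your own business.

```python
def choose_play(churn_score, monthly_revenue, days_since_signup, previous_score=None):
    """Map a churn score and some account context to one play."""
    rising = previous_score is not None and churn_score - previous_score >= 0.15
    if churn_score >= 0.7 or rising:
        if monthly_revenue >= 1000:
            return "customer_success_outreach"
        if days_since_signup <= 14:
            return "getting_started_email"
        return "offer_pause_or_discount"
    if churn_score <= 0.2:
        return "upgrade_campaign"
    return "no_action"


print(choose_play(0.85, monthly_revenue=5000, days_since_signup=200))  # customer_success_outreach
print(choose_play(0.80, monthly_revenue=20, days_since_signup=5))      # getting_started_email
print(choose_play(0.10, monthly_revenue=50, days_since_signup=90))     # upgrade_campaign
```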
Putting it all together
Churn prediction is used across many businesses to retain tens of millions (even hundreds of millions) in revenue. It’s an effective tool to increase net revenue retention and offer a more proactive approach to customer experience and engagement.
Building out a customer churn prediction model, connecting it to the services your teams use and then running the plays that protect revenue takes time, expertise and clean data.
Luckily, products like Upollo turn this from an ordeal into something you can be up and running with in minutes. Upollo connects with all of your existing tools like CRMs, subscription and billing systems, product events tools and more to predict churn, conversion and expansion.
Upollo learns from billions of data points, keeping its models constantly updated with signals from users and customers across the globe.