In an earlier blog post, we spoke about the typical predictive models built in a CRM setting: the propensity of a customer to transact (either unconditionally, or of a specific type, or in a specific product category), to churn, to repeat a transaction and so on. In this post, we shall discuss an additional model type, namely the propensity to engage with a brand when targeted through a specific channel.
Businesses today have multiple channels through which they can contact customers: SMS, email, WhatsApp, app notifications and so on. These vary in both cost and effectiveness – naturally, the more expensive channels tend to have higher engagement rates! So the obvious question before a decision maker is: which channel should we use, and for whom?
There are a number of factors at play here – budgets, channel effectiveness, the possibility of combining outreaches across channels to achieve the objective, and other practical constraints.
Imagine a setting where we have an overall budget, a set of channels, and historically observed response effectiveness scores for each channel. At a broad level, we can model this as follows:
Feature engineering
Whenever we are faced with a modelling question of this nature, we can reframe it as: Given what we know today, can we predict…?
The key phrase there is: what we know today. Obviously, this includes the kind of features we were computing anyway for the other model types. However, we might need to focus a bit more on features that track past outreaches and their corresponding response. For instance:
- How many outreaches did we make to this customer in the last N days? (Overall and by channel)
- What percentage of these outreaches (overall and by channel) resulted in a delivery/open/read/click/transaction event?
- What was the specific response status of the last few outreaches (overall and by channel)?
These matter especially when predicting something like WhatsApp delivery because, more than the customer’s favourite purchase category, it is the past engagement history on WhatsApp that is likely to determine whether the next message gets delivered.
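To make this concrete, here is a minimal pandas sketch of such features. It assumes a hypothetical flat outreach log with one row per message and columns `customer_id`, `channel`, `sent_at` and a boolean `read` flag – the schema and the helper name are illustrative assumptions, not a prescribed implementation:

```python
import pandas as pd

def outreach_features(outreach_log: pd.DataFrame, as_of: pd.Timestamp,
                      n_days: int = 90) -> pd.DataFrame:
    """Outreach-history features as of `as_of`, looking back `n_days`.

    Only rows sent strictly before `as_of` are used, so the features are
    point-in-time correct for a model scored on that date.
    """
    window = outreach_log[
        (outreach_log["sent_at"] < as_of)
        & (outreach_log["sent_at"] >= as_of - pd.Timedelta(days=n_days))
    ]

    # How many outreaches, and what fraction were read -- overall
    overall = window.groupby("customer_id").agg(
        n_outreaches=("sent_at", "count"),
        read_rate=("read", "mean"),
    )

    # The same two numbers, broken out per channel
    by_channel = window.pivot_table(
        index="customer_id", columns="channel",
        values="read", aggfunc=["count", "mean"],
    )
    by_channel.columns = [f"{stat}_{ch}" for stat, ch in by_channel.columns]

    return overall.join(by_channel, how="left")
```

The same pattern extends to delivery, open and click flags; the "last few outreaches" features fall out of sorting the window by `sent_at` and taking the tail per customer.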
Model building pipeline
This will be somewhat like the way other models are built, but with one crucial difference: the data here will not be for all customers, but only for customers who have actually been targeted in the past.
For instance, consider the problem of predicting who will read a WhatsApp message. Out of a database of, say, 1M customers, let us assume that 200K were targeted through WhatsApp in the week of 6-12 Jan 2025. For these 200K customers, we can calculate the feature list as of the end of 5 Jan 2025, and set the target variable to 1 or 0 depending on whether they read the message sent to them between 6-12 Jan 2025.
By doing this over a period of several weeks, we can collate data about which customers responded to their outreach, and how that correlates with what we knew about them before the outreach. The same customer may appear in the dataset multiple times, measured at different points in time. This way, if there is a customer whose other characteristics remained more or less the same but whose response changed with the outreach history, that effect gets captured by the model.
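Sticking with the same hypothetical outreach log and the `outreach_features` helper sketched above, the weekly snapshots might be stacked along these lines (the week boundaries and the `whatsapp` channel label are assumptions for illustration):

```python
import pandas as pd

def build_training_data(outreach_log: pd.DataFrame,
                        week_starts: list[pd.Timestamp]) -> pd.DataFrame:
    """One row per (customer, target week); customers can recur across weeks."""
    snapshots = []
    for week_start in week_starts:            # e.g. pd.Timestamp("2025-01-06")
        week_end = week_start + pd.Timedelta(days=7)
        sent = outreach_log[
            (outreach_log["channel"] == "whatsapp")
            & (outreach_log["sent_at"] >= week_start)
            & (outreach_log["sent_at"] < week_end)
        ]
        # Target: 1 if the customer read a WhatsApp message sent that week
        target = sent.groupby("customer_id")["read"].max().rename("target")
        # Features are computed strictly from data before the week begins
        feats = outreach_features(outreach_log, as_of=week_start)
        snapshots.append(feats.join(target, how="right").fillna(0))
    return pd.concat(snapshots)
```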
The rest of the pipeline remains the same.
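For instance, a rough sketch of the training step with scikit-learn, reusing the hypothetical `build_training_data` from above (`training_weeks` is an assumed list of week-start timestamps):

```python
from sklearn.ensemble import HistGradientBoostingClassifier

train = build_training_data(outreach_log, week_starts=training_weeks)
X, y = train.drop(columns="target"), train["target"].astype(int)

# Any binary classifier would do here; gradient boosting is a common default
model = HistGradientBoostingClassifier().fit(X, y)
read_propensity = model.predict_proba(X)[:, 1]
```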
Performance evaluation
There is an interesting aspect to performance evaluation when it comes to channel response models, and it has to do with practical constraints. For example, WhatsApp message sending may be curtailed if the read rate of marketing messages falls below a certain threshold. This means we want to restrict our message sending to a subset of customers for whom the read rate is maintained.
What does this mean from a performance evaluation standpoint?
Imagine that we validate the model on a holdout from historical data. To keep things concrete, take the example from the previous section and suppose that 80k of the 200k customers targeted between 6-12 Jan 2025 are set aside for validation. That is, the model is built on the remaining 120k customers and then used to generate scores for the 80k customers in the validation sample. Let us divide these 80k customers into 10 deciles by model score – the first decile contains the customers with the top 10% of scores, the second decile those with the next 10%, and so on. The table below shows the actual response in each decile (since this is historical data, we already have that information).
| Decile | #Customers | #Read | Cumulative #Read | Cumulative %Read of Total Read | Cumulative Read Rate (%Read of Targeted) |
|---|---|---|---|---|---|
| 1 | 8000 | 6000 | 6000 | 30% | 75% |
| 2 | 8000 | 5000 | 11000 | 55% | 69% |
| 3 | 8000 | 3000 | 14000 | 70% | 58% |
| 4 | 8000 | 1600 | 15600 | 78% | 49% |
| 5 | 8000 | 1400 | 17000 | 85% | 43% |
| 6 | 8000 | 1000 | 18000 | 90% | 38% |
| 7 | 8000 | 800 | 18800 | 94% | 34% |
| 8 | 8000 | 600 | 19400 | 97% | 30% |
| 9 | 8000 | 400 | 19800 | 99% | 28% |
| 10 | 8000 | 200 | 20000 | 100% | 25% |
If we wish to maintain a read rate of at least 40%, we can target the top 5 deciles, which have historically given a cumulative read rate of 43% according to the table above. On the other hand, if there is no read-rate constraint and we wish to target enough people to reach at least 75% of those who might read the message, the top 4 deciles will do (they capture 78% of all reads).
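The decile table itself is easy to reproduce from validation output. Below is a sketch, assuming `scores` and `read` are index-aligned pandas Series for the 80k validation customers (the function name and column labels are illustrative):

```python
import pandas as pd

def decile_table(scores: pd.Series, read: pd.Series, n: int = 10) -> pd.DataFrame:
    """Rank by score, cut into n equal buckets, accumulate reads downwards."""
    df = pd.DataFrame({"score": scores, "read": read})
    # rank(method="first") breaks score ties so qcut always yields equal buckets
    df["decile"] = pd.qcut(df["score"].rank(ascending=False, method="first"),
                           n, labels=range(1, n + 1))
    out = df.groupby("decile", observed=True).agg(
        customers=("read", "size"), reads=("read", "sum"))
    out["cum_reads"] = out["reads"].cumsum()
    out["pct_of_total_read"] = out["cum_reads"] / out["reads"].sum()
    out["cum_read_rate"] = out["cum_reads"] / out["customers"].cumsum()
    return out

# e.g. the deepest targeting depth that still clears a 40% read-rate floor:
# depth = decile_table(scores, read).query("cum_read_rate >= 0.40").index.max()
```

The 40% and 75% cut-offs discussed above then become simple filters on the two cumulative columns.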
In conclusion…
Solving the budget allocation problem across channels involves many moving parts. But better decisions are made with better data, and predicting the likely effectiveness of each channel at the customer level allows us to make the best decision at that level.