Written by Hajar AIT EL KADI, Koffi Cornelis
How to create the best customer experience?
Know Your Customer: The Importance of Knowing Your Customer.
Nowadays, companies have more data than they know what to do with. And that data is untapped potential.
The more you know your customers, the better you understand their needs, and the better you can anticipate them.
The most successful companies are those that know their customers best and use that knowledge to provide a high-quality customer experience, thus creating strong customer engagement. Providing a unique, individual experience tailored to the customer’s preferences and needs is key to building lifelong customer relationships. Look no further than Spotify and Netflix, whose recommendation algorithms are at the core of their business.
Data is hence indispensable for gaining insights into customer behaviour. And what better way to extract those insights than machine learning?
In the following article, through the example of a retail dataset, we will attempt to understand and anticipate consumer behaviour, using a combination of data analysis and machine learning.
We will first start by flirting a little with the customer profiles, getting to know them a bit better, before we take them on a second date and look at their transactions and ask about their history. If all goes well and we build a deeper connection, we’ll be able to label our relationship. Then things will get serious and we’ll be comfortable enough to recommend their next purchases 😉😏.
The literature provides us with a variety of recommendation systems depending on the available data, whether it concerns users, items, or the interactions between them:
In part 2 of this article, we cover two recommendation systems: content-based and collaborative filtering. We compare them to the two baselines that made the most sense to us.
With a pinch of knowledge, a dash of luck and a whole lot of determination, we are diving right in 🚀 (get it?)!
We will cover the following topics:
For those of you with commitment issues, we provided an estimated read time for each topic.
Let’s get this party started!🤩
★ What we want:
★ What we need:
★ What we use:
▹ Plotly, Seaborn, Matplotlib
Henceforth, we will be focusing on the results rather than detailed code, the entirety of which can be found here (Google Colab) and here (GitHub).
We used a Kaggle dataset containing three tables: Customers, Transactions and Products.
The dataset details retail transactions spanning from Jan 2011 to Feb 2014. We will look at the tables separately and then together.
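For those following along in code, here is a minimal loading sketch (the file names are assumptions based on the dataset’s usual layout; adjust the paths to your setup):

```python
import pandas as pd

# Load the three tables
customers = pd.read_csv("Customer.csv")
transactions = pd.read_csv("Transactions.csv")
products = pd.read_csv("prod_cat_info.csv")

# Parse transaction dates; dayfirst=True handles dd-mm-yyyy style dates
transactions["tran_date"] = pd.to_datetime(transactions["tran_date"], dayfirst=True)
```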
This table contains the general customer data, and accounts for 5647 unique customers.
The customers are uniformly distributed across both genders.
We calculated the age of the customers as of the end of 2014 (the last recorded transactions were back in 2014) based on their date of birth. The average customer is 33 years of age.
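In code, this boils down to a simple date difference (assuming the date-of-birth column is called DOB):

```python
# Parse dates of birth, then compute the age as of the end of 2014
customers["DOB"] = pd.to_datetime(customers["DOB"], dayfirst=True)
reference_date = pd.Timestamp("2014-12-31")
customers["age"] = (reference_date - customers["DOB"]).dt.days // 365
```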
The sweet spot, young enough to still be fun and adventurous with their buys but old enough to have a reliable job to finance those impulse purchases 😌.
There are 10 city codes in the dataset. And the customers are uniformly distributed across the 10 cities.
The dataset describes 23 products organised across 6 categories and 18 subcategories.
This table details about 23k transactions that took place between Jan 2011 and Feb 2014. We consider the transactions with negative total amounts to be returns that the customers made following their purchases.
We will have a little fun plotting the data.
The sales are more or less uniformly distributed across the board. The highest revenues seem to be generated around March and April of each year. The highest recorded monthly revenue is 1.5M in January 2014.
We group the transactions per customer and sum their purchases over the course of a year. The following plot shows the evolution of behaviour for customers over the course of the three years (keep in mind that we only have two months’ worth of data for 2014, hence the low revenue).
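A sketch of that aggregation (the column names cust_id and tran_date are assumptions):

```python
import matplotlib.pyplot as plt

transactions["year"] = transactions["tran_date"].dt.year

# Yearly spend per customer: one row per customer, one column per year
yearly_spend = (
    transactions.groupby(["cust_id", "year"])["total_amt"]
    .sum()
    .unstack(fill_value=0)
)
yearly_spend.sample(10, random_state=0).plot(kind="bar")
plt.show()
```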
Some customers are consistent over the years, like customer 7, who buys more or less the same amount each year. Others buy less and less from year to year, like customer 2 and customer 9. Some seem to have lost interest over the years but bounced back in 2014, like customer 1 and customer 6, who also generate higher revenues than other customers. Still others buy more and more each year, like customer 4 and customer 3.
After merging the products dataframe with the transactions dataframe, we group the transactions by category and year. The plot below shows the most popular categories: ‘Books’ (no, seriously! we checked the math, a couple of times over 🤷♀️) and ‘Electronics’ (no surprise there), which hold the top sales over the years. The ‘Clothing’ and ‘Bags’ categories generate the least revenue.
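Something along these lines, assuming prod_cat_code and prod_cat as the category code and label columns:

```python
# Keep one label per category code, to avoid duplicating rows on the merge
categories = products[["prod_cat_code", "prod_cat"]].drop_duplicates()
merged = transactions.merge(categories, on="prod_cat_code", how="left")

# Revenue per category and year
category_sales = merged.groupby(["prod_cat", "year"])["total_amt"].sum().unstack()
category_sales.plot(kind="bar")
plt.show()
```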
Returns aside, let’s see if the customers buy the same item more than once. The histogram shows that most clients do not come back for seconds.
The e-Shop records the most transactions. The rest of the stores have about the same number of transactions.
We consider transactions with a negative total_amt to be returns. About 10% of the recorded transactions are returns.
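The check itself is a one-liner on total_amt:

```python
returns = transactions[transactions["total_amt"] < 0]
print(f"Return rate: {len(returns) / len(transactions):.1%}")  # roughly 10% here
```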
Return rates at physical stores range from 8% to 10%, but rise to approximately 20% for e-Commerce. For Amazon it can go as high as 40% for certain categories.
We should be glad that our customers are loyal (or just lazy) enough not to bother with the hassle of returns 👀.
The returns are consistent over the years. The low returns recorded for 2014 are due to having only 2 months of transactional data available.
‘Books’ and ‘Electronics’ have the most returns each year, which is consistent with the fact that they record the most transactions overall.
At this point, we are just out of the awkward phase. We know our customers a little better: their age (so we never forget to wish them a happy birthday), where they live, their favourite store, when they make the most purchases, and what they spend most of their money on: books (we just hope they’re not pretending to be avid readers to impress us; it will be awkward for both of us when they get a book discount code for their birthday gift).
Now that we know our customers’ collective behaviour, we want to get up close and personal with them. It is time to put a label on them (loyal 👰🤵 or a player, forward 😉 or shy, a big spender 💸 or a cheapskate, etc.), shine a light on their qualities and pretend we’re okay with their bad choices.
Communication is the basis of any healthy relationship, and the customer relationship is no different.
But how do you communicate effectively with almost 6,000 customers?
In order to optimise communication, we will divide our customer base into groups with similar behaviour. The ultimate goal is to get to know the customers on a deeper level and separate them into categories so we can tailor the content to their needs. The fancy marketing term for that is customer segmentation.
To build those groups, we will mostly rely on the customers’ purchase history. One of the most intuitive and flexible ways to do that is RFM analysis, which stands for Recency, Frequency, Monetary analysis.
✓ Recency (R): How recently did the customer make a transaction?
✓ Frequency (F): How many transactions did they make?
✓ Monetary Value (M): How much did they spend?
We compute those three criteria and treat them as scores, which will in turn serve to cluster the customer base.
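A minimal sketch of that computation, assuming the column names used earlier plus a transaction_id column:

```python
# Snapshot date: the day after the last recorded transaction
snapshot = transactions["tran_date"].max() + pd.Timedelta(days=1)

rfm = transactions.groupby("cust_id").agg(
    recency=("tran_date", lambda dates: (snapshot - dates.max()).days),
    frequency=("transaction_id", "nunique"),
    monetary=("total_amt", "sum"),
)
```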
Recency averages 282 days. The least active customer hasn’t made a transaction in three years. The recency distribution is positively skewed.
The average customer made about 4 transactions over the course of the three years.
The average customer spent 2224 in monetary value over the course of the three years. The monetary value distribution is also right-skewed.
Now that we have the three columns in hand, we move on to clustering by feature. For this purpose, we call on the magic of machine learning: K-means, an unsupervised learning algorithm that uses distance to determine the best-suited cluster for each data point.
In order to determine the optimal number of clusters to input into the K-means algorithm, we use SSE plots (elbow plots).
Accio elbow plots!
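Each plot comes from fitting K-means over a range of k values and recording the inertia (SSE); here for recency, and likewise for the other two features:

```python
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Fit K-means for k = 1..9 and record the SSE (inertia) for each k
sse = {}
for k in range(1, 10):
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    model.fit(rfm[["recency"]])
    sse[k] = model.inertia_

plt.plot(list(sse), list(sse.values()), marker="o")
plt.xlabel("Number of clusters k")
plt.ylabel("SSE")
plt.show()
```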
Three is the magic number for the three features.
There is nothing left for us but to cluster them customers 👨👩👧👦.
The code below uses K-means to create the clusters based on recency, then reorders the raw cluster labels so that a higher cluster number means a more active customer. We use similar code for the frequency and the monetary value.
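A sketch of what that looks like (the order_clusters helper and the column names are our own):

```python
def order_clusters(df, feature, cluster_col, ascending):
    """Relabel clusters so the cluster number follows the feature's mean value."""
    means = (
        df.groupby(cluster_col)[feature]
        .mean()
        .sort_values(ascending=ascending)
        .reset_index()
    )
    mapping = {old: new for new, old in enumerate(means[cluster_col])}
    return df[cluster_col].map(mapping)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
rfm["recency_cluster"] = kmeans.fit_predict(rfm[["recency"]])

# Lower recency = more active, so sort descending: cluster 2 = most recent buyers
rfm["recency_cluster"] = order_clusters(rfm, "recency", "recency_cluster", ascending=False)
```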
We get three clusters based on each criterion.
The cluster number serves as a customer score for each feature. In the case of recency for example: cluster ‘0’ represents the least active customers whereas cluster ‘2’ represents the most active.
Now that we have a score (cluster) for each of the three features, we compute an overall score by summing the three scores for each customer, which gives scores ranging from 0 to 6. Below are the per-cluster means for each feature.
In order to better visualise our clusters, we segment the overall scores as follows (a code sketch follows the list):
0 and 1: Low-Value Customers
2, 3 and 4: Mid-Value Customers
5 and 6: High-Value Customers (we stan 💕)
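A sketch of both steps, assuming the three per-feature cluster columns built above:

```python
# Overall score: sum of the three per-feature scores, ranging from 0 to 6
rfm["overall_score"] = (
    rfm["recency_cluster"] + rfm["frequency_cluster"] + rfm["monetary_cluster"]
)

# Bucket the overall scores into the three named segments
rfm["segment"] = pd.cut(
    rfm["overall_score"],
    bins=[-1, 1, 4, 6],
    labels=["Low-Value", "Mid-Value", "High-Value"],
)
```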
This segmentation allows us to plot the three features against each other in terms of customer value.
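For example, recency against monetary value, coloured by segment:

```python
import seaborn as sns

sns.scatterplot(data=rfm, x="recency", y="monetary", hue="segment")
plt.show()
```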
We can clearly deduce the behaviour of the three distinct groups of customers:
And voilà! Beautifully segmented clusters.
We can see that the most impactful factor is recency: it clearly separates the three segments of clients. The lower the recency, the higher the customer value. The low-value customers also have the lowest monetary value.
What is left to do now is build marketing campaigns tailored to the needs of those groups of customers.
And there you have it!
We have, somewhat successfully, been able to profile our customers and segment them into groups that have similar history and behaviour.
But fear not…
In the next chapter, we will dig deeper into our customer relationship, and take things further by engaging our customers and recommending their next purchases.
In order to do so, we will compare two recommendation system approaches: content-based and collaborative filtering.
In the words of Queen B, it’s time to put a 💍 on it!
For the brave of heart, join us in part 2.
Links (for those who weren’t paying attention)
Kaggle (dataset)