Classifying clusters: Using behavioral archetypes as a basis for personalized experimentation
I consider myself to be a very data-driven person. I have a degree in mathematics and economics, try to quantitatively measure almost everything possible, and look to data to demystify complex problems. Part of what makes my job so enjoyable as the Analytics Lead on the Data Science team at Cedar is that I get to approach the patient financial experience from a data-oriented perspective. It’s my job to understand and improve the journey that Cedar patients go through, and anchor that experience on observable KPIs and metrics.
As my colleagues on the Design team, Amy Stillman and Diana Ye, shared previously, we conducted an extensive research study in partnership with IDEO to better understand what drives the financial behavior of millennial patients. Because millennials are less likely to pay their healthcare bills than any other age group, we wanted to understand why and tease out the different patterns of millennial behavior in order to better serve these patients.
As an output of this research, we developed four behavioral archetypes to identify actionable insights about our users:
- Seekers: Seekers are hungry for knowledge and will go to great lengths to understand the exact details of their bills. They are savvy and take steps to be in control of their finances.
- Avoiders: Avoiders are skeptical and distrusting of their medical bills (and sometimes of the healthcare system as a whole). They tend not to engage with their healthcare bills at all.
- Hustlers: Hustlers are often living paycheck to paycheck. Although they are constantly trying to figure things out, they ultimately want to do the right thing and try to pay their bills. They don’t plan for emergencies and can be caught off-guard if one occurs.
- Planners: Planners have put systems in place to track and organize their medical bills. They believe that paying their bills is the right thing to do and take steps to consistently do so. They often have recurring health issues, chronic conditions or care for others in addition to themselves.
While my counterparts on the Design team dove deeper into what drives and motivates these archetypes, I wanted to understand how we could quantify and measure them: How many users identify as seekers vs. planners? What proportion of patients fall into each of the archetypes?
To answer these questions, we first had to align the behavioral archetypes to quantifiable features. We began by mapping each archetype to different actions taken on Cedar Pay, developing a detailed mapping across a wide range of user behavior. For example, high click rates on bill details are associated with seekers, who look to obtain as much information as possible about their bill. Delaying payment until the last possible moment, on the other hand, is characteristic of hustlers, who often need to prioritize which bills to pay first. Using these mapped behaviors, we wanted to develop a model that would score each Cedar user against the four archetypes to determine which one they most closely align with in the context of a particular invoice. This model would enable us to ascertain the number and proportion of Cedar users who correspond to each archetype, allowing us to design personalized experiments to support these individual archetypal segments.
For this model, we opted to use clustering algorithms, which are used to gain insights from data by seeing which groups data points most closely fall into or align with. The objective of a clustering algorithm is to keep the distance between data points within a cluster small relative to the distance between clusters. In other words, members of a group (in this case, our archetypes) are as similar as possible, and members of different groups are dissimilar.
More specifically, we selected K-means clustering to group patients with similar behavior. It’s an iterative clustering algorithm in which the number of clusters (K) is set in advance: the algorithm repeatedly assigns each data point to one of the K clusters based on feature similarity, then updates each cluster’s center, until the assignments stop changing.
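To make the iteration concrete, here is a minimal sketch of K-means in plain Python. It is an illustration only, not the code we ran: the function name `kmeans`, the random choice of starting centers, and the use of Euclidean distance are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Minimal K-means sketch on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Assumption: start from k data points picked at random as the centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster with the nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we are done
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centers
```

Calling `kmeans(points, 4)` on a list of feature tuples returns one cluster label per point along with the final cluster centers. A production library would typically add smarter initialization and multiple restarts, which this sketch leaves out.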
There are two potential issues with clustering that we had to keep in mind:
- Clustering is an unsupervised learning method, which means that it does not attempt to fit the data into any pre-existing categories, so the resulting clusters are not guaranteed to line up neatly with our four archetypes.
- Feature selection and the number of features can drastically affect the results and skew the clusters towards certain features.
Creating our dataset
For our dataset, we evaluated Cedar invoices over a 14-month period. In a cross-functional brainstorming session with the whole Makers team (which includes Data Science, Engineering, Product and Design), we identified the relevant features for the archetypes, incorporating actions that patients took and characteristics of their bill, ultimately including 16 different numeric variables. A few of the features we associated with each invoice included:
- Bill size
- Days to payment
- Self-pay insurance status
- Payment resolution status
- Engagement levels (e.g. % of clicks on all notifications, engagement with particular CTAs)
We subsequently standardized the features by subtracting the mean and then dividing by the standard deviation so that all data would be on the same scale, ensuring that features with larger magnitudes would not have an outsized impact on the model.
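As a rough sketch of that standardization step, the snippet below computes z-scores for a single feature column. The function name `standardize` and the sample values are hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Return z-scores: (value - mean) / standard deviation."""
    mu = fmean(column)
    sigma = pstdev(column)  # assumption: population standard deviation
    if sigma == 0:
        # A constant feature has no spread to scale by; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Hypothetical values for two features on very different scales.
bill_size = [50.0, 120.0, 900.0, 2400.0]
days_to_payment = [3.0, 10.0, 45.0, 90.0]

scaled_bill = standardize(bill_size)
scaled_days = standardize(days_to_payment)
```

After this transformation each feature has a mean of 0 and a standard deviation of 1, so a bill measured in hundreds of dollars and a delay measured in days contribute comparably to the distance calculations that K-means relies on.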
We ran a K-means clustering algorithm on the dataset to obtain four different clusters. These clusters are essentially quantifiable groups that correspond to the behavioral traits most associated with our behavioral archetypes. I’ll run through their detailed characteristics individually below.
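The percentages quoted for each cluster are simply each cluster’s share of all invoices. A minimal sketch of that calculation follows; the function name `cluster_shares` and the ten labels are made up purely for illustration and are not our results.

```python
from collections import Counter


def cluster_shares(labels):
    """Map each cluster label to its share of all labeled invoices."""
    counts = Counter(labels)
    total = len(labels)
    return {cluster: count / total for cluster, count in sorted(counts.items())}


# Made-up cluster labels for ten invoices, purely for illustration.
example_labels = [0, 0, 1, 0, 2, 0, 1, 0, 3, 0]
shares = cluster_shares(example_labels)
```

Here `shares[0]` is 0.6 because six of the ten example invoices carry label 0.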
Cluster 0: Planners (~59%)
At 59%, cluster 0 was the largest cluster of the dataset and corresponded with the planner archetype. The behavior of patients in this cluster suggested that their bills were expected and planned for. The financial obligations posed by these invoices seemed manageable for patients and were resolved in a timely manner. By and large, these patients did not seek additional information before paying their bills.
Cluster 0 was characterized by:
- Small bill sizes
- Balance after insurance payments
- High engagement
Cluster 1: Hustlers / Avoiders (~28%)
At 28%, cluster 1 was the second largest group and aligned most closely with traits of the hustler and avoider archetypes. These patients behaved as if they were receiving large, unexpected bills that they did not want to engage with. Even though engagement was low in this cluster, when they did engage, most of the patients sought help through chat (although most did not subsequently resolve their bills).
Cluster 1 was characterized by:
- Large bill sizes
- Self-pay insurance status
- Low engagement
Cluster 2: Seekers / Hustlers (~11%)
Cluster 2 encompassed 11% of invoices and was most linked to seekers and hustlers. These patients were unable to quickly pay their balance and needed flexibility and support; they were most likely to take advantage of Cedar’s flexible payment options to address their bills. Once they had the information and options that they needed, they were able to resolve their balances. These bills tended to take longer to be paid and nearly half were associated with self-pay status.
Cluster 2 was characterized by:
- Both small and large bill sizes
- Mixed insurance status
- High engagement
Cluster 3: Avoiders (~3%)
Cluster 3 was the smallest group, at only 3% of invoices. Overall, these patients avoided their bills, and as a result, did not resolve their balances. Invoices were typically larger, and only a few were paid. Most of the invoices were self-pay.
Cluster 3 was characterized by:
- Large bill sizes
- Self-pay insurance status
- No engagement
Where does this leave us?
As I mentioned at the outset, we created this model in order to discern quantifiable archetype segments and build experiments that can improve the patient experience based on specific needs. Our goal was to categorize patient behavior according to the behavioral archetypes. The archetypes were developed by examining extremes in behavior, on the premise that if we can understand extreme behaviors, we can disentangle how patients act in between. If we can help the most complex patients, we can help all patients.
In reality, actual patient behavior is hard to group into specific archetypes. People are complex. They often possess attributes from multiple archetypes, and their behavior can fluctuate depending on the context of their bill.
But the clusters reveal important insights and can guide our understanding of patient behavior, illustrating that patients come to Cedar with different needs. Examining our results, it’s clear that for the planners (cluster 0), Cedar already addresses their primary needs. They feel supported, understand their obligations and generally pay their bills in a timely fashion. But there are opportunities to improve the patient experience for hustlers/avoiders (cluster 1) and seekers/hustlers (cluster 2). For example, while patients in cluster 2 already take advantage of Cedar’s customizable payment options, they could potentially benefit from easier access to, or greater visibility into, all of their options to streamline their user experience.
These insights and this framework are particularly useful as we evaluate current experiments and design future ones. Using the archetypes and clusters in conjunction with our propensity-to-pay model gives us a better understanding of how we can personalize patient experiences at Cedar.
In the next installment of Cracking the Millennial Mindset, we’ll discuss how we can design personalized user experiences to improve outcomes for specific patient archetypes.
Rena Yang is head of analytics at Cedar. Prior roles include business development at agricultural tech firm Gro Intelligence, and senior analyst at economic consulting firm Cornerstone Research.