More about The Rhythm of Identity Project

Music Diversity Among Listeners 18-24

This group project offers insight into the music preferences of listeners aged 18-24, drawn from a single dataset. The data was gathered between 2018 and 2024 and represents the listening habits, favorite music, and demographic identities of thousands of individuals worldwide.

Where did the data come from?

The data comes from Kaggle, a website that hosts datasets for free use. The original dataset is called “Analyzing music streaming trends, listener habits & preferences.”

What is inside the data?

The dataset represents 5,000 users from all over the world. It reports each individual’s demographics and music listening habits, such as the listener’s age or favorite genre. The data was gathered between 2018 and 2024 and will not be updated. Each row is one surveyed individual. Each column is a surveyed characteristic: age, country, streaming platform, top genre, minutes streamed, number of songs liked, most played artist, subscription type (free or premium), listening time (morning, day or night), discover weekly engagement (% of new music discovery engagement) and repeat song rate (%).

How did you transform the data?

My first transformation was converting the dataset from Kaggle to a Google Sheets file. Then, because we planned to analyze data from only 18-24-year-olds, I removed the rows for listeners outside that age range from the spreadsheet. I also froze the header row so each column’s name would remain visible when scrolling down.
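The filtering itself was done in Google Sheets. As a rough sketch only, the same age filter could look like this in Python. The column names (`age`, `country`, `top_genre`) and the sample rows are made up for illustration and are not the dataset's actual headers or values.

```python
import csv
import io

# Hypothetical sample rows; column names are assumptions, not the real headers.
SAMPLE_CSV = """age,country,top_genre
17,Canada,Rock
18,Japan,Pop
24,Brazil,Jazz
25,India,Pop
"""


def filter_by_age(rows, min_age=18, max_age=24):
    """Keep only rows whose age falls within [min_age, max_age], inclusive."""
    return [row for row in rows if min_age <= int(row["age"]) <= max_age]


# csv.DictReader uses the first line as the header row,
# much like the frozen header row in the spreadsheet.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
young_listeners = filter_by_age(rows)
print([row["age"] for row in young_listeners])
```

Both 18 and 24 are kept because the range is inclusive at each end.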

How did you analyze the data?

We kept transforming the data in order to analyze pieces of it. For instance, I grouped listeners by their country of origin and replaced each most played artist with that artist’s country of origin. Using formulas and organizational tools in Google Sheets, such as the ones I used to identify and count listener and artist countries of origin, we quantified trends in the data. Sometimes we knew what we were looking for. Sometimes we did not. Sometimes we were surprised.
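This work was done with Google Sheets formulas. As a hedged illustration of the idea, the sketch below maps each artist to a country and counts listener-country and artist-country pairs. The artist names, the lookup table and the listener rows are all hypothetical placeholders, not values from the dataset.

```python
from collections import Counter

# Hypothetical lookup table: artist name -> artist's country of origin.
ARTIST_COUNTRY = {"Artist A": "USA", "Artist B": "South Korea"}

# Hypothetical listeners: (listener country, most played artist).
listeners = [
    ("Brazil", "Artist A"),
    ("Brazil", "Artist B"),
    ("USA", "Artist A"),
    ("Japan", "Artist A"),
]


def count_country_pairs(listeners, artist_country):
    """Count (listener country, artist country) pairs."""
    return Counter(
        (country, artist_country[artist]) for country, artist in listeners
    )


pairs = count_country_pairs(listeners, ARTIST_COUNTRY)

# Listeners whose most played artist comes from a different country.
cross_border = sum(
    count for (listener, artist), count in pairs.items() if listener != artist
)
print(pairs)
print(cross_border)
```

Counts of pairs like these are the kind of input a chord diagram needs, where each chord links a listener country to an artist country.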

How did you make insights into young people’s music preferences?

As we divided, organized and analyzed the data, we found intriguing trends that said something significant about how the surveyed individuals listened to music. For instance, I found that a high number of minutes streamed did not correlate with a high weekly rate of discovering new music. From that fact, and the fact that high repeat song rates were common, I inferred that 18-24-year-old listeners often did not listen to new music, even if they listened frequently. We also inferred that young people’s listening is globalized and that pop musicians are popular across the group, even though most listeners’ top genres are not pop.
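One common way to check whether two columns move together is a Pearson correlation coefficient, where values near 0 suggest little linear relationship. The sketch below is an assumption about how such a check could be written in Python. It is not the exact spreadsheet formula we used, and the numbers are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only, not taken from the dataset.
minutes_streamed = [30, 60, 90, 120, 150]
discovery_rate = [40, 20, 50, 30, 45]

r = pearson(minutes_streamed, discovery_rate)
print(round(r, 2))
```

A coefficient close to 1 or -1 would point to a strong linear relationship. A value near 0, like the one these invented numbers produce, would not.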

How did the data become the story?

We knew communicating the data had to look very different from the data itself. We also understood that a vibrant topic like music demands vibrant visuals. We chose to share our insights through sliding images because images are colorful, graphic and instant. We planned for our visualizations to have these qualities too. When we did use words, though, our goal was to be simple and direct. That is why we chose to pull individuals from the data and describe them. That is also why we chose the types of graphs (bar, dot and chord) that best represented each insight. For example, the overlapping chords of the globalization visual are an accurate illustration of cultures crossing over.

How was this project divided among the authors?

The collaborators on this project are my peers, Emma Coffey and Ari Stalcup. Emma is responsible for the pop music visual. Ari made the repetition and discovery visual. The two of them developed the story's title. I made the slides, wrote the insights and made the globalization visual. I also wrote the text on this page.
