Medium · Spotify

Analyze Song Data

Prompt

You're given a string of listening data where each entry is a triplet: person, song, artist, separated by commas. Write a function analyzeSongData that processes this data and returns:

  1. How many times the most popular song was played
  2. The name of the most popular song
  3. A function getTopListener(songName) that returns who played a given song the most

const data =
"Martin, By your name, NasX, Kie, Jigsaw, Radiohead, Chris, Own worst enemy, Lit, Kirst, Dance yrself clean, LCD Soundsystem, Amy, Jigsw, Radiohead, Berta, Own worst enemy, Lit, Junko, Re-sublimity, KOTOKO, Chiaki, Jigsaw, 'Radiohead' ";

Look closely at the data. It has some quirks you'll need to handle.

Hint 1

The data is formatted as repeating triplets: person, song, artist, person, song, artist, .... Split the string by ", " and then process every three elements together. The first is the person, the second is the song, the third is the artist.

Hint 2

Use two objects: one to count how many times each song was played (songCounts), and another to track who listened to each song and how many times (songListeners). After one pass through the data, you'll have everything you need.

Hint 3

Before splitting, clean the data: remove stray quotes with .replace(/'/g, '') and trim trailing whitespace with .trim(). The last entry has 'Radiohead' with quotes around it, and there's trailing whitespace at the end.

Solution

Explanation

This is a data-parsing question. Before you write any algorithm, you need to understand the shape of the data. That's actually the hardest part.

Reading the data

The string looks like a random list of names, but it's structured as repeating triplets: person, song, artist. So "Martin, By your name, NasX" means Martin listened to "By your name" by NasX. "Kie, Jigsaw, Radiohead" means Kie listened to "Jigsaw" by Radiohead. And so on.

Once you see the triplet pattern, the problem becomes much more approachable.
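To make the triplet pattern concrete, here is a minimal sketch that groups a flat array into person/song/artist records. The helper name toTriplets and the record shape are illustrative assumptions, not part of the original solution.

```javascript
// Hypothetical helper: group a flat array into { person, song, artist } records.
function toTriplets(elements) {
  const triplets = [];
  // i + 2 < length stops before a trailing partial group,
  // so every record has all three fields.
  for (let i = 0; i + 2 < elements.length; i += 3) {
    triplets.push({
      person: elements[i],
      song: elements[i + 1],
      artist: elements[i + 2],
    });
  }
  return triplets;
}

// First two entries of the prompt's data, already split on ", ".
const sample = "Martin, By your name, NasX, Kie, Jigsaw, Radiohead".split(", ");
const triplets = toTriplets(sample);

console.log(triplets.length);    // 2
console.log(triplets[1].person); // Kie
console.log(triplets[1].song);   // Jigsaw
```

The walkthrough below never materializes these records; it reads elements[i] and elements[i + 1] directly, which amounts to the same grouping.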

Cleaning the data

Before splitting, we clean up two issues in the raw string:

const elements = data.replace(/'/g, '').trim().split(', ');

The last artist name has quotes around it ('Radiohead'), so we strip all single quotes. There's also trailing whitespace at the end of the string, so we trim() it. Then we split by ", " to get an array of individual elements.
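A quick way to see what the cleanup buys you is to compare the last element with and without it. This sketch uses the data string from the prompt; the variable name raw is only for illustration.

```javascript
const data =
  "Martin, By your name, NasX, Kie, Jigsaw, Radiohead, Chris, Own worst enemy, Lit, Kirst, Dance yrself clean, LCD Soundsystem, Amy, Jigsw, Radiohead, Berta, Own worst enemy, Lit, Junko, Re-sublimity, KOTOKO, Chiaki, Jigsaw, 'Radiohead' ";

// Without cleaning, the final element keeps its quotes and trailing space.
const raw = data.split(", ");
console.log(JSON.stringify(raw[raw.length - 1])); // "'Radiohead' "

// With cleaning, the final element matches the other "Radiohead" entries.
const elements = data.replace(/'/g, "").trim().split(", ");
console.log(JSON.stringify(elements[elements.length - 1])); // "Radiohead"
console.log(elements.length); // 24, i.e. 8 triplets
```

One caveat worth mentioning in an interview: stripping every single quote would also remove apostrophes inside names or titles. This data has none, so the simple replace is safe here.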

Building the lookup tables

We walk through the array in steps of 3, pulling out the person and song from each triplet:

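const songCounts = {};    // song -> total plays
const songListeners = {}; // song -> { person -> plays }

// i + 2 < length skips a trailing incomplete triplet
for (let i = 0; i + 2 < elements.length; i += 3) {
  const person = elements[i];
  const song = elements[i + 1];

  songCounts[song] = (songCounts[song] || 0) + 1;

  if (!songListeners[song]) {
    songListeners[song] = {};
  }
  songListeners[song][person] =
    (songListeners[song][person] || 0) + 1;
}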

We're building two objects in a single pass:

  • songCounts tracks how many times each song was played. After processing, it looks something like { "Jigsaw": 2, "Own worst enemy": 2, "By your name": 1, ... }.
  • songListeners tracks who listened to each song and how many times. It's a nested object: { "Jigsaw": { "Kie": 1, "Chiaki": 1 }, "Own worst enemy": { "Chris": 1, "Berta": 1 }, ... }.
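Plain objects work fine for this data, but they inherit keys such as "constructor" from Object.prototype, so a song with one of those names would break the (count || 0) + 1 idiom. As a possible variation, not the approach used in this walkthrough, the same single pass can be sketched with Map. The function name buildTables and the "Ana" / "constructor" / "Nobody" entry are made up for illustration.

```javascript
// Sketch: the same single-pass tally, using Map instead of plain objects.
function buildTables(elements) {
  const songCounts = new Map();    // song -> total plays
  const songListeners = new Map(); // song -> Map(person -> plays)

  for (let i = 0; i + 2 < elements.length; i += 3) {
    const person = elements[i];
    const song = elements[i + 1];

    songCounts.set(song, (songCounts.get(song) ?? 0) + 1);

    if (!songListeners.has(song)) {
      songListeners.set(song, new Map());
    }
    const listeners = songListeners.get(song);
    listeners.set(person, (listeners.get(person) ?? 0) + 1);
  }
  return { songCounts, songListeners };
}

// Hypothetical input: the middle triplet uses a song title that collides
// with an inherited property name on plain objects.
const { songCounts, songListeners } = buildTables([
  "Kie", "Jigsaw", "Radiohead",
  "Ana", "constructor", "Nobody",
  "Chiaki", "Jigsaw", "Radiohead",
]);

console.log(songCounts.get("Jigsaw"));      // 2
console.log(songCounts.get("constructor")); // 1
```

Creating the lookup objects with Object.create(null) would be another way to avoid the inherited keys while keeping the bracket syntax.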

Finding the most popular song

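With songCounts built, finding the most popular song is a simple loop that tracks the maximum. Because the comparison is a strict >, a tie goes to the song that entered songCounts first. In this data "Jigsaw" and "Own worst enemy" both have 2 plays, so the result is "Jigsaw" with 2 plays: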

let maxPlays = 0;
let mostPlayedSong = null;

for (const song in songCounts) {
  // strict > keeps the earliest song when counts tie
  if (songCounts[song] > maxPlays) {
    maxPlays = songCounts[song];
    mostPlayedSong = song;
  }
}

Finding the top listener

getTopListener does the same thing but on a per-song basis. Given a song name, it looks up that song's listeners in songListeners and finds the person with the highest count.
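Putting the pieces together, one way the full function might look is sketched below. The prompt does not fix a return shape, so returning an object with maxPlays, mostPlayedSong, and getTopListener is an assumption, as is returning null for a song that never appears.

```javascript
function analyzeSongData(data) {
  // Clean, then split into a flat array of person, song, artist, ...
  const elements = data.replace(/'/g, "").trim().split(", ");

  const songCounts = {};    // song -> total plays
  const songListeners = {}; // song -> { person -> plays }

  for (let i = 0; i + 2 < elements.length; i += 3) {
    const person = elements[i];
    const song = elements[i + 1];

    songCounts[song] = (songCounts[song] || 0) + 1;
    if (!songListeners[song]) songListeners[song] = {};
    songListeners[song][person] = (songListeners[song][person] || 0) + 1;
  }

  // Most popular song; strict > means the earliest song wins a tie.
  let maxPlays = 0;
  let mostPlayedSong = null;
  for (const song in songCounts) {
    if (songCounts[song] > maxPlays) {
      maxPlays = songCounts[song];
      mostPlayedSong = song;
    }
  }

  // Same max-tracking loop, scoped to one song's listeners.
  function getTopListener(songName) {
    const listeners = songListeners[songName];
    if (!listeners) return null; // assumption: unknown song -> null

    let topListener = null;
    let topPlays = 0;
    for (const person in listeners) {
      if (listeners[person] > topPlays) {
        topPlays = listeners[person];
        topListener = person;
      }
    }
    return topListener;
  }

  return { maxPlays, mostPlayedSong, getTopListener };
}

const data =
  "Martin, By your name, NasX, Kie, Jigsaw, Radiohead, Chris, Own worst enemy, Lit, Kirst, Dance yrself clean, LCD Soundsystem, Amy, Jigsw, Radiohead, Berta, Own worst enemy, Lit, Junko, Re-sublimity, KOTOKO, Chiaki, Jigsaw, 'Radiohead' ";

const result = analyzeSongData(data);
console.log(result.maxPlays);                       // 2
console.log(result.mostPlayedSong);                 // Jigsaw
console.log(result.getTopListener("Jigsaw"));       // Kie
console.log(result.getTopListener("Unknown song")); // null
```

Every listener in this data played each song exactly once, so getTopListener also resolves ties by order of appearance: Kie shows up before Chiaki for "Jigsaw".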

What makes this question tricky

The algorithm itself is straightforward (count things, find the max). What trips people up is the parsing. If you don't realize the data is in triplets, you'll try to treat every comma-separated value as a song name and get completely lost. Take a moment to read the data carefully before coding. That's the real test.

You'll notice "Jigsw" in the data (missing the 'a'). That's not a typo you need to fix. Treat it as a different song entry, which is why "Jigsaw" ends up with 2 plays rather than 3. The data comes from user input, and in the real world, data is messy. The interviewer might ask you to discuss how you'd handle fuzzy matching or normalization as a follow-up.
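If that follow-up comes up, one possible direction is to normalize titles (trim, lowercase) and fold near-duplicates together using edit distance. The sketch below is only a starting point for discussion: canonicalTitle is a hypothetical helper, and the threshold of one edit is an arbitrary assumption that could wrongly merge short, genuinely different titles.

```javascript
// Levenshtein distance via dynamic programming, keeping two rows at a time.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // deletion
        curr[j - 1] + 1,   // insertion
        prev[j - 1] + cost // substitution, or free when characters match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: return an already-seen title within maxEdits of this
// one, otherwise register the normalized title as new.
function canonicalTitle(title, knownTitles, maxEdits = 1) {
  const normalized = title.trim().toLowerCase();
  for (const known of knownTitles) {
    if (editDistance(normalized, known) <= maxEdits) return known;
  }
  knownTitles.push(normalized);
  return normalized;
}

const known = [];
console.log(canonicalTitle("Jigsaw", known));          // jigsaw
console.log(canonicalTitle("Jigsw", known));           // jigsaw
console.log(canonicalTitle("Own worst enemy", known)); // own worst enemy
```

Note that this version treats the first spelling it sees as canonical, so the result depends on input order. A production approach would more likely match against a trusted catalog of titles.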