Hello fellow podcasters! I’m writing with some good news about an important listener analytics update. This newsletter is a bit longer than usual, but the upcoming changes are worth a more detailed discussion, and I hope you’ll stick around and read the whole post. But just in case you’re too busy …
TL;DR: Fireside’s stats are becoming more accurate, comprehensive, and conservative as we work to improve podcast listener analytics. This change follows the current industry trend toward more refined download tracking. Some Fireside podcasters will see a decrease in download numbers as we implement more conservative guidelines for listener metrics, in line with new industry standards.
Keep reading for the details!
I started podcasting as a hobby in 2006, and went full-time in 2008 with the launch of 5by5.tv. Back in those days, tracking downloads and understanding listeners was incredibly difficult, much more so than today. There were few podcast hosting companies, and the ones that existed didn’t offer any real insight. So when it came time to launch 5by5, I built my own tracking system to support it. Over the last 11+ years, we’ve worked hard at improving and refining this system. And today, we’re going to be introducing some of the biggest changes yet. More on that in a bit.
The way trackers work is simple: when a podcast client (like Apple Podcasts, Overcast, Pocket Casts, etc.) requests an MP3 file, the tracker receives the request, logs the details (which MP3 file the client wants, their public IP address, the name of the app, and similar public data), and then redirects the client (using a 302 redirect, since you asked) to the file’s origin, typically on a super-fast CDN. Then we tally up those records and we have a perfect, complete download picture. Right?
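For readers who think in code, here is a minimal Python sketch of that log-then-redirect flow. It is an illustration only, not Fireside’s actual tracker: the CDN address, the `RequestRecord` type, and the `track_and_redirect` function are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical CDN origin; a real tracker would look this up per podcast.
CDN_BASE = "https://cdn.example.com"


@dataclass
class RequestRecord:
    path: str                    # which MP3 file the client wants
    ip: str                      # the client's public IP address
    user_agent: str              # the name of the app
    range_header: Optional[str]  # raw Range header, if the client sent one


def track_and_redirect(
    path: str,
    ip: str,
    user_agent: str,
    range_header: Optional[str],
    log: List[RequestRecord],
) -> Tuple[int, Dict[str, str]]:
    """Log the request details, then answer with a 302 pointing at the CDN."""
    log.append(RequestRecord(path, ip, user_agent, range_header))
    return 302, {"Location": CDN_BASE + path}


log: List[RequestRecord] = []
status, headers = track_and_redirect(
    "/episodes/42.mp3", "203.0.113.7", "Overcast", None, log
)
print(status, headers["Location"], len(log))
```

Counting the entries in `log` is the naive tally: every request that reaches the tracker adds one record.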
Unfortunately, it’s not that simple. Back in 2008, I was surprised by the numbers I was seeing, and it didn’t seem possible that my podcasts could be getting such huge downloads. Research revealed that I wasn’t accounting for Byte-Range Requests (or range requests for short).
If you’re curious, here’s a very detailed explanation of range requests, but here’s a simpler explanation:
When a podcast client requests an MP3 file, it actually makes several requests – sometimes as many as 5 or even 10 – asking for the file to be split up into chunks (based on a specified range, hence the name range request) that can be downloaded simultaneously, in order to take advantage of our modern-day, high-bandwidth internet connections to reduce the time it takes to download the file. Those chunks are then reassembled by the podcast client into a single MP3, ready to be played. All of this happens instantly and behind the scenes. This is awesome because we get the file faster than we would with just one request, but it presents a problem for download tracking because each of those requests is counted separately by trackers, which have no easy way of identifying a single client.
In other words: one “download” of your MP3 file could potentially look like five or more separate requests.
It’s therefore necessary to examine every request and determine if it is an initial request or one of the (potentially many) additional requests for the file. To further complicate issues, some clients send every request through the tracker, while others only send the first request, get the redirect, and then talk only to the CDN for delivery of the file. We see this kind of inconsistency throughout the collection and analysis of podcast download metrics.
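One simple way to separate an initial request from a follow-up is to look at the HTTP Range header the client sends. The sketch below assumes that a request with no Range header, or with a range starting at byte 0, is the initial one, and that anything starting further into the file is a follow-up chunk. That rule and the function name `is_initial_request` are assumptions for illustration; real trackers use more signals than this.

```python
import re
from typing import Optional

# Matches a single byte range such as "bytes=0-" or "bytes=1048576-2097151".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def is_initial_request(range_header: Optional[str]) -> bool:
    """Guess whether a request is the first one for a file.

    Assumed rule: no Range header, or a range that starts at byte 0,
    means "initial". Anything else is treated as a follow-up chunk.
    """
    if range_header is None:
        return True  # plain request for the whole file
    match = _RANGE_RE.match(range_header.strip())
    if match is None:
        # Assumption: a header this sketch cannot parse is treated like a
        # whole-file request rather than being silently dropped.
        return True
    start = match.group(1)
    # A suffix range such as "bytes=-500" has no start offset, so it is
    # a request for the tail of the file, not the beginning.
    return start != "" and int(start) == 0


print(is_initial_request(None))
print(is_initial_request("bytes=0-"))
print(is_initial_request("bytes=1048576-2097151"))
```

This only helps when every chunk request actually passes through the tracker, which, as noted above, is not something every client does.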
As with any industry, the early years – and we are still very much in the early years of podcasting – are the Wild West. There’s no standard for anything, but as time goes on, standards begin to emerge.
One of the ways podcast hosts and analytics companies have tried to mitigate the issue of range requests is by counting only the very first request for a file from a specific IP address and podcast client, and throwing out every subsequent request coming from that IP address and client for a period of time (somewhere between 1 and 24 hours).
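Here is a rough sketch of that windowing technique in Python. The one-hour default, the tuple layout, and the name `count_downloads` are assumptions chosen for the example; the window could be anywhere in the 1 to 24 hour range mentioned above.

```python
from typing import Dict, List, Tuple

# Each request: (timestamp in seconds, IP address, client name, file name)
Request = Tuple[float, str, str, str]


def count_downloads(requests: List[Request], window_seconds: float = 3600.0) -> int:
    """Count only the first request per (IP, client, file) in each window."""
    last_counted: Dict[Tuple[str, str, str], float] = {}
    total = 0
    for timestamp, ip, client, filename in sorted(requests):
        key = (ip, client, filename)
        previous = last_counted.get(key)
        if previous is None or timestamp - previous >= window_seconds:
            # First request for this key, or the window has expired.
            total += 1
            last_counted[key] = timestamp
        # Otherwise the request is thrown out as a likely duplicate.
    return total


# Five range requests from one listener, a fraction of a second apart.
burst = [(t, "203.0.113.7", "Overcast", "ep42.mp3") for t in (0.0, 0.1, 0.2, 0.3, 0.4)]
print(count_downloads(burst))  # 1
```

In this sketch the window starts at the first counted request, so a repeat from the same IP address and client two hours later would count as a new download under the one-hour default.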
At first, this technique seems solid, but there are edge cases. For example, consider a scenario where there are many people, such as those at a large company, who all use the same podcast client. Let’s pretend that across the company, there are 100 people using Overcast to subscribe and listen to your podcast. When your new episode drops, each of those apps gets an update simultaneously and begins downloading it at the same exact time. They’re all using the same client and accessing the Internet through the same gateway at their company … meaning they all have the same IP address and the same client identifier.
This means only one of those requests – the very first one – would be counted as a download. The other 99 people’s downloads would be thrown out. The upside is you wouldn’t see hundreds of falsely reported downloads due to range requests. The downside is that you’d see only 1 download instead of the true 100 downloads.
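To put numbers on the office example, the short Python sketch below compares the three counts. The figure of five requests per download is an assumption borrowed from the range request discussion earlier, and the IP address and file name are placeholders.

```python
listeners = 100            # people at the company using the same app
requests_per_download = 5  # assumed number of range requests per download

# Every request looks identical to the tracker: same IP, same client, same file.
requests = [
    ("203.0.113.7", "Overcast", "ep42.mp3")
    for _ in range(listeners * requests_per_download)
]

naive_count = len(requests)          # count every request: 500
deduped_count = len(set(requests))   # one per (IP, client, file): 1
true_count = listeners               # what actually happened: 100

print(naive_count, deduped_count, true_count)
```

Neither automated count matches reality: the naive tally overshoots by a factor of five, and the deduplicated one misses 99 real listeners.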
There are ways to help mitigate this problem, though, like using intelligently built whitelists, machine learning, human intervention, etc. But you get the general gist of the problem, and why it’s so tricky to accurately and effectively understand and track true listener downloads.
And even those methods don’t solve all the problems. Here’s another one: in order to really be sure of a download, it’s recommended to verify that the ID3 tag (where the metadata and cover image are stored) plus enough of the podcast content to play for one minute have been downloaded. Unfortunately, many podcast apps fail to send all of their requests through hosting companies’ trackers, or fail to do it consistently, meaning only the first request is recorded and the download, despite being valid, would also have to be thrown out!
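The one-minute check comes down to a little arithmetic on byte ranges. The sketch below assumes a constant-bitrate MP3 and that the tracker knows the size of the ID3 tag and which byte ranges each client was served; the function names and the sample numbers are hypothetical.

```python
from typing import List, Tuple


def bytes_for_one_minute(id3_bytes: int, bitrate_kbps: int) -> int:
    """ID3 tag size plus 60 seconds of audio at a constant bitrate."""
    return id3_bytes + bitrate_kbps * 1000 * 60 // 8


def covers_prefix(ranges: List[Tuple[int, int]], needed: int) -> bool:
    """True if the inclusive (start, end) ranges cover bytes 0..needed-1 with no gap."""
    covered_upto = 0  # number of contiguous bytes covered from offset 0
    for start, end in sorted(ranges):
        if start > covered_upto:
            break  # gap before this range, so the prefix is incomplete
        covered_upto = max(covered_upto, end + 1)
    return covered_upto >= needed


# Example: a 100,000-byte ID3 tag on a 128 kbps file.
needed = bytes_for_one_minute(100_000, 128)
print(needed)  # 1060000
print(covers_prefix([(0, 499_999), (500_000, 1_199_999)], needed))
print(covers_prefix([(0, 499_999)], needed))
```

If only the first request passes through the tracker, the list of ranges it knows about is too short for this check to succeed, which is exactly how a valid download ends up discarded.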
Every podcast hosting company and analytics service has had its own methods and algorithms for solving these issues and for calculating podcast downloads. Unfortunately, until very recently, there’s been little or no collaboration, discussion, or consensus. This has created much confusion in the industry, for podcasters, advertisers, and the companies that support them.
We are all learning together, as a community.
Our goal at Fireside has always been to provide a wonderful podcast hosting experience. This includes presenting as accurate a picture of your listeners as possible. In the past, Fireside has been on the more forgiving side when tracking downloads, using a variety of techniques to help determine uniques, and filtering out bogus requests, bots, duplicate requests, etc. in the best and most fair way possible.
Recently, due primarily to the emergence of IAB’s Podcast Measurement Guidelines, there’s been a strong push across the industry to be significantly more conservative about how downloads are analyzed, and we are finally seeing a consensus regarding podcast download measurement. As such, we will be moving in this direction along with the rest of the industry, toward more conservative download tracking algorithms.
This change means that your Fireside metrics will be much more accurate and valuable, and will represent an even clearer picture of your listeners, while also eliminating false downloads from bots, scrapers, and other abusive services. It is likely, though, that some Fireside users will see a reduction in their download numbers as we move closer to IAB compliance.
As the industry as a whole moves quickly in this direction, your Fireside metrics will align more closely with those from other analytics services such as Chartable, and with those from other hosting providers who are implementing these new standards for download measurement.
You can expect to see changes in Metrics starting on Friday, July 5th. If you’re an existing customer, this change will only be visible for new downloads occurring after July 5th. If you’ve created a new podcast on or after that date, you’ll see the new numbers from the start.
Additionally, the new standard follows all of the same privacy guidelines we’ve had since the beginning. Nothing has changed or will change regarding privacy – Fireside doesn’t collect or store any specific information about your listeners, other than publicly visible details such as IP address and user agent identifiers (like device type and client application).
We’ve spent many, many hours working hard to improve the accuracy and intelligence of Fireside’s metrics, and we’re still not done (and there will always be room to grow and improve). It is my sincere hope that these improvements will be valuable tools to help you better understand your listeners, and aid in your growth as a podcaster.
If you have questions or concerns, please do not reply to this newsletter, as we will not receive it. Instead, visit our support page and submit a ticket, and we will respond as quickly as possible.
Thank you for being a Fireside customer, and happy podcasting!
Dan Benjamin
Founder, Fireside.fm