Hi Willy,

(Forwarding your question to the public analytics list for others who might know more.)

> Do you have any data that shows how many times audio files were downloaded in 2022?

I think your best bet is the Mediacounts dataset
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Mediacounts>,
which is available in a public API
<https://wikitech.wikimedia.org/wiki/Analytics/AQS/Mediarequests>.

E.g., to get the number of audio file requests in 2022:
https://wikimedia.org/api/rest_v1/metrics/mediarequests/aggregate/all-referers/audio/all-agents/monthly/20220101/20221231

However, it doesn't look like data transfer details are available in the
public API. The backing dataset in Hive does have a total_response_size
field, so you could probably get the total amount of data transferred by
querying for it in Hive.

Good luck!

On Wed, Feb 1, 2023 at 7:11 PM Willy Pao <w...@wikimedia.org> wrote:

> Hey Andrew - hope all is going well. I've been working on gathering some
> data for Wikimedia's Annual Sustainability Report, and there was a question
> that Deb sent over regarding the usage of Audio files. With Jaime's help
> from Data Persistence SRE, we were able to figure out some of the numbers
> around storage and energy consumption. There was one part I was hoping you
> (or someone from your team) might be able to help with though. Do you have
> any data that shows how many times audio files were downloaded in 2022?
> Much appreciated in advance.
>
> Thanks,
> Willy
>
> ---------- Forwarded message ---------
> From: Deb Tankersley <dtankers...@wikimedia.org>
> Date: Mon, Jan 30, 2023 at 1:41 PM
> Subject: energy used to store
> To: Willy Pao <w...@wikimedia.org>, Erin Morris <emor...@wikimedia.org>,
> Cassie Casares <ccasa...@wikimedia.org>
>
> Hey Willy!
>
> I got an interesting question (bolded below) from Wikimedia Sweden on the
> energy that we use to store and serve audio files. Here's their full
> comment / question:
>
> *"As part of my yearly planning for 2023, we are conducting a study
>> regarding digitization of audio tapes, which climate footprints the various
>> stages in the process generate and whether some of these can be made more
>> energy efficient. We have limited the study to audio tapes, because it is a
>> prioritized material category and a very data-intensive business, and
>> because the limitation hopefully gives us relatively accurate numbers.
>> Since we have been publishing digital audio originally from audio tapes on
>> Wikimedia Commons for the past few years, I was wondering if there are any
>> statistics related to energy consumption and carbon dioxide emissions
>> available?*
>>
>> *What we would like to know is how much energy is required in the year
>> 2022 to store our total amount of uploaded audio files (with the exception
>> of Karl Tirén's phonograph recordings), how many times they have been
>> downloaded and how large a total amount of data is involved. We suspect
>> that downloading the high-resolution audio files is also relatively data
>> intensive. As mentioned, the goal is not to stop this activity, or even
>> reduce it without seeing how it looks and then investigating whether there
>> are any links in the chain that can be tweaked to possibly reduce the
>> climate impact. If numbers cannot be obtained, this is also valuable
>> information."*
>
> I'm not sure if we can narrow down this enough to get them a decent /
> solid answer. What are your thoughts?
>
> Thanks,
>
> Deb
>
> --
> deb tankersley (she/her)
> senior program manager, engineering
> Wikimedia Foundation
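
For anyone following up on the reply at the top of this thread, here is a rough Python sketch of how the suggested Mediarequests query could be built and summed into a yearly total. The helper names (`build_url`, `total_requests`, `fetch`) and the User-Agent string are made up for illustration, and the sketch assumes the JSON response carries an `items` list whose entries each have a numeric `requests` field; please check that against the API documentation before relying on it.

```python
import json
import urllib.request

# Endpoint prefix taken from the example URL in the reply above.
BASE = "https://wikimedia.org/api/rest_v1/metrics/mediarequests/aggregate"


def build_url(start, end, media_type="audio", referer="all-referers",
              agent="all-agents", granularity="monthly"):
    """Assemble the aggregate URL; start and end are YYYYMMDD strings."""
    return f"{BASE}/{referer}/{media_type}/{agent}/{granularity}/{start}/{end}"


def total_requests(payload):
    """Sum the per-period counts.

    Assumption: payload looks like {"items": [{"requests": <int>, ...}, ...]}.
    """
    return sum(item["requests"] for item in payload.get("items", []))


def fetch(url):
    """Fetch and decode the JSON response (performs a live HTTP request)."""
    # Placeholder User-Agent; replace with something that identifies you.
    headers = {"User-Agent": "audio-requests-sketch/0.1 (you@example.org)"}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `total_requests(fetch(build_url("20220101", "20221231")))` would issue the request and add up the twelve monthly counts. This counts requests only; as noted in the reply, the amount of data transferred would still have to come from the Hive dataset.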
_______________________________________________
Analytics mailing list -- analytics@lists.wikimedia.org
To unsubscribe send an email to analytics-le...@lists.wikimedia.org