Hi Willy,
(Forwarding your question to the public analytics list for others who might
know more.)

> Do you have any data that shows how many times audio files were
> downloaded in 2022?

I think your best bet is the Mediacounts dataset
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Mediacounts>,
which is available through a public API
<https://wikitech.wikimedia.org/wiki/Analytics/AQS/Mediarequests>.  E.g.,
to get the monthly number of requests for audio files in 2022:
https://wikimedia.org/api/rest_v1/metrics/mediarequests/aggregate/all-referers/audio/all-agents/monthly/20220101/20221231
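
If you want a single yearly total rather than twelve monthly figures, here is
a rough Python sketch that builds the URL above and sums the monthly counts.
It assumes the response is JSON with an "items" list whose entries carry a
"requests" count (worth checking against a real response first). The function
names and the User-Agent string are just placeholders I made up.

```python
import json
import urllib.request

BASE = "https://wikimedia.org/api/rest_v1/metrics/mediarequests/aggregate"


def build_url(media_type="audio", start="20220101", end="20221231",
              referer="all-referers", agent="all-agents",
              granularity="monthly"):
    """Build the mediarequests aggregate URL for a media type and date range."""
    return f"{BASE}/{referer}/{media_type}/{agent}/{granularity}/{start}/{end}"


def total_requests(payload):
    """Sum the per-period counts.

    Assumption: payload looks like {"items": [{"requests": <int>, ...}, ...]}.
    """
    return sum(item["requests"] for item in payload.get("items", []))


def fetch(url):
    """Fetch and decode the JSON response (needs network access)."""
    # Placeholder User-Agent; replace with something that identifies you.
    req = urllib.request.Request(
        url, headers={"User-Agent": "audio-stats-sketch/0.1 (placeholder)"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling total_requests(fetch(build_url())) should then give the 2022 total,
assuming the response shape above holds.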

However, it doesn't look like data transfer details are available in the
public API.  The backing dataset in Hive does have a total_response_size
field, so you could probably get the data transfer totals by querying Hive
directly.

Good luck!

On Wed, Feb 1, 2023 at 7:11 PM Willy Pao <w...@wikimedia.org> wrote:

> Hey Andrew - hope all is going well.  I've been working on gathering some
> data for Wikimedia's Annual Sustainability Report, and there was a question
> that Deb sent over regarding the usage of Audio files.  With Jaime's help
> from Data Persistence SRE, we were able to figure out some of the numbers
> around storage and energy consumption.  There was one part I was hoping you
> (or someone from your team) might be able to help with though.  Do you have
> any data that shows how many times audio files were downloaded in 2022?
> Much appreciated in advance.
>
> Thanks,
> Willy
>
> ---------- Forwarded message ---------
> From: Deb Tankersley <dtankers...@wikimedia.org>
> Date: Mon, Jan 30, 2023 at 1:41 PM
> Subject: energy used to store
> To: Willy Pao <w...@wikimedia.org>, Erin Morris <emor...@wikimedia.org>,
> Cassie Casares <ccasa...@wikimedia.org>
>
>
> Hey Willy!
>
> I got an interesting question (bolded below) from Wikimedia Sweden on the
> energy that we use to store and serve audio files. Here's their full
> comment / question:
>
> *"As part of my yearly planning for 2023, we are conducting a study
>> regarding digitization of audio tapes, which climate footprints the various
>> stages in the process generate and whether some of these can be made more
>> energy efficient. We have limited the study to audio tapes, because it is a
>> prioritized material category and a very data-intensive business, and
>> because the limitation hopefully gives us relatively accurate numbers.
>> Since we have been publishing digital audio originally from audio tapes on
>> Wikimedia Commons for the past few years, I was wondering if there are any
>> statistics related to energy consumption and carbon dioxide emissions
>> available?*
>>
>>
>> *What we would like to know is how much energy is required in the year
>> 2022 to store our total amount of uploaded audio files (with the exception
>> of Karl Tirén's phonograph recordings), how many times they have been
>> downloaded and how large a total amount of data is involved. We suspect
>> that downloading the high-resolution audio files is also relatively data
>> intensive. As mentioned, the goal is not to stop this activity, or even
>> reduce it without seeing how it looks and then investigating whether there
>> are any links in the chain that can be tweaked to possibly reduce the
>> climate impact. If numbers cannot be obtained, this is also valuable
>> information."*
>>
>
>
> I'm not sure if we can narrow down this enough to get them a decent /
> solid answer. What are your thoughts?
>
>
> Thanks,
>
>
> Deb
>
> --
>
> deb tankersley (she/her)
>
> senior program manager, engineering
>
> Wikimedia Foundation
>
>
>
>
>
_______________________________________________
Analytics mailing list -- analytics@lists.wikimedia.org
To unsubscribe send an email to analytics-le...@lists.wikimedia.org