Hello Noah,
Thank you for reaching out to us :)
We have not backfilled the "top pageviews per country" data because, to
protect the privacy of our users, we filter out pages that have been seen
by fewer than 1000 actors in a day, and the data that allows us to apply
this filter is kept for only 90 days.
I have just created a task on our Phabricator board to investigate other
filtering methods that could allow us to release historical data, even if
it is less detailed (https://phabricator.wikimedia.org/T299627).
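
For illustration, the kind of threshold filter described above could look
roughly like the sketch below. It is only a sketch under assumptions, not the
production pipeline: the function name, the (country, page, actor_id) record
layout and the choice to count actors per country are all hypothetical. It
does show why the filter cannot be recomputed once the actor-level data has
been deleted.

```python
from collections import defaultdict

# Threshold quoted in the message above; everything else here is assumed.
MIN_ACTORS_PER_DAY = 1000


def filter_top_pages(views, min_actors=MIN_ACTORS_PER_DAY):
    """Keep only pages seen by at least `min_actors` distinct actors.

    views: iterable of (country, page, actor_id) tuples for a single day
           (hypothetical layout).
    Returns a dict mapping country -> list of (page, view_count),
    most viewed first.
    """
    actors = defaultdict(set)   # (country, page) -> distinct actor ids
    counts = defaultdict(int)   # (country, page) -> total views
    for country, page, actor_id in views:
        actors[(country, page)].add(actor_id)
        counts[(country, page)] += 1

    top = defaultdict(list)
    for (country, page), seen_by in actors.items():
        # Pages below the threshold are dropped entirely.
        if len(seen_by) >= min_actors:
            top[country].append((page, counts[(country, page)]))

    for pages in top.values():
        pages.sort(key=lambda item: item[1], reverse=True)
    return dict(top)
```

The distinct-actor count needs per-actor records, which is the data that is
kept for only 90 days.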
Sorry that we cannot help more for now, and best of luck with your studies :)

Joseph for the Data Engineering (ex-Analytics) team

On Tue, Jan 18, 2022 at 1:33 AM Noah Brunken Syrkis <n...@itu.dk> wrote:

> Hello,
>
>
> I noticed that the public API for daily top viewed pages per country[1]
> only goes back to Jan 1st, 2021. Could this be backfilled from other
> datasets to 2015, without too much effort on your part? The research team
> encouraged me to ask here, when I spoke with them about my need for the
> data—I'm a data science student at the IT University of Copenhagen doing a
> thesis on predicting country level human value survey responses[2] based on
> the top read Wikipedia pages in the given country.
>
>
> Thanks!
> Noah
>
>
> [1]
> https://wikimedia.org/api/rest_v1/#/Pageviews%20data/get_metrics_pageviews_top_per_country__country___access___year___month___day_
>
> [2] http://www.europeansocialsurvey.org/downloadwizard/
> _______________________________________________
> Analytics mailing list -- analytics@lists.wikimedia.org
> To unsubscribe send an email to analytics-le...@lists.wikimedia.org
>


-- 
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation
