Thanks for this, Erik. This can be helpful for a variety of projects including https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Behaviour/Robustness_across_languages and the next steps for this project.
L

On Wednesday, July 11, 2018, Erik Zachte <ezac...@wikimedia.org> wrote:

> Today I released two new json files [2][4].
> Both complement visualization 'Wikipedia Views Visualized' [1] (aka
> WiViVi), but both can be useful in other contexts as well.
>
> 1) File 'demographics_from_world_bank_for_wikimedia.json' [2] resulted from
> harvesting World Bank API files.
> It contains yearly figures for four metrics: (more could be added rather
> easily):
> - population counts,
> - percentage internet users,
> - percentage mobile subscriptions,
> - GDP per capita.
> The following static demographics charts on meta are also based on these
> metrics: [3]
>
> 2) File 'datamaps-data.json' [4] contains the equivalent of 3 rather
> complex (*) csv files which feed WiViVi. This brings together demographics
> data and pageviews (by country, by region, and by language), and also adds
> additional meta info. This json file is meant for external use, as it's
> much easier to parse than the 3 csv files WiViVi uses itself [5].
>
> (*) complex, as the csv files use a hierarchy based on nested delimiters
>
> --
>
> Details:
>
> World Bank files have different formats (some csv, some json) and use a
> variety of indexes (some use ISO 3166-1 alpha-2 codes, others ..-alpha-3).
> Script 1) first does normalization, then data are aggregated, filtered,
> indexed.
>
> Json file 1) replaces two csv files which up to now were filled from
> Wikipedia pages [6][7].
> Also, although Wikipedia lists nowadays also use World Bank data, this is
> not consistently done, see [8][9].
>
> [1] Viz: https://stats.wikimedia.org/wikimedia/animations/wivivi/wivivi.html
> [2] Json: https://stats.wikimedia.org/wikimedia/animations/wivivi/world-bank-demographics.json
>     Script: https://github.com/wikimedia/analytics-wikistats/tree/master/worldbank
> [3] Charts: https://meta.wikimedia.org/wiki/World_Bank_demographics
> [4] Json: https://stats.wikimedia.org/wikimedia/animations/wivivi/datamaps-data.json
>     Script: https://github.com/wikimedia/analytics-wikistats/tree/master/traffic
> [5] Syntax: https://stats.wikimedia.org/wikimedia/animations/wivivi/data.html
> [6] Article: https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population
> [7] Article: https://en.wikipedia.org/wiki/List_of_countries_by_number_of_Internet_users
> [8] Talk page: https://bit.ly/2L5Z2P4 section 'Wikipedia vs Worldbank
>     population counts'
> [9] Talk page: https://bit.ly/2NJUoIu section 'Wikipedia vs Worldbank
>     internet percentages'
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>

--
Leila Zia
Senior Research Scientist
Wikimedia Foundation