Thanks for this, Erik. This can be helpful for a variety of projects
including
https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Behaviour/Robustness_across_languages
and the next steps for this project.

L

On Wednesday, July 11, 2018, Erik Zachte <ezac...@wikimedia.org> wrote:

>  Today I released two new json files [2][4].
> Both complement visualization 'Wikipedia Views Visualized' [1] (aka
> WiViVi), but both can be useful in other contexts as well.
> 1) File 'demographics_from_world_bank_for_wikimedia.json' [2] resulted from
> harvesting World Bank API files.
> It contains yearly figures for four metrics (more could be added rather
> easily):
> - population counts,
> - percentage internet users,
> - percentage mobile subscriptions,
> - GDP per capita.
> The following static demographics charts on meta are also based on these
> metrics: [3]
> 2) File 'datamaps-data.json' [4] contains the equivalent of 3 rather
> complex (*) csv files which feed WiViVi. This brings together demographics
> data and pageviews (by country, by region, and by language), and also adds
> additional meta info. This json file is meant for external use, as it's
> much easier to parse than the 3 csv files WiViVi uses itself [5].
> (*) complex, as the csv files use a hierarchy based on nested delimiters
> --
> Details:
> World Bank files have different formats (some csv, some json) and use a
> variety of indexes (some use ISO 3166-1 alpha-2 codes, others ..-alpha-3).
> Script 1) first normalizes the data, then aggregates, filters, and
> indexes them.
> Json file 1) replaces two csv files which up to now were filled from
> Wikipedia pages [6][7].
> Also, although Wikipedia lists nowadays use World Bank data as well, this
> is not done consistently; see [8][9].
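
For readers who want a feel for the normalize / aggregate / filter / index
steps described above, here is a minimal Python sketch. It is not the
wikistats script linked under [2]. The column names (code, year, value),
the tiny alpha-3 lookup table, and every function name are assumptions
made up for illustration only; the real World Bank files and the real
script may look quite different.

```python
import csv
import io
import json

# Hypothetical, deliberately tiny alpha-3 -> alpha-2 lookup.
# A real script would need the full ISO 3166-1 table.
ALPHA3_TO_ALPHA2 = {"NLD": "NL", "USA": "US", "IND": "IN"}


def normalize_code(code):
    """Return an alpha-2 code, or None if a 3-letter code is not in the lookup."""
    code = code.strip().upper()
    if len(code) == 2:
        return code
    return ALPHA3_TO_ALPHA2.get(code)


def rows_from_csv(text, metric):
    """Yield (code, year, metric, value) from CSV text (assumed columns: code, year, value)."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["code"], int(row["year"]), metric, float(row["value"])


def rows_from_json(text, metric):
    """Yield (code, year, metric, value) from a JSON list of records (assumed keys as above)."""
    for rec in json.loads(text):
        yield rec["code"], int(rec["year"]), metric, float(rec["value"])


def build_index(rows, min_year=None):
    """Merge rows from all sources into {alpha2: {metric: {year: value}}}.

    Rows with an unknown country code, or older than min_year, are dropped.
    """
    index = {}
    for code, year, metric, value in rows:
        alpha2 = normalize_code(code)
        if alpha2 is None or (min_year is not None and year < min_year):
            continue
        index.setdefault(alpha2, {}).setdefault(metric, {})[year] = value
    return index
```

Feeding the rows from both readers into build_index gives one dictionary
keyed by alpha-2 code, regardless of which input format or code style a
given source file used.
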
> [1] Viz:
> https://stats.wikimedia.org/wikimedia/animations/wivivi/wivivi.html
> [2] Json:
> https://stats.wikimedia.org/wikimedia/animations/wivivi/world-bank-demographics.json
>     Script:
> https://github.com/wikimedia/analytics-wikistats/tree/master/worldbank
> [3] Charts: https://meta.wikimedia.org/wiki/World_Bank_demographics
> [4] Json:
> https://stats.wikimedia.org/wikimedia/animations/wivivi/datamaps-data.json
>     Script:
> https://github.com/wikimedia/analytics-wikistats/tree/master/traffic
> [5] Syntax:
> https://stats.wikimedia.org/wikimedia/animations/wivivi/data.html
> [6] Article:
> https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population
> [7] Article:
> https://en.wikipedia.org/wiki/List_of_countries_by_number_of_Internet_users
> [8] Talk page: https://bit.ly/2L5Z2P4 section 'Wikipedia vs Worldbank
> population counts'
> [9] Talk page: https://bit.ly/2NJUoIu section 'Wikipedia vs Worldbank
> internet percentages'
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>


--
Leila Zia
Senior Research Scientist
Wikimedia Foundation