Update: the original Shiny instance went down due to server load soon
after release. It's now up again at http://datavis.wmflabs.org/where/
on a dedicated Labs machine, where we hope to put...many more
visualisations. It also now has mapping, largely thanks to Sarah
Laplante (http://sarahlaplante.com/), and soon it will hopefully be
/non-hideous/ mapping (the current mass of blue and grey is because my
aesthetic tastes are...I don't actually have any aesthetic tastes).

On 2 March 2015 at 22:36, Oliver Keyes <oke...@wikimedia.org> wrote:
> Indeed! Orienting it that way (pivoting on language rather than
> project) is something several people have asked for; I plan to spend a
> chunk of my spare time (that is, recreational time) trying to make it
> work. Should be fairly trivial.
>
> On 2 March 2015 at 09:55, h <hant...@gmail.com> wrote:
>> Hello Finn,
>>    I do not have a specific answer to your question. However, it might be
>> worthwhile to add Finnish to the comparison, given the CLDR 26
>> territory-language information:
>> http://www.unicode.org/cldr/charts/26/supplemental/territory_language_information.html
>>
>>    Sweden has a sizable population of Finnish speakers:
>>
>> Swedish {O} sv 95.0% 99.0%
>> Finnish {OR} fi 2.2%
>>
>>     So if a similar query is run for the Finnish Wikipedia and the results
>> also show an "undue" proportion of visits from Sweden, then the anomaly you
>> observed is not that unique. We probably need many iterations of
>> comparative outcomes and normalization of the data (Sweden does have a larger
>> population).  Also, it might be handy to have some statistics on immigration
>> or residence; this is the EU, after all. I would not be surprised if, for
>> example, visits from Oxford to Wikipedia included a sizable share of
>> German-language requests.
>>
>>     I am still a bit bothered by the number "1" in the current dataset. It
>> does not feel right, since the difference between 1.4% and 0.6% is notable
>> in this regard. Perhaps we need a higher-precision "universal
>> percentage" figure for each territory-language pair. It would also be great
>> to do another kind of aggregation: i.e. given a territory, which language
>> versions of Wikipedia are accessed....
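
The per-territory aggregation suggested above could look roughly like the sketch below. It assumes absolute pageview counts per country and project are available, which the published file (rounded percentages only) does not provide, so the counts and the helper name `shares_by_territory` are hypothetical and chosen only to reproduce the 1.4% versus 0.6% case.

```python
from collections import defaultdict

# HYPOTHETICAL absolute pageview counts per (country, project) pair.
# The published dataset only carries rounded percentages, so these
# numbers are invented purely to illustrate the calculation.
counts = [
    ("Sweden", "sv.wikipedia.org", 6800),
    ("Sweden", "en.wikipedia.org", 3000),
    ("Sweden", "fi.wikipedia.org", 140),
    ("Sweden", "da.wikipedia.org", 60),
    ("Denmark", "da.wikipedia.org", 5000),
    ("Denmark", "en.wikipedia.org", 3000),
]


def shares_by_territory(rows, digits=1):
    """For each country, give each project's share of that country's pageviews (%)."""
    totals = defaultdict(int)
    for country, _project, views in rows:
        totals[country] += views
    shares = defaultdict(dict)
    for country, project, views in rows:
        shares[country][project] = round(100 * views / totals[country], digits)
    return dict(shares)


# One decimal place keeps small shares distinguishable.
by_territory = shares_by_territory(counts)
# Whole percentages, as in the current dataset, collapse them.
by_territory_rounded = shares_by_territory(counts, digits=0)

for project, pct in sorted(by_territory["Sweden"].items(), key=lambda kv: -kv[1]):
    print(f"{project:20s} {pct:5.1f}%")
```

With one decimal place the two small shares stay distinct; rounded to whole percentages they both become 1, which is the loss of information described above.
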
>>
>> Best,
>> han-teng liao
>>
>> 2015-03-02 13:54 GMT+01:00 Finn Årup Nielsen <f...@imm.dtu.dk>:
>>>
>>> Hi Oliver,
>>>
>>>
>>> Interesting dataset! I am curious about why the Danish Wikipedia is so
>>> highly accessed from Sweden. Could it be an error, e.g., with Telia
>>> IP-numbers?
>>>
>>> In Python:
>>>
>>> >>> import pandas as pd
>>> >>> df = pd.read_csv(
>>> ...     'http://files.figshare.com/1923822/language_pageviews_per_country.tsv',
>>> ...     sep='\t')
>>> >>> df.loc[df.project == 'da.wikipedia.org',
>>> ...        ['country', 'pageviews_percentage']].set_index('country')
>>>                 pageviews_percentage
>>> country
>>> Austria                            1
>>> China                              1
>>> Denmark                           61
>>> Estonia                            1
>>> France                             1
>>> Germany                            2
>>> Netherlands                        2
>>> Norway                             1
>>> Sweden                            18
>>> United Kingdom                     3
>>> United States                      3
>>> Other                              5
>>>
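
A quick sanity check on the figures quoted above, using only the numbers in this message and plain Python (no download needed). The variable names are illustrative. The rounded shares do not add up to exactly 100, which is consistent with the values having been rounded to whole percentages.

```python
# Rounded shares copied from the da.wikipedia.org output quoted above.
da_shares = {
    "Austria": 1, "China": 1, "Denmark": 61, "Estonia": 1, "France": 1,
    "Germany": 2, "Netherlands": 2, "Norway": 1, "Sweden": 18,
    "United Kingdom": 3, "United States": 3, "Other": 5,
}

# Sum of all rounded shares, including the "Other" bucket.
total = sum(da_shares.values())

# Countries ranked by share, largest first.
ranked = sorted(da_shares.items(), key=lambda kv: kv[1], reverse=True)

print("sum of rounded shares:", total)
print("top two:", ranked[:2])
```
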
>>>
>>> MaxMind has some numbers on their own accuracy:
>>>
>>> https://www.maxmind.com/en/geoip2-city-database-accuracy
>>>
>>> For Denmark 85% is "Correctly Resolved", for Sweden only 68%. I wonder if
>>> this really could bias the result so much.
>>>
>>> If the numbers are correct, why would Swedes read the Danish Wikipedia
>>> so much? Bots? It does not apply the other way around: only 2% of the
>>> traffic to Swedish Wikipedia comes from Denmark.
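
One rough way to gauge whether geolocation error alone could account for the Swedish share is a back-of-envelope bound. The sketch below takes the extreme assumption that every request attributed to Sweden really came from Denmark, and asks what fraction of genuinely Danish traffic would then have to be placed in the wrong country. It uses only the rounded shares quoted in this thread and says nothing about how MaxMind's errors are actually distributed.

```python
# Rounded shares of da.wikipedia.org traffic, from the output earlier in this thread.
denmark_observed = 61
sweden_observed = 18

# ASSUMPTION (extreme case): every request attributed to Sweden actually
# originated in Denmark and was geolocated to the wrong country.
denmark_true = denmark_observed + sweden_observed

# Fraction of genuinely Danish traffic that would have to be misplaced in Sweden.
required_misattribution = sweden_observed / denmark_true

print(f"required misattribution rate: {required_misattribution:.1%}")
```

Under that assumption, a bit over a fifth of all Danish traffic would need to be resolved to Sweden, so the question becomes whether a country-level error of that size is plausible.
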
>>>
>>>
>>>
>>> best regards
>>> Finn
>>>
>>>
>>>
>>> On 02/25/2015 10:06 PM, Oliver Keyes wrote:
>>>>
>>>> Hey all!
>>>>
>>>> We've released a highly-aggregated dataset of readership data -
>>>> specifically, data about where, geographically, traffic to each of our
>>>> projects (and all of our projects) comes from. The data can be found
>>>> at http://dx.doi.org/10.6084/m9.figshare.1317408 - additionally, I've
>>>> put together an exploration tool for it at
>>>> https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/
>>>>
>>>> Hope it's useful to people!
>>>>
>>>
>>>
>>> --
>>> Finn Årup Nielsen
>>> http://people.compute.dtu.dk/faan/
>>>
>>>
>>> _______________________________________________
>>> Wiki-research-l mailing list
>>> Wiki-research-l@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>
>>
>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation



-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation
