Re: [Foundation-l] Where do our readers come from? QA

2010-01-18 Thread Joan Goma
There are three phenomena acting simultaneously against the number of visits
to small projects: the bilingual effect, the size effect, and the Google
effect. For the Catalan case we estimate a penalization factor of 8.3 (that
is, visits are 8.3 times lower than they should be). It comes from: a
bilingual factor of 1.2 (visits lost because people also understand other
languages; even when they can read the article in their mother tongue, they
also read it in others); a size factor of 2.5 (visits that go to other
projects because readers don't find what they were looking for in their
mother tongue); and a Google factor of 2.77 (visits lost because Google
directs people to projects in other languages). The only positive factor is
the bilingual one. We are working hard to correct the others. For other
projects these factors can be very different, but the concept still applies.
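
The arithmetic behind the combined figure can be checked with a short sketch. It assumes the three factors combine multiplicatively, which the message implies but does not state; the helper `expected_visits` is hypothetical and only illustrates how the factor would be applied.

```python
from math import prod

# Factors quoted in the message for the Catalan Wikipedia.
factors = {"bilingual": 1.2, "size": 2.5, "google": 2.77}

# Assumption: the three effects are independent and multiply together.
penalization = prod(factors.values())


def expected_visits(observed_visits: float) -> float:
    """Visits the project would get without the three effects (hypothetical helper)."""
    return observed_visits * penalization


print(round(penalization, 2))
```

Under that assumption the product is about 8.3, which matches the estimate quoted above.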



___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


Re: [Foundation-l] Where do our readers come from? QA

2010-01-18 Thread Marcus Buck
Joan Goma wrote:
 There are three phenomena acting simultaneously against the number of visits
 to small projects: the bilingual effect, the size effect, and the Google
 effect. For the Catalan case we estimate a penalization factor of 8.3 (that
 is, visits are 8.3 times lower than they should be). It comes from: a
 bilingual factor of 1.2 (visits lost because people also understand other
 languages; even when they can read the article in their mother tongue, they
 also read it in others); a size factor of 2.5 (visits that go to other
 projects because readers don't find what they were looking for in their
 mother tongue); and a Google factor of 2.77 (visits lost because Google
 directs people to projects in other languages). The only positive factor is
 the bilingual one. We are working hard to correct the others. For other
 projects these factors can be very different, but the concept still applies.

Interesting. What's the math behind those numbers? Or the source?

Marcus Buck



Re: [Foundation-l] Where do our readers come from? QA

2010-01-18 Thread William Pietri
On 01/18/2010 09:29 AM, Joan Goma wrote:
 There are three phenomena acting simultaneously against the number of visits
 to small projects: the bilingual effect, the size effect, and the Google
 effect. For the Catalan case we estimate a penalization factor of 8.3 (that
 is, visits are 8.3 times lower than they should be).


In the long term, it seems like we could compensate for all of these 
effects in software.

I'm imagining a user experience where we make it easy for multilingual 
users to switch back and forth. That would include passive detection of 
multilingual users, hinting when good content is available in other 
languages, and making it easy for multilingual users to help translate 
content. It might also be worth looking at URL schemes that are not 100% 
language-specific, to focus the Google effect more usefully.

That would require a lot of technical work, and would raise a number of 
non-technical issues, but I don't see any insurmountable barriers to a 
more fluid experience for multilingual users.

William



Re: [Foundation-l] Where do our readers come from? QA

2010-01-18 Thread Joan Goma
The details of how to measure it are relatively complex. We can make an
estimate from data collected from the sources available for Catalan. My mail
was just meant to explain the phenomena.

The figures result from:
a) Surveys. The last one was answered by 400 Catalan Wikipedia readers. We
use the answers to the question about other language versions frequently
used. [1]
b) The most viewed pages in Spanish, French and English that do not yet
exist in Catalan. [2]
c) The percentage of visitors to web pages exclusively in Catalan who use a
web browser configured in other languages. [3]
d) Our own experiments with common Google searches, with the browser
configured in Catalan, French, Spanish, and English, plus some final cooking.
The result is very approximate, but it gives us an idea of what is happening.

The bilingual factor is not negative. It apparently reduces hits to Catalan
pages, but what it really does is increase hits to non-Catalan pages.

The factor due to nonexistent or poorly developed articles has to be
improved by growing the project.

The most frustrating one is the Google factor. You can Google “Integral”
even with a browser configured for Catalan and you will get the English
version first, then the Spanish one (which is a translation of an old
Catalan version), both on the first page, but you will not find the Catalan
one, which is the largest of all, before page 10. This article is a very
special case due to specific factors.

A technical solution would be great, and perhaps it is not very difficult.
We could guess languages from the IP address and highlight interwiki links
to those languages.
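
As a rough illustration of that idea, here is a minimal sketch. It assumes the reader's country has already been derived from the IP address by some geo-IP lookup (not shown); the table `LIKELY_LANGUAGES` and the function `links_to_highlight` are hypothetical names with invented sample entries, not part of any existing software.

```python
# Hypothetical table: ISO 3166-1 country code -> languages (ISO 639-1 codes)
# likely to be understood there. The entries are illustrative assumptions.
LIKELY_LANGUAGES = {
    "ES": ["es", "ca", "eu", "gl"],
    "AD": ["ca", "es", "fr"],
    "KE": ["sw", "en"],
}


def links_to_highlight(country_code, interwiki_langs, current_lang):
    """Return the interwiki language codes worth highlighting for a reader."""
    likely = LIKELY_LANGUAGES.get(country_code.upper(), [])
    return [lang for lang in likely
            if lang in interwiki_langs and lang != current_lang]


# A reader located in Spain viewing an English article that has
# interwiki links to es, ca and fr.
print(links_to_highlight("es", {"en", "es", "ca", "fr"}, "en"))
```

A country-based guess is coarse; it could only ever produce a hint for the reader, not a redirect.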



[1]
http://ca.wikipedia.org/wiki/Viquip%C3%A8dia:Segon_sondeig_dels_usuaris/4._Utilitzeu_amb_freq%C3%BC%C3%A8ncia_alguna_altra_edici%C3%B3_de_la_Viquip%C3%A8dia%3F

[2]
http://ca.wikipedia.org/wiki/Usuari:Meldor/Top_visites_2009#Mes_visitats_a_can_.28castell.C3.A0.29_que_no_tenen_link_al_catal.C3.A0

[3] http://www.eines.cat/?p=804







Re: [Foundation-l] Where do our readers come from? QA

2010-01-17 Thread Nikola Smolenski
On Saturday 16 January 2010 12:25:58 Nikola Smolenski wrote:
 On Saturday 16 January 2010 10:40:06 Mark Williamson wrote:
  It is not surprising to me that the English Wikipedia is so popular
  compared to any other in Kenya, but it is quite a bit more surprising
  that Korean, Romanian, Bulgarian, Lithuanian, Iranian, etc. users prefer
  the English Wikipedia.

 Next thing to do: Wikipedia Page Views By Country - Breakdown Adjusted by
 Wikipedia Size. Erik, are you planning to do this one as well? :D

Did it: 
http://smolenski.rs/blog/2010/01/wikipedia-page-views-by-country-breakdown-with-wikipedia-size-and-quality/



Re: [Foundation-l] Where do our readers come from? QA

2010-01-16 Thread Ziko van Dijk
Dear Erik,

Maybe there is a dirty Polish word looked up by many Polish pupils,
and when they Google it they end up at eu.WP because a Basque word
happens to look the same? :-)

I am now looking at the interest in the native versus the English
Wikipedia in specific countries. It might matter how localized
software in general is. If you live in, say, Kenya, and your
computer has Windows in English, with Internet Explorer and everything
else oriented to English, and you google your home town in an
English-language Google, you will probably get the Wikipedia
article in English and not in Swahili.

Kind regards
Ziko



Re: [Foundation-l] Where do our readers come from? QA

2010-01-16 Thread Mark Williamson
Sociolinguistic situations around the world are very complex, I think. In
former European colonies especially, of which Kenya is but one example, the
language of the former colonial power often has a unique position in
society.

It is not surprising to me that the English Wikipedia is so popular compared
to any other in Kenya, but it is quite a bit more surprising that Korean,
Romanian, Bulgarian, Lithuanian, Iranian, etc. users prefer the English
Wikipedia.

Mark


Re: [Foundation-l] Where do our readers come from? QA

2010-01-16 Thread Nikola Smolenski
On Saturday 16 January 2010 10:40:06 Mark Williamson wrote:
 It is not surprising to me that the English Wikipedia is so popular
 compared to any other in Kenya, but it is quite a bit more surprising that
 Korean, Romanian, Bulgarian, Lithuanian, Iranian, etc. users prefer the
 English Wikipedia.

I don't think that they prefer it as such; it's just that it covers many more
topics, and generally covers them in much more depth.

I believe that I am fairly fluent in English, and yet I prefer to read Serbian 
Wikipedia, if I know that the topic is covered there and the article is 
better than the English one.

Next thing to do: Wikipedia Page Views By Country - Breakdown Adjusted by 
Wikipedia Size. Erik, are you planning to do this one as well? :D



Re: [Foundation-l] Where do our readers come from? QA

2010-01-16 Thread Milos Rancic
On Fri, Jan 15, 2010 at 11:39 PM, Erik Zachte erikzac...@infodisiac.com wrote:
 Q: Nikola Smolenski / Milos Rancic
 At Wikipedia Page Views By Country - Breakdown [1] and Wikipedia Page Views
 By Country - Trends [2] could you include more languages (ideally all
 languages)?
 Some of the numbers go below 0.1% of the population, but some are not
 mentioned even though they are larger than 0.5% of the population.

 [1] http://tinyurl.com/yhp3an7
 [2] http://tinyurl.com/yzga2hm

 A:
 Yes, on some reports I do include smaller percentages for the largest
 Wikipedias, as those represent significant numbers of page views.
 I used different (and arbitrary) thresholds per report. The arbitrariness
 could change, but I want to plead for a notoriety threshold:

 Here is a much more extended version of the breakdown report [1] (for this
 discussion only).
 It shows up to 50 Wikipedias per country.
 An extra column shows the total number of records for this country/language
 pair (for the 6-month period) on which the percentage is based.
 As you can see, for the smallest countries that number is so low that it is
 no longer significant.

 Let us say we cut off not at 1%, but at an (arbitrary) absolute threshold of
 x logged records per country/language pair (per row).
 Let us say we cut off at an average of 5 records per month. Everything below
 that threshold in the test report is in dark red.
 Personally I think this is still way too much detail for a general report,
 not because of kilobytes but because of information overload.

 [1] http://tinyurl.com/yjwoyre
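
The absolute cut-off described in the quoted answer can be sketched as follows. This is only an illustration under stated assumptions: the row layout `(country, language, records)` and the name `significant_rows` are hypothetical, and the sample counts are invented. The 6-month period and the average of 5 records per month come from the text, giving a threshold of 30 records per row.

```python
MONTHS = 6                  # length of the sampling period, from the message
MIN_RECORDS_PER_MONTH = 5   # the (arbitrary) cut-off proposed in the message


def significant_rows(rows):
    """Keep (country, language, records) rows at or above the threshold."""
    threshold = MONTHS * MIN_RECORDS_PER_MONTH
    return [row for row in rows if row[2] >= threshold]


# Invented counts, for illustration only.
sample = [("RS", "sr", 1200), ("RS", "en", 950), ("SR", "sr", 12)]
print(significant_rows(sample))
```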

Detailed statistics have two very important values:
* The first one is chapter-related. I want to know more details about
tendencies in Serbia, so that I can: (1) analyze what is going on and
what WM RS did; (2) make a media event based on the statistics.
* The other is of general sociolinguistic value. I can trace, up to
some extent, where speakers of a language live and what the
percentage of Internet adoption (actually, Wikipedia adoption) is; all of
that in comparison with, let's say, GDP, number of inhabitants and so
on.

It would be great if you set up a periodic job that creates such
statistics at the end of every month. For example, I would really like
to know about the trends in the past 6 months.

I noticed in your quarterly report that the share of the Serbian language
in Serbia is rising. This is very important because it shows one (or both)
of two things: the quality of the Serbian Wikipedia is rising, and/or
Internet adoption among those who don't know English well enough is rising.
If the number of visits to the English Wikipedia is stable, it is the
second; if the number of visits is lower than before, it is the first;
and so on.

Also, I would like to know whether it is seasonal: which numbers are about
tourists, and which are about the behavior of the general population.

So, while such statistics are truly an information overload for
a general report, they are very valuable for particular
reports.



Re: [Foundation-l] Where do our readers come from? QA

2010-01-16 Thread Nikola Smolenski
On Friday 15 January 2010 23:39:38 Erik Zachte wrote:
 R: Nikola Smolenski
 It is obvious why Slovene Wikipedia is highly visited in Sierra Leone, and
 Serbian in Suriname; URLs do matter :)
 Although, I don't understand why so much. I would expect this distribution
 by visitors, perhaps, but not by visits.

 A:
 Very interesting observation! So people from Sierra Leone try
 'sl.wikipedia.org'.
 Why people from Surinam go to 'sr.wikipedia.org' is only slightly less
 obvious to me, but apparently it happens.

The ISO 3166-1 code for Surinam is 'sr'.



Re: [Foundation-l] Where do our readers come from? QA

2010-01-16 Thread Ronald Beelaard
I have read all kinds of confusion about odd correlations between language
versions and the countries that visitors come from.

As I (privately) communicated to Erik, the current analysis has the
following flaws:

* The country code AU is often used (by APNIC in this case) as a placeholder
for ranges that are pre-reserved, for instance to allocate parts of that
very big range in bits and pieces to countries in the area (e.g. JP).
* Similarly, RIPE does that with the country code EU (not to be confused
with the language code eu).

Other misinterpretations may occur because there are conflicts between
country and language codes. An example is SL (Sierra Leone) and sl
(Slovenian), and I guess UA (Ukraine) and uk (Ukrainian?) is a
similar case. There are certainly more.
See also: http://meta.wikimedia.org/wiki/Language_codes/Conflicts, although
in my opinion this list is not comprehensive.

Another cause of problems might be that the assignments of IP
ranges change continuously. That happens on a small scale (e.g. re-assigning
a block of 65536 addresses or much smaller), but also on a larger scale. The
result is that you can't fully trust a so-called geo-IP database (like
MaxMind). I don't know how quickly such a database becomes outdated, but I
have noticed major shifts of ranges of more than 16 million addresses within
half a year (concerning the AU - JP confusion).
Structured lists do not exist, so the only way is to keep checking the
data in such a database against the Regional Internet Registries. That is a
complicated and also very time-consuming process.

So don't draw conclusions in the case of small countries and/or languages.
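
The two caveats above could be turned into a simple sanity check on each country/language pair before it is reported. This is a minimal sketch, not part of the actual analysis: the function name `caveats` is hypothetical, and the placeholder set contains only the two codes named in this message.

```python
# Country codes named in this message as registry placeholders.
PLACEHOLDER_COUNTRIES = {"AU", "EU"}


def caveats(country_code, language_code):
    """Return reasons to distrust a country/language pair, or an empty list."""
    reasons = []
    if country_code.upper() in PLACEHOLDER_COUNTRIES:
        reasons.append("country code may be a registry placeholder")
    if country_code.lower() == language_code.lower():
        reasons.append("country and language codes collide")
    return reasons


print(caveats("SL", "sl"))  # Sierra Leone vs. Slovenian
print(caveats("EU", "eu"))  # RIPE placeholder vs. Basque
print(caveats("RS", "sr"))  # Serbia vs. Serbian: nothing flagged
```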

Rgds Ronald





Re: [Foundation-l] Where do our readers come from? QA

2010-01-16 Thread Nikola Smolenski
On Friday 15 January 2010 23:39:38 Erik Zachte wrote:
 Here is a much more extended version of the breakdown report [1] (for this
 discussion only).
 It shows up to 50 Wikipedias per country.
 An extra column shows the total number of records for this country/language
 pair (for the 6-month period) on which the percentage is based.

What exactly is this number of records? Thousands of visits?



Re: [Foundation-l] Where do our readers come from?

2010-01-15 Thread Milos Rancic
On Thu, Jan 14, 2010 at 5:27 AM, Erik Zachte erikzac...@infodisiac.com wrote:
 Today I released 4 new reports, which all focus on:

 Where do our readers come from?



  http://tinyurl.com/yhdej3j



 Cheers, Erik Zachte





Erik, could you put the full statistics somewhere? Some of the numbers
go below 0.1% of the population, but some are not mentioned
even though they are larger than 0.5% of the population.



Re: [Foundation-l] Where do our readers come from? QA

2010-01-15 Thread Petr Kadlec
2010/1/15 Erik Zachte erikzac...@infodisiac.com:
 Very interesting observation! So people from Sierra Leone try
 'sl.wikipedia.org'.
 Why people from Surinam go to 'sr.wikipedia.org' is only slightly less
 obvious to me, but apparently it happens.

Well, Suriname’s TLD is .sr, so it is quite obvious, isn’t it? The
same frequent mistake is also the reason there is a redirection
cz.wikipedia.org → cs.wikipedia.org (Czech language is “cs” according
to ISO 639-1, but Czech Republic’s TLD is “.cz” according to ISO
3166-1).
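
The redirection mentioned above amounts to mapping a mistaken country-code subdomain to the intended language subdomain. A minimal sketch of that idea follows; it is not how the actual redirect is implemented. The names `ALIASES` and `canonical_host` are hypothetical, and only the cz to cs pair is taken from this message.

```python
# Country-code prefixes (ISO 3166-1) mapped to the intended language
# subdomain (ISO 639-1). Only cz -> cs is confirmed by the message.
ALIASES = {"cz": "cs"}


def canonical_host(host):
    """Rewrite a mistaken country-code subdomain to the language subdomain."""
    prefix, sep, rest = host.partition(".")
    if not sep:
        return host  # no subdomain to rewrite
    return f"{ALIASES.get(prefix, prefix)}.{rest}"


print(canonical_host("cz.wikipedia.org"))
```

Such a table only works where the country code is not itself a valid language code; 'sr' and 'sl' are both, so those cases cannot be fixed by a redirect.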

-- [[cs:User:Mormegil | Petr Kadlec]]



Re: [Foundation-l] Where do our readers come from? QA

2010-01-15 Thread Mark Williamson
I notice in that list both Belarusian Wikipedias are listed just as
Belarusian Wikipedia. It would be very informative to know which is which
and to have visitor statistics on both :-)

skype: node.ue


On Fri, Jan 15, 2010 at 3:39 PM, Erik Zachte erikzac...@infodisiac.com wrote:

 Here is a QA on all issues raised:
 Q=question/R=Remark, A=answer

 I put the more general questions on top.

 Cheers, Erik Zachte

 --

 Q: Nikola Smolenski
 Is this the first time these reports have been published?

 A:
 Yes, expect the trend report to grow by accretion over time.
 Other reports will be built from data for the most recent 6 months only.

 --

 R: Andrew Gray
 Andrew explains why distribution of page requests over countries favors
 Spanish and Portuguese speaking countries:
 'Some Wikipedias - the ones which insist on only-free-images - do not use
 local uploads at all.'

 A:
 Thanks for explaining this unexpected distribution of page views on
 Commons,
 I had no idea.

 Spain        30.0%
 USA          29.2%
 Brazil        8.5%
 Argentina     4.8%
 Mexico        3.9%
 Germany       3.3%
 France        2.1%
 Venezuela     1.9%
 Chile         1.4%
 Costa Rica    1.4%
 Italy         1.4%
 Uruguay       1.2%
 Colombia      1.2%
 Portugal      1.1%

 --

 R: Mark Williamson

 Two main factors influencing choice of Wikipedia language:
 # Fluency of the Internet-using population of a country in English.
 # Quality of the native Wikipedia.

 A:
 Like you say. Many Scandinavians (and Dutch people, I might add) probably
 switch between English and local content all the time.
 Personally I tend to look at the English Wikipedia first in many instances,
 because of its obviously richer content and greater depth.

 --

 Q: Ziko van Dijk
 Why are 40% of the visitors of ksh.WP (the dialect of Cologne) from Japan?
 Why are 25% of the visitors of eu.WP (Basque) from Poland?

 Q: Andre Engels
 I think bots are a likely explanation in the eu case
 (unless Erik is using an algorithm that filters out bots)

 A:
 KSH used to be code for Kashmir. Still not Japan, but much closer than
 Cologne.
 Maybe Japanese mountaineers caused this spike ? (only half kidding)

 As for eu.wp: would Poles presume there is also a European Wikipedia? Just
 a guess.

 I do filter bots

 --

 R: Teun Spaans
 For trends, I would expect a bar indicating upward or downward trend, not a
 percentage bar.

 A:
 We can have both, a notion of importance and of change: I might color code
 cells as I do already in e.g. [1]
 This way large fluctuations really stand out. Let's first collect more
 history.

 [1] http://stats.wikimedia.org/EN/TablesPageViewsMonthly.htm


 --

 Q: Nikola Smolenski
 Could we get this for other projects?

 A:
 This question is of course not unexpected.
 One consideration is we need a certain sample size to make numbers
 significant.
 For other projects, with far less traffic, few country/language pairs would
 be backed by sufficient data.
 See also below on extending the current reports with more table rows.

 --

 Q: Nikola Smolenski:
 Please include at Wikipedia Page Views Per Country - Overview [1] number of
 Internet users from [2], and number of views per Internet user?

 [1] http://tinyurl.com/yk43aq6
 [2] http://tinyurl.com/yfv5bwn

 A:
 Done

 --

 R: Nikola Smolenski
 It is obvious why Slovene Wikipedia is highly visited in Sierra Leone, and
 Serbian in Suriname; URLs do matter :)
 Although, I don't understand why so much. I would expect this distribution
 by visitors, perhaps, but not by visits.

 A:
 Very interesting observation! So people from Sierra Leone try
 'sl.wikipedia.org'.
 Why people from Surinam go to 'sr.wikimedia.org' is only slightly less
 obvious to me, but apparently it happens.

 For countries with just a few hits in the sampled log the distinction
 between visitors and visits gets blurred.

 --

 R: Andre Engels
 Ukrainian is not a small language by any means, yet Wikipedia visitors tend
 to be drawn to the Russian Wikipedia instead.

 A: Yes but article growth in Ukrainian Wikipedia has been speeding up in
 recent months. [1]

 [1] http://stats.wikimedia.org/EN/TablesWikipediaUK.htm

 --

 R: Andre Engels
 The Q3-Q4 comparison for most countries shows a shift from English to the
 'vernacular'.

 A:
 Interesting analysis. Let's see if this is a consistent trend.
 However, the monthly page views per Wikipedia language, for which we have a
 two-year history, do not show a very significant shift from larger to smaller
 Wikipedias.
 See table 'Distribution of page views' at bottom of page of [1]: smaller
 languages gain in share of page views, but very slowly.

 [1] 

Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Nikola Smolenski
Erik Zachte wrote:
 Today I released 4 new reports, which all focus on: 
 
 Where do our readers come from?
 
  http://tinyurl.com/yhdej3j

Excellent and extremely useful! A big thank you! :)

A few questions:

Could we get this for other projects?

At Wikipedia Page Views Per Country - Overview, could you in future 
include the number of Internet users (e.g. from 
http://en.wikipedia.org/wiki/List_of_countries_by_number_of_Internet_users 
) and the number of views per Internet user? IMO, this is more useful than 
population and could identify countries where Wikipedia should be 
advertised.
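
As a rough sketch of the metric asked for here, assuming page views and Internet-user counts are available per country: the function names, country labels, and figures below are hypothetical placeholders, not data from the reports.

```python
def views_per_internet_user(page_views, internet_users):
    """Return page views per Internet user, or None if the user count is missing or zero."""
    if not internet_users:
        return None
    return page_views / internet_users

def rank_by_views_per_user(countries):
    """Sort country names from lowest to highest views per Internet user.

    Countries without a usable Internet-user count are skipped.
    """
    scored = {
        name: views_per_internet_user(data["views"], data["users"])
        for name, data in countries.items()
    }
    usable = {name: score for name, score in scored.items() if score is not None}
    return sorted(usable, key=usable.get)

# Hypothetical figures, for illustration only.
countries = {
    "Country A": {"views": 12_000_000, "users": 4_000_000},
    "Country B": {"views": 1_500_000, "users": 3_000_000},
    "Country C": {"views": 900_000, "users": 0},
}
ranked = rank_by_views_per_user(countries)
```

Countries at the start of `ranked` have many Internet users but few views each, which is the kind of place the paragraph above suggests advertising.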

At pages Wikipedia Page Views By Country - Breakdown and Wikipedia Page 
Views By Country - Trends, could you include more languages (ideally all 
languages)? Perhaps by making a separate page for every country? For 
example, I'd like to know data for all minority languages of Serbia.

It would also be interesting to somehow show this data together with 
size of the Wikipedia and number of language speakers per country but I 
don't see how exactly (and I don't know how to find the number of 
language speakers).

Perhaps I will do some of this manually, but just this time! :)

___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Mark Williamson
Ethnologue has numbers for all languages. Although its information is often
outdated or not 100% accurate, it is sufficient if you're doing a list with
many languages.




Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Nikola Smolenski
Erik Zachte wrote:
 Today I released 4 new reports, which all focus on: 
 
 Where do our readers come from?
 
  http://tinyurl.com/yhdej3j

Besides Australia-Japanese, there is also this:

Sierra Leone (0.0007% share of global total)
Russian Wp  44.9%
English Wp  43.7%
Portal  8.4%
Slovene Wp  1.1%
Other   1.9%

Why would Russian Wikipedia have so many visits from Sierra Leone?

As a sidenote, there is also this:

Suriname (0.003% share of global total)
English Wp    62.5%
Dutch Wp      28.2%
Portal        4.1%
Serbian Wp    1.5%
Afrikaans Wp  1.4%
Other         2.3%

It is obvious why Slovene Wikipedia is highly visited in Sierra Leone, 
and Serbian in Suriname; URLs do matter :)

(Although, I don't understand why so much. I would expect this 
distribution by visitors, perhaps, but not by visits.)



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Andre Engels
On Thu, Jan 14, 2010 at 5:27 AM, Erik Zachte erikzac...@infodisiac.com wrote:
 Today I released 4 new reports, which all focus on:

 Where do our readers come from?



  http://tinyurl.com/yhdej3j

Going through the countries, another remarkable result in my opinion
is the Ukraine - Ukrainian is not a small language by any means, yet
Wikipedia visitors tend to be drawn to the Russian Wikipedia instead.

Also, the Q3-Q4 comparison for most countries shows a shift from
English to the 'vernacular'. Do you have data on this from a longer
period of time? That is, is this part of an ongoing shift, or is it a
seasonal effect (perhaps having to do with Q3 containing the school
holidays in most countries)?

To quantify this, I have taken the 50 largest countries, excluding
countries where English is the main language (United States, United
Kingdom, Canada, Australia, India, Philippines, Singapore, Ireland,
New Zealand, South Africa). For all countries I have compared the
percentage going to the main language Wikipedia and those going to the
English Wikipedia (in the Ukrainian case: the Russian Wikipedia), and
also the 'swing' (in the way the term is used in UK politics, see
http://en.wikipedia.org/wiki/Swing_%28United_Kingdom%29) from English
to the local language (or in the reverse direction, if it is
negative). For countries such as Spain and Belgium which have more
than one local language, the similar data with all local languages are
also given.

Japan: Japanese 92.2% over English (swing -0.4%)
Germany: German 72.2% over English (swing 1.5%)
France: French 67.5% over English (swing 4.1%)
Poland: Polish 71.5% over English (swing 4.0%)
Italy: Italian 71.5% over English (swing 4.7%)
Mexico: Spanish 71.5% over English (swing 3.4%)
Brazil: Portuguese 67.7% over English (swing 1.1%)
Spain: Spanish 60.3% over English (swing 7.0%) - vernaculars 64.4%
over English (swing 8.6%)
Netherlands: Dutch 10.4% over English (swing 6.6%)
Russia: Russian 70.2% over English (swing 4.9%)
Sweden: Swedish 13.8% over English (swing 8.1%)
Switzerland: German 36.6% over English (swing 2.1%) - vernaculars
55.0% over English (swing 2.7%)
Austria: German 65.1% over English (swing -1.1%)
Finland: Finnish 24.7% over English (swing 2.2%) - vernaculars 26.8%
over English (swing 2.8%)
China: Chinese 4.8% over English (swing -7.3%)
Turkey: Turkish 48.7% over English (swing 11.7%)
Belgium: Dutch 9.5% over English (swing 9.2%) - vernaculars 40.1% over
English (swing 9.6%)
Argentina: Spanish 66.2% over English (swing 1.2%)
Norway: Norwegian (Bokmal) 0.9% UNDER English (swing 14.4%) -
vernaculars 0.1% over English (swing 14.5%)
Colombia: Spanish 56.3% over English (swing -3.8%)
Czech Republic: Czech 44.3% over English (swing 10.2%)
Hong Kong: Chinese equal to English (swing 1.0%) - vernaculars 1.4%
over English (swing 1.2%)
Taiwan: Chinese 45.5% over English (swing 3.7%) - vernaculars 45.7%
over English (swing 3.7%)
Chile: Spanish 60.6% over English (swing -2.0%)
Israel: Hebrew 10.9% over English (swing 3.9%) - vernaculars 12.8%
over English (swing 3.9%)
Indonesia: Indonesian 10.2% over English (swing 8.5%) - vernaculars
11.3% over English (swing 8.4%)
Portugal: Portuguese 11.9% over English (swing 2.2%)
South Korea: Korean 2.7% over English (swing 12.8%)
Malaysia: Malay 74.5% UNDER English (swing -1.0%)
Peru: Spanish 74.5% over English (swing 2.1%)
Venezuela: Spanish 77.5% over English (swing 11.1%)
Ukraine: Ukrainian 56.6% UNDER RUSSIAN (swing 4.4%)
Romania: Romanian 21.7% UNDER English (swing 12.6%) - vernaculars
18.5% UNDER English (swing 13.4%)
Thailand: Thai 18.9% over English (swing -3.5%)
Denmark: Danish 12.3% UNDER English (swing 10.7%)
Hungary: Hungarian 23.8% over English (swing 6.1%)
Uruguay: Spanish 72.4% over English (swing 1.1%)
Vietnam: Vietnamese 31.0% over English (swing 8.8%)
Greece: Greek 42.1% UNDER English (swing 9.0%)
Bulgaria: Bulgarian 1.4% over English (swing 8.9%)
United Arab Emirates: Arabic 66.8% UNDER English (swing 5.4%)
Egypt: Arabic 18.5% UNDER English (swing 11.3%)
Lithuania: Lithuanian 9.3% UNDER English (swing -6.4%) - vernaculars
9.3% under English (swing -6.6%)
Iran: Persian 0.6% UNDER English (swing 0.5%)
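
The two figures in each line above can be sketched as follows. This assumes that "X% over English" means the difference in percentage points between the local-language share and the English share, and that swing follows the usual two-party convention (the average of one side's gain and the other side's loss). The exact calculation used for the list is not spelled out in the message, so treat the function names and the example shares as hypothetical.

```python
def lead_over_english(local_share, english_share):
    """Percentage-point lead of the local-language Wikipedia over English.

    A negative result corresponds to "UNDER English" in the list.
    """
    return local_share - english_share

def swing_to_local(local_q3, english_q3, local_q4, english_q4):
    """Two-party swing from English to the local language between Q3 and Q4.

    Average of the local language's gain and English's loss, in percentage
    points. Positive means a shift towards the local language.
    """
    local_gain = local_q4 - local_q3
    english_loss = english_q3 - english_q4
    return (local_gain + english_loss) / 2

# Hypothetical shares (percent of a country's page views), for illustration only.
q4_lead = lead_over_english(54, 42)       # local leads by 12 points in Q4
swing = swing_to_local(50, 44, 54, 42)    # local +4, English -2
```

Note that under this definition the swing is half the change in the lead between the two quarters.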

-- 
André Engels, andreeng...@gmail.com



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Marco Chiesa
On Thu, Jan 14, 2010 at 10:40 AM, Andre Engels andreeng...@gmail.com wrote:

 To quantify this, I have taken the 50 largest countries, excluding
 languages where English is the main language (United States, United
 Kingdom, Canada, Australia, India, Philippines, Singapore, Ireland,
 New Zealand, South Africa). For all countries I have compared the
 percentage going to the main language Wikipedia and those going to the
 English Wikipedia (in the Ukrainian case: the Russian Wikipedia), and
 also the 'swing' (in the way the term is used in UK politics, see
 http://en.wikipedia.org/wiki/Swing_%28United_Kingdom%29) from English
 to the local language (or in the reverse direction, if it is
 negative). For countries such as Spain and Belgium which have more
 than one local language, the similar data with all local languages are
 also given.


I guess there are also a lot of cases similar to the
Australia/Japanese one of IPs wrongly attributed to one country. For
example, I remember that at least a few years ago (I'm not sure now) a
lot of Italian customers of Tele2 had an IP that was Swedish. Maybe
this is not a big effect given that the Sweden/Swedish relationship
does not differ that much from the other Scandinavian countries.
Cruccone



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Waerth
Hmm, what saddens me is that such a low percentage use the Thai 
Wikipedia in Thailand instead of the English one.

Having lived in Thailand for over 10 years now, my estimate is that 
only 10% of the populace speak English well enough to be able to 
read English Wikipedia articles at least partially. And this is the part 
of the population with the best education. This would mean that 
unfortunately Wikipedia doesn't reach the part of the population it is 
meant for: the part that has less access to education.

Waerth/Walter




Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Mark Williamson
I think there are two main factors influencing this:

# Fluency of the Internet-using population of a country in English. In a
country like Japan, basic English is widespread but real reading
comprehension on the level necessary for reading WP articles is not (as far
as I know at least). Scandinavians, on the other hand, fall at the other end
of the spectrum - according to Wikipedia, 89% of Swedes have a working
knowledge of English.

# Quality of the native Wikipedia - if I can speak some English, would it be
worth it to me to look for articles in English instead of my native language
due to greater quality or completeness of the English Wikipedia? If I'm
German, I have much less motivation to read articles in English than if my
native language is Burmese. Of course, this is in purely relative terms -
people in Arab countries preferring English to Arabic for Wikipedia does not
mean that the Arabic Wikipedia is of poor quality, it just means that users
feel that the English Wikipedia is a more reliable or complete resource in
some way.

Mark


Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Ziko van Dijk
Hello,
Thank you for the numbers, Erik!
I wonder why 40 % of the visitors of ksh.WP (the dialect of Cologne) are
from Japan. And why 25 % of the visitors of eu.WP (Basque) are from Poland?
Kind regards
Ziko





Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Nikola Smolenski
Ziko van Dijk wrote:
 Thank you for the numbers, Erik!
 I wonder why 40 % of the visitors of ksh.WP (the dialect of Cologne) are
 from Japan. And why 25 % of the visitors of eu.WP (Basque) are from Poland?

Bots?



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Andre Engels
On Thu, Jan 14, 2010 at 2:46 PM, Nikola Smolenski smole...@eunet.rs wrote:
 Ziko van Dijk wrote:
 Thank you for the numbers, Erik!
 I wonder why 40 % of the visitors of ksh.WP (the dialect of Cologne) are
 from Japan. And why 25 % of the visitors of eu.WP (Basque) are from Poland?

 Bots?

I think that's a likely explanation in the eu case (unless Erik is
using an algorithm that filters out bots) - I see Poles come up high
in more unexpected small languages (Telugu, Welsh, Alemannic, Frisian,
Cebuano, Norman, Crimean Tatar) - although Basque seems to be the
biggest of the lot.

-- 
André Engels, andreeng...@gmail.com



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Nikola Smolenski
Erik Zachte wrote:
 Today I released 4 new reports, which all focus on: 
 
 Where do our readers come from?

And (sorry) one more question: is this the first time that such reports are 
being released?



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Nikola Smolenski
Andre Engels wrote:
 Going through the countries, another remarkable result in my opinion
 is the Ukraine - Ukrainian is not a small language by any means, yet
 Wikipedia visitors tend to be drawn to the Russian Wikipedia instead.
 
 Also, the Q3-Q4 comparison for most countries shows a shift from
 English to the 'vernacular'. Do you have data on this from a longer
 period of time? That is, is this part of an ongoing shift, or is it a
 seasonal effect (perhaps having to do with Q3 containing the school
 holidays in most countries?

In Page Views Per Wikipedia Language - Breakdown I also notice something 
that should affect chapter relations: there are some Wikipedias which 
are read from foreign countries more than from the country of origin 
(probably b/c readers from the diaspora are richer and have better Internet 
access).

For example, Macedonian Wikipedia is read more from Slovenia or Germany 
than from Macedonia:

Macedonian (mk) (0.02% share of global total)
Slovenia    30.6%
Germany     23.7%
Macedonia   23.3%

It would therefore make sense for WMDE to try to reach Macedonians 
living in Germany, and for future WMMK to help them in doing so.



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Nikola Smolenski
Nikola Smolenski wrote:
 In Page Views Per Wikipedia Language - Breakdown I also notice something 
 that should affect chapter relations: there are some Wikipedias which 

Also, any ideas why Commons is so popular in Spain and Latin America?

Commons (commons) (0.010% share of global total)
Spain   30.0%
United States   29.2%
Brazil  8.5%
Argentina   4.8%
Mexico  3.9%



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Andrew Gray
2010/1/14 Nikola Smolenski smole...@eunet.rs:
 Nikola Smolenski wrote:
 In Page Views Per Wikipedia Language - Breakdown I also notice something
 that should affect chapter relations: there are some Wikipedias which

 Also, any ideas why is Commons so popular in Spain and Latin America?

Some Wikipedias - the ones which insist on only-free-images - do not
use local uploads at all, and instead direct everyone to Commons. Both
es.wikipedia and pt.wikipedia work this way, so they'll send a lot
more of their users to Commons than a project which uses local image
uploads.

As a result, I suspect you'll find that traffic to Commons increases
in proportion to Spanish/Portuguese Wikipedia traffic.

-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Marcus Buck
Nikola Smolenski wrote:
 In Page Views Per Wikipedia Language - Breakdown I also notice something 
 that should affect chapter relations: there are some Wikipedias which 
 are read from foreign countries more than from the country of origin 
 (probably b/c readers from diaspora is richer and has better Internet 
 access).

 For example, Macedonian Wikipedia is read more from Slovenia or Germany 
 than from Macedonia:

 Macedonian (mk) (0.02% share of global total)
 Slovenia  30.6%
 Germany   23.7%
 Macedonia 23.3%

 It would therefore make sense for WMDE to try to reach Macedonians 
 living in Germany, and for future WMMK to help them in doing so.

It would make sense. But at the moment WMDE is not even actively doing 
anything for the _native_ languages of Germany except for German. I 
think that would be the first step to take.

Marcus Buck



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Marco Chiesa
On Thu, Jan 14, 2010 at 3:51 PM, Marcus Buck m...@marcusbuck.org wrote:

 It would make sense. But at the moment  WMDE is not even actively doing
 anything for the _native_ languages of Germany except for German. I
 think that would be the first step to do.


I had a quick look at the native languages of Italy, and I found out
that the percentage of visits from Italy is much smaller for the
regional languages:
Italian: 90.4%
Neapolitan: 45.8%
Tarantino: 43.2%
Emiliano-Romagnolo: 34.5%
Venetian: 33.9%
Lombard: 29.5%
Sicilian: 27.6%
Sardinian: 26.4%
Piedmontese: 24.8%
Friulian: 17.8%
Ligurian: 17.6%

I see a couple of reasons for this difference:
1) Bot visits count proportionally much more in smaller wikis
2) We know that, at least in some of these projects, a lot of
contributors are migrants (even 2nd or 3rd generation) that try to
maintain the regional languages their parents/grandparents used (Italy
had a lot of emigration in the 20th century), so it shouldn't be hard
to imagine that the same happens for the readers. This also partly
explains why Wikimedia Italia has little penetration within these
projects.

It would be interesting to see if the same happens for other
countries, for example Germany
Cruccone



Re: [Foundation-l] Where do our readers come from?

2010-01-14 Thread Nikola Smolenski
On Thursday 14 January 2010 09:24:16 Nikola Smolenski wrote:
 At Wikipedia Page Views Per Country - Overview, could you in future
 include number of Internet users (f.e. from
 http://en.wikipedia.org/wiki/List_of_countries_by_number_of_Internet_users
 ) and number of views per Internet user? IMO, this is more useful than
 population and could identify countries where Wikipedia should be
 advertised.

Did it: 
http://smolenski.rs/blog/2010/01/wikipedia-page-views-per-country-with-internet-users/



[Foundation-l] Where do our readers come from?

2010-01-13 Thread Erik Zachte
Today I released 4 new reports, which all focus on: 

Where do our readers come from?

 

 http://tinyurl.com/yhdej3j

 

Cheers, Erik Zachte

 



Re: [Foundation-l] Where do our readers come from?

2010-01-13 Thread teun spaans
Hi Erik,

thank you. Very nice.
One suggestion: for trends, I would expect a bar indicating upward or
downward trend, not a percentage bar.

live long and prosper
teun
