Re: [Foundation-l] Statistics and chapters: searching for chapters

2010-01-15 Thread Joan Goma
Just a short remark: the parameter that statistically best explains
Wikipedia activity is not the number of Internet connections but GDP (except
for the English and Chinese projects, which exhibit singular behavior). Perhaps
you could retry the analysis using GDP and find some more countries where
chapters are achievable.

Sorry for not providing references. This comes from not-yet-published
research that applies reasonable hypotheses to transfer more than 20
parameters from country data into language data and then applies statistical
methods to search for correlations between those data and the size of Wikipedia
projects.

>
> == Searching for chapters ==
>
> This analysis is about where to search for new Wikimedia chapters. It
> may be useful to the ChapCom and Board, but the other intention is to
> encourage Wikimedians from those countries to try to form their
> chapters, because it is achievable.
>
>
___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


[Foundation-l] How are media from content partnerships used?

2010-01-15 Thread Erik Moeller
This is not news for people who've been watching closely, but I
thought it deserved a "re-post" to give it some additional visibility.

In the last year, the Wikimedia movement has developed some very
important content partnerships with cultural institutions such as
museums and archives to bring valuable pictures, videos, and other
media online. Some but not all of them are categorized here:

http://commons.wikimedia.org/wiki/Category:Commons_partnerships

What's the impact of these partnerships? How are these media used? We
didn't have good answers to these questions until very recently.
Thanks to the work of Bryan Tong Minh, Magnus Manske, and other
engineers, we now have some first good data:

1) The GlobalUsage extension is now re-deployed on Wikimedia Commons,
which makes it easy to see where any individual file is used in the
Wikimedia universe;

2) The Glamorous script by Magnus Manske gives you that overview for
an entire category on Commons.

For example, you can go to http://toolserver.org/~magnus/glamorous.php
and select the "Images from the German Federal Archive" category. This
will show you that out of the 82,457 images uploaded so far, more than
15,000 are currently used in articles. 34 languages use at least 100
images, 11 use at least 1,000.  This demonstrates the powerful dynamic
of global re-use that uploading media to Wikimedia Commons can result
in.

We'll be able to show even more compelling data if we now add the
(known) pageview data for the relevant articles. Hopefully this
emerging data will contribute to a virtuous circle of new content
partnerships.  I'll pull together some facts for a blog update on
what's happening in the space, but wanted to give a general quick
update first. :-)
-- 
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate



foundation-l@lists.wikimedia.org

2010-01-15 Thread Mark Williamson
I notice in that list both Belarusian Wikipedias are listed just as
"Belarusian Wikipedia". It would be very informative to know which is which
and to have visitor statistics on both :-)

skype: node.ue


On Fri, Jan 15, 2010 at 3:39 PM, Erik Zachte wrote:

> Here is a Q&A on all issues raised:
> [...]

foundation-l@lists.wikimedia.org

2010-01-15 Thread Petr Kadlec
2010/1/15 Erik Zachte :
> Very interesting observation! So people from Sierra Leone try
> 'sl.wikipedia.org'.
> Why people from Suriname go to 'sr.wikipedia.org' is only slightly less
> obvious to me, but apparently it happens

Well, Suriname’s TLD is .sr, so it is quite obvious, isn’t it? The
same frequent mistake is also the reason there is a redirection
cz.wikipedia.org → cs.wikipedia.org (Czech language is “cs” according
to ISO 639-1, but Czech Republic’s TLD is “.cz” according to ISO
3166-1).
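
To illustrate the mix-up, here is a minimal Python sketch. The table and
function names are hypothetical and only model the idea; this is not how
the actual Wikimedia redirect is configured.

```python
# Hypothetical table: country-code TLDs that people type instead of the
# ISO 639-1 language code used as the Wikipedia subdomain.
CCTLD_TO_LANGUAGE = {
    "cz": "cs",  # Czech Republic's TLD -> Czech language code
}

# Subdomains that are valid language codes but are also the TLD of an
# unrelated country, so a redirect cannot fix the confusion.
AMBIGUOUS = {
    "sl": ("Slovene", "Sierra Leone"),
    "sr": ("Serbian", "Suriname"),
}


def resolve_subdomain(subdomain):
    """Return the language code a typed subdomain should map to."""
    code = subdomain.lower()
    return CCTLD_TO_LANGUAGE.get(code, code)


def wikipedia_host(subdomain):
    """Build the host name after applying the (assumed) redirect table."""
    return f"{resolve_subdomain(subdomain)}.wikipedia.org"


print(wikipedia_host("cz"))
print(wikipedia_host("sr"), AMBIGUOUS["sr"])
```

A redirect helps for "cz" because no Wikipedia uses that code; for "sl"
and "sr" the typed address already belongs to another language edition.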

-- [[cs:User:Mormegil | Petr Kadlec]]



foundation-l@lists.wikimedia.org

2010-01-15 Thread Erik Zachte
Here is a Q&A on all issues raised:
Q=question/R=Remark, A=answer

I put the more general questions on top. 

Cheers, Erik Zachte

--

Q: Nikola Smolenski
Is this the first time these reports have been published?

A: 
Yes, expect the trend report to grow by accretion over time.
Other reports will be built from data for the most recent 6 months only.

--

R: Andrew Gray
Andrew explains why distribution of page requests over countries favors
Spanish and Portuguese speaking countries:
'Some Wikipedias - the ones which insist on only-free-images - do not use
local uploads at all.'

A: 
Thanks for explaining this unexpected distribution of page views on Commons,
I had no idea.

Spain        30.0%
USA          29.2%
Brazil        8.5%
Argentina     4.8%
Mexico        3.9%
Germany       3.3%
France        2.1%
Venezuela     1.9%
Chile         1.4%
Costa Rica    1.4%
Italy         1.4%
Uruguay       1.2%
Colombia      1.2%
Portugal      1.1%
 
--

R: Mark Williamson

Two main factors influencing choice of Wikipedia language:
# Fluency of the Internet-using population of a country in English. 
# Quality of the native Wikipedia.

A: 
Like you say. Many Scandinavians (and Dutch people I might add) probably
switch between English and local content all the time.
Personally, I tend to look at the English Wikipedia first in many instances,
because of its obviously richer content and greater depth.

--

Q: Ziko van Dijk
Why are 40% of the visitors of ksh.WP (the dialect of Cologne) from Japan?
Why are 25% of the visitors of eu.WP (Basque) from Poland?

Q: Andre Engels
I think bots are a likely explanation in the eu case 
(unless Erik is using an algorithm that filters out bots)

A: 
KSH used to be the code for Kashmir. Still not Japan, but much closer than
Cologne.
Maybe Japanese mountaineers caused this spike? (only half kidding)

As for eu.wp: would Poles presume that there is also a European Wikipedia?
Just a guess.

I do filter bots.

--

R: Teun Spaans
For trends, I would expect a bar indicating upward or downward trend, not a
percentage bar.

A: 
We can have both, a notion of importance and of change: I might color code
cells as I do already in e.g. [1] 
This way large fluctuations really stand out. Let's first collect more
history. 

[1] http://stats.wikimedia.org/EN/TablesPageViewsMonthly.htm


--

Q: Nikola Smolenski
Could we get this for other projects?

A:
This question is of course not unexpected. 
One consideration is we need a certain sample size to make numbers
significant.
For other projects, with far less traffic, few country/language pairs would
be backed by sufficient data. 
See also below on extending the current reports with more table rows. 

--

Q: Nikola Smolenski:
Could you include in Wikipedia Page Views Per Country - Overview [1] the number
of Internet users from [2], and the number of views per Internet user?

[1] http://tinyurl.com/yk43aq6
[2] http://tinyurl.com/yfv5bwn

A:
Done 

--

R: Nikola Smolenski
It is obvious why Slovene Wikipedia is highly visited in Sierra Leone, and
Serbian in Suriname; URLs do matter :)
Although, I don't understand why so much. I would expect this distribution
by visitors, perhaps, but not by visits.

A:
Very interesting observation! So people from Sierra Leone try
'sl.wikipedia.org'. 
Why people from Suriname go to 'sr.wikipedia.org' is only slightly less
obvious to me, but apparently it happens.

For countries with just a few hits in the sampled log the distinction
between visitors and visits gets blurred.

--

R: Andre Engels
Ukrainian is not a small language by any means, yet Wikipedia visitors tend
to be drawn to the Russian Wikipedia instead.

A: Yes, but article growth in the Ukrainian Wikipedia has been speeding up in
recent months. [1]

[1] http://stats.wikimedia.org/EN/TablesWikipediaUK.htm

--

R: Andre Engels
The Q3-Q4 comparison for most countries shows a shift from English to the
'vernacular'.

A: 
Interesting analysis. Let's see if this is a consistent trend.
However, the monthly page views per Wikipedia language, for which we have a
2-year history, do not show a very significant shift from large to smaller
Wikipedias.
See table 'Distribution of page views' at the bottom of page [1]: smaller
languages gain in share of page views, but very slowly.
 
[1] http://stats.wikimedia.org/EN/TablesPageViewsMonthly.htm

--

Q: Nikola Smolenski / Milos Rancic
At Wikipedia Page Views By Country - Breakdown [1] and Wikipedia Page Views
By Country - Trends [2] could you include more languages (ideally all
languages)?
Some of the numbers are going below 0.1%

Re: [Foundation-l] Statistics and chapters (The China discussion)

2010-01-15 Thread Philippe Beaudette

On Jan 15, 2010, at 4:29 AM, Nikola Smolenski wrote:

> but a chapter in China, [...] is something the Foundation might well  
> spend money and dedicate
> people to actively build

There's a lot of discussion and data around China on the strategy wiki  
at http://strategy.wikimedia.org/wiki/China , 
http://strategy.wikimedia.org/wiki/Task_force/China 
  and http://strategy.wikimedia.org/wiki/Category:China_Task_Force
(both data that would support this argument and data that might raise  
some red flags).

I'd encourage those who are interested in this topic to read it and  
engage there.

Philippe


Philippe Beaudette  
Facilitator, Strategy Project
Wikimedia Foundation

phili...@wikimedia.org

mobile: 918 200-WIKI (9454)

Imagine a world in which every human being can freely share in
the sum of all knowledge.  Help us make it a reality!

http://wikimediafoundation.org/wiki/Donate



Re: [Foundation-l] Statistics and chapters: searching for chapters

2010-01-15 Thread Nikola Smolenski
Milos Rancic wrote:
> Countries are listed from the most number of Internet users to the
> least number of Internet users (with some groupings). If you are

Shouldn't the idea be to try to support the creation of chapters in 
countries with the most users but the lowest activity? These would be, 
in order: China, Iran, Nigeria, Sudan, Syria, Uganda, Uzbekistan, 
Zimbabwe, Haiti, Zambia, Tajikistan, South Korea, Vietnam, Pakistan, 
Egypt etc. We don't need a chapter in Uruguay, where Wikipedia is already 
more popular than in Germany.

(Just to clarify, if some people from Uruguay decide to form a chapter, 
of course that should be accepted; but a chapter in China, Iran or 
Nigeria is something the Foundation might well spend money and dedicate 
people to actively build.)
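
One possible reading of "most users but lowest activity" is to rank
countries by page views per Internet user. The Python sketch below assumes
that reading; the function names, the optional `min_users` cutoff, and the
sample figures are all hypothetical placeholders, not numbers from the
reports.

```python
def views_per_user(page_views, internet_users):
    """Page views divided by Internet users (an assumed activity measure)."""
    return page_views / internet_users


def rank_by_low_activity(rows, min_users=0):
    """Sort (country, internet_users, page_views) rows, lowest activity first.

    Countries with min_users or fewer Internet users are skipped.
    """
    eligible = [row for row in rows if row[1] > min_users]
    return sorted(eligible, key=lambda row: views_per_user(row[2], row[1]))


# Placeholder rows: (country, Internet users, monthly page views).
sample = [
    ("Country X", 1_000_000, 5_000_000),
    ("Country Y", 10_000_000, 1_000_000),
    ("Country Z", 2_000_000, 2_000_000),
]

for country, users, views in rank_by_low_activity(sample):
    print(country, round(views_per_user(views, users), 2))
```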



[Foundation-l] Statistics and chapters: searching for chapters

2010-01-15 Thread Milos Rancic
Based on Erik's statistics [1] and Nikola's addition of Internet users
[2] and the list of Wikimedia chapters [3], here is the first set of
conclusions.

== Searching for chapters ==

This analysis is about where to search for new Wikimedia chapters. It
may be useful to the ChapCom and Board, but the other intention is to
encourage Wikimedians from those countries to try to form their
chapters, because it is achievable.

=== Methodology ===
I used Nikola's table and:

* I removed all countries with fewer than 499,000 Internet users
(actually, the initial idea was to remove all countries with fewer than
500,000, but Nepal was very close to that number). The remaining countries
are those where we have the potential to create a chapter in the relatively
near future. The existing chapter in the country with the smallest number of
Internet users (and the smallest number of inhabitants) is Wikimedia
Macedonia, with 1,100,000 Internet users. I think that it is
reasonable to expect chapters in countries with a somewhat smaller
number of Internet users. This cutoff removed countries/territories in which
some initiatives already exist, such as Iceland and Macao; but my
analysis is not about where *not* to search for chapters, but where to
search for chapters. In other words: if some group is able to create a
chapter in a country with fewer Internet users, that would be good. Also,
it should be noted that this is just about the present situation.
Internet adoption is increasing and I expect that more countries will
pass my fictional line.
* I sorted them according to the number of Internet users.
* Then, I marked them according to the data on the Wikimedia chapters page
[3]: chapter exists, chapter is planned, chapter is in discussion, or
"there are more possible chapters". The last category contains the
USA (with one existing chapter), Canada (with two options: one or more
chapters), Spain (with one national and one regional chapter) and
India (the situation is not clear, at least to me). Those potential
chapters are excluded from my analysis. It should be noted that only
"existing chapters" is a reliable category. As a member of ChapCom, I
know that even some groups in the category "ideas for chapters"
have come further than some "planned chapters". Because of that, I will
list them as if they were the same.
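
The steps above can be sketched in Python roughly as follows. This is only
an illustration under assumptions: the function name, the status labels,
and the placeholder rows are hypothetical, and dropping countries with an
existing chapter is my reading of how the list below was produced. Only the
499,000 cutoff and the Macedonia figure come from the text.

```python
MIN_INTERNET_USERS = 499_000  # cutoff from the methodology above

# Assumed status labels; "none" stands for countries without any entry.
NOT_CANDIDATES = {"exists", "more possible chapters"}


def chapter_candidates(rows):
    """Filter, sort, and select (country, internet_users, status) rows.

    1. Drop countries below the Internet-user cutoff.
    2. Sort the rest by Internet users, largest first.
    3. Drop existing chapters and the "more possible chapters" group;
       planned, in discussion, and no-entry countries are treated alike.
    """
    kept = [row for row in rows if row[1] >= MIN_INTERNET_USERS]
    kept.sort(key=lambda row: row[1], reverse=True)
    return [row for row in kept if row[2] not in NOT_CANDIDATES]


# Placeholder rows, apart from Macedonia, which is taken from the text.
sample = [
    ("Country A", 300_000, "none"),
    ("Country B", 5_000_000, "in discussion"),
    ("Macedonia", 1_100_000, "exists"),
    ("Country C", 499_000, "none"),
    ("Country D", 9_000_000, "more possible chapters"),
    ("Country E", 700_000, "planned"),
]

for country, users, status in chapter_candidates(sample):
    print(country, users, status)
```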

=== We need chapters in ===

Countries are listed from the largest number of Internet users to the
smallest (with some groupings). If you are
interested in chapter creation in a particular country and you
see that the stage of chapter creation is "planned", "in discussion"
or "an idea for chapter exists", please go to the appropriate page or
to the Wikimedia chapters page [3] and see who is involved there. (Help
from other ChapCom members would be appreciated because it is possible
that I missed some initiative.)

* China [4]. AFAIK, Ting is working on Chinese issues.
* Japan [5]. I think that Japanese Wikimedians should reconsider their
position (from 2007, [3]) that there is no willingness to create
a chapter. Creative Commons has a strong organization there and I am
sure that they will help Wikimedians.
* South Korea is a planned chapter [6]. The project was active in the first
part of 2009, but not since then.
* Iran is in discussion [7]. The project was initiated in 2008, but
it is not active anymore. I think that it will wait until the internal
political situation in Iran is resolved.
* Colombia [8]. Was active at some point. Some activity still exists.
* Egypt [9]. The idea for Wikimedia Egypt is on solid ground.
* Romania [10]. I haven't heard anything about Romania for years.
* Pakistan [11], Malaysia [12], Saudi Arabia [13]: Listed as in
discussion, but virtually nothing is happening.
* Mexico: Some initiative exists, even though it is not listed. Relevant
Wikimedians are from Mexico, so they should think about organizing a
chapter.
* Turkey [14]. If I counted correctly, this is the newest idea for a chapter.
* Vietnam, Thailand, Peru: Nothing that I know of.
* Nigeria, Chile, Morocco: Initiatives exist, but they have not yet reached
the stage of doing anything on the Wikimedia chapters page.
* Belgium [15]. I haven't heard anything about Belgium for a long time.
* Greece. There were some initiatives, but nothing developed.
* Algeria [16]. Listed as "in discussion", but never really alive.
* Slovakia, Syria: Nothing that I know of.
* Singapore: I think that there was some initiative, but I can't find anything.
* New Zealand [17]. Was an idea in 2006, but dead since then.
* Belarus, Dominican Republic, Tunisia [18]: Initiatives exist to
various degrees. The Dominican Republic is the newest initiative, and probably
the most serious.
* United Arab Emirates, Bulgaria, Uganda, Uzbekistan, Kazakhstan:
Nothing that I know of.
* Croatia [19]. Planned, but delayed.
* Lithuania [20]: In discussion, but activity peaked in 2007.
* Jordan. An initiative existed at the beginning of 2009.
* Guatemala, Jamaica, Azerbaijan, Costa Rica, Cuba, Zimbabwe, Latvia:
Nothing that I know of.
* Ecuador, Bosnia

Re: [Foundation-l] Where do our readers come from?

2010-01-15 Thread Milos Rancic
On Thu, Jan 14, 2010 at 5:27 AM, Erik Zachte  wrote:
> Today I released 4 new reports, which all focus on:
>
> Where do our readers come from?
>
>
>
>   http://tinyurl.com/yhdej3j
>
>
>
> Cheers, Erik Zachte
>
>
>
>

Erik, could you put the full statistics somewhere? Some of the numbers
listed go below 0.1% of population, but others are not mentioned
even though they are larger than 0.5% of population.
