Re: [Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-25 Thread Kerry Raymond
Another factor in which language people choose to contribute in could be their 
motivation for adding the content and its presumed audience. A multilingual 
person might decide to write about (say) magnetism in English (or another 
widely spoken language) in the belief that magnetism is of worldwide interest, 
but might choose to write about a local folk story in a more local language in 
the belief that it is likely to be of interest only to local people.

Also given that there are different policies on different Wikipedias, it may be 
that a topic might not pass notability on English Wikipedia but be entirely 
acceptable on another Wikipedia.

Also, my observation of English Wikipedia is that regular contributors tend to 
divide into article-starters (a smaller group) and article-expanders (a much 
larger group). If there are cultural reasons (or Wikipedia policy reasons) why 
people fluent in one language are less likely to be article starters, this may 
limit the range of topics for the article-expanders to work on and hence the 
growth of the encyclopedia overall. There may also be cultural reasons why 
certain types of article are not started in some Wikipedias, e.g. popular 
culture articles (e.g. Pokemon characters) might not be seen as "encyclopedic" 
in some cultures.

As to the specific difference between Polish Wikipedia and South Korean 
Wikipedia, I would observe that South Korea is a nation obsessed with computer 
gaming both for personal leisure through to professional sport, and it is a 
very time-consuming passion.

https://en.wikipedia.org/wiki/Video_gaming_in_South_Korea

So maybe gaming takes away the time from those who might otherwise contribute 
to Wikipedia.

Kerry


___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-25 Thread Jonathan Cardy
Hi, the second most obvious factor is going to be the availability of internet 
access, but also the type of internet access, and how long people have had 
internet access.

The unproven assumption is that Wikipedia is written by people with internet 
experience and leisure time access to the internet via the desktop environment.

Over a decade ago when I was working in a marketing company, there was a rule 
of thumb that people only started shopping on the internet after two years of 
internet experience. I don’t know if that was ever scientifically tested, or 
what the equivalent would be for editing Wikipedia, but I’m pretty sure that 
editing Wikipedia is not an entry level experience on the internet.

We do know, both from experience of training people to edit Wikipedia and from 
looking at recent changes, that Wikipedia is almost a broadcast medium in the 
mobile environment. There are some people who edit on tablets and even 
smartphones, but most of the editing community edits via the desktop 
environment. Just to confuse things, desktop doesn't just include laptops in 
this context: there are even people using tablets but opting for the desktop 
environment rather than the mobile one.

So two languages with similar populations on the internet could have radically 
different Wikipedia sizes because in one culture access is fairly new and 
mostly smartphone based whilst in the other it is a longstanding thing with a 
large proportion of experienced Internet users with PC access.

The biggest difference, though, is going to be the policy of each Wikipedia 
community on bot creation of articles, with Cebuano, Swedish and Waray at one 
extreme. Such policies change over time: the English Wikipedia went through one 
of its early growth surges when a bot was used to start articles on all 
populated places in the USA, so it would be an oversimplification to simply 
list English as one of the Wikipedias that is currently chary about bot 
creation of articles. A very simplistic way to look at this is to order 
Wikipedias not by number of articles but by number of edits. On that basis 
Polish, with 53 million edits, would drop behind the rather smaller Japanese 
Wikipedia, which has 69 million edits. Cebuano, with 5.3 million articles but 
only 23 million edits, would drop a long way from second place.
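
The reordering described above can be sketched in a few lines of Python. This 
is only an illustration using the three edit counts and the one article count 
quoted in this message; the variable names are made up for the example, and 
article counts for Polish and Japanese are left out because they are not given 
here.

```python
# Edit counts in millions, as quoted in the message above (2018 figures).
edits_millions = {"Polish": 53, "Japanese": 69, "Cebuano": 23}

# Rank the three Wikipedias by total edits, largest first.
ranking = sorted(edits_millions, key=edits_millions.get, reverse=True)

# Cebuano is the only wiki whose article count is quoted above (5.3 million),
# so it is the only one for which edits per article can be computed here.
cebuano_edits_per_article = edits_millions["Cebuano"] / 5.3

print(ranking)
print(round(cebuano_edits_per_article, 2))
```

A low edits-per-article figure is only a rough hint of bot-created content, 
not proof of it.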

Other theories about differences between Wikipedia sizes involve multilingual 
people, for example the tendency of Indian editors to edit in English rather 
than Indic languages. One theory is that people edit in a language that they 
perceive as “higher status”. Another is that Wikipedians have multiple 
motivations, and that some people edit in a language they are not fully fluent 
in in order to practice that language. A third is that Wikipedia is written in 
the correct alphabet for each language, but many people only have access to 
Latin keyboards. I am familiar with this from Georgia, where a large proportion 
of Georgians communicate on sites such as Facebook writing Georgian in the 
Latin script, but last I heard Wikipedia editing is restricted to those who can 
switch to Georgian script. Obviously this last issue is changing over time as 
particular scripts become available on the internet or as options in Wikipedia 
editing.

I would be very interested to see your paper, thanks for picking this topic.




From: 30012764400n behalf of
Sent: Tuesday, July 24, 2018 10:03 am
To: Research into Wikimedia content and communities
Subject: [Wiki-research-l] Country (culture...) as a factor in contributing to 
collective intelligence projects

Dear all,

I am working on a paper on why/whether people contribute (or not)
differently to collective intelligence projects in different countries. The
paper was inspired, partially, by several discussions I had with various
people on why different language Wikipedias have different sizes,
besides (doh) the popularity of the language (and yes, English is
biggest because it is international; and yes, I am aware a few
Wikipedias are outliers because of bots creating machine translations or
auto-populating villages or such). But for example, Poland and South
Korea have roughly similar population/speakers and development status,
yet Polish Wikipedia is over 3x the size of the SK one and no bot can
account for that. So, there's more to that. I am already feeding dozens
of parameters to a spreadsheet for some modelling, but I a) wonder what
I might have missed - before a reviewer asks 'why didn't you check for
xyz' and b) would like to have a few nice sentences about how things
that people expect to matter do not (or vice versa). Hence, my question
to you all, in the form of this open question mini survey:

Why do you think different language Wikipedias have different sizes,
outside of the popularity of a given language?

For reference, list 

Re: [Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-25 Thread Piscopo A .
Very interesting project indeed!

There is a study presented at Hypertext 2015, in which the authors compared the 
behaviour of Yahoo Answers users across several countries.
To perform their comparison, they used cultural metrics from previous studies, 
which you may find useful.
Here’s the paper: 
http://www.cse.usf.edu/dsg/data/publications/papers/culture_ht.pdf

Hope this can be useful.
Alessandro

–––
Alessandro Piscopo
Web and Internet Science Group
School of Electronics and Computer Science
University of Southampton
email: a.pisc...@soton.ac.uk


Re: [Wiki-research-l] Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread Juliana Bastos Marques
Regarding featured articles, I conducted a small study (should be out in
Oct.) on the Portuguese Wikipedia about those related to Ancient History.
Although the sample was obviously small, my findings were clear and were
later confirmed by many admins: most articles are translations or new material
written by a very small group of frequent editors, who use their stats to
legitimize their power as admins. Here again, cultural issues pair with
specific community behavior.

Great material, Dariusz, thanks for sharing!

Juliana


Re: [Wiki-research-l] Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread Dariusz Jemielniak
On a slightly related note, I analyzed cultural preferences for the
saturation of images, references, links, word count, etc. in good and
featured articles on 8 wikis and found significant cultural variation:

http://crow.kozminski.edu.pl/papers/cultures%20of%20wikipedias.pdf

best,

dj




-- 

prof. dr hab. Dariusz Jemielniak
Head of the MINDS (Management in Networked and Digital Societies) department
Akademia Leona Koźmińskiego
http://NeRDS.kozminski.edu.pl

*Recent articles:*

   - Dariusz Jemielniak, Maciej Wilamowski (2017). Cultural Diversity of
     Quality of Information on Wikipedias. *Journal of the Association for
     Information Science and Technology* 68(10): 2460–2470.
   - Dariusz Jemielniak (2016). Wikimedia Movement Governance: The Limits
     of A-Hierarchical Organization. *Journal of Organizational Change
     Management* 29(3): 361–378.
   - Dariusz Jemielniak, Eduard Aibar (2016). Bridging the Gap Between
     Wikipedia and Academia. *Journal of the Association for Information
     Science and Technology* 67(7): 1773–1776.
   - Dariusz Jemielniak (2016). Breaking the Glass Ceiling on Wikipedia.
     *Feminist Review* 113(1): 103–108.
   - Tadeusz Chełkowski, Peter Gloor, Dariusz Jemielniak (2016). Inequalities
     in Open Source Software Development: Analysis of Contributor’s Commits in
     Apache Software Foundation Projects. *PLoS ONE* 11(4): e0152976.


Re: [Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread Peter Meyer
Along this line I saw a terrific study recently looking at patent coauthors.  
Patents can be filed by individuals or by multiple individuals, and if people 
work together on patents in different groups this builds “networks” among 
inventors, in which they have previous coauthorship links.  If patents are 
filed only by single individuals there might be just as many inventions, but 
the networks are not built together as much.

The study looked at patents in Sweden and Spain in the 19th century.  It is by 
David Andersson and Patricio Saiz who are experts in the patent data from these 
countries.  They found the Swedish patents were likely to be coauthored, and 
the Spanish ones were not.  They looked at the resulting network links.  They 
argue that it led to more industrialization and growth in the Swedish case than 
in the Spanish case. 

I thought this was very helpful and insightful.  It was kind of gripping 
because they make a connection over the course of 100 years, in which the 
individuals from the early period are no longer relevant in the later period; 
it is an assertion about a long-lasting property.

Is this from a more cooperative culture in one place, and the opportunity for 
such networks to industrialize using later technologies?  Or is it a result of 
different industries naturally springing up in the different countries?  Not 
entirely clear.

However, the link to fundamentally flexible cooperative cultures that existed 
before Wikipedia could explain the differences in growth.  This is one paper 
to analogize to.  Maybe the places where patents are most coauthored also 
generate larger decentralized/cooperative works.


Re: [Wiki-research-l] Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread Peter Meyer
Interesting topic!   Here is a useful analogy regarding the distribution of 
sizes.  There has been study of how big cities are within countries or 
worldwide, and there are recurring patterns of the scale of the largest to the 
second largest, and the second-largest to the third, and so forth.

Without getting into this too deeply you might at least check if the size 
relations among Wikipedias are like those of cities, that is, if they have a 
similar-looking distribution.  If they do, the underlying forces and dynamics 
for city sizes might also apply to wikipediae or other sites.

The math is described by Zipf’s law and/or Gibrat’s distribution: 
<https://en.wikipedia.org/wiki/Zipf%27s_law> and 
<https://en.wikipedia.org/wiki/Gibrat%27s_law>.  The work by Xavier Gabaix, 
cited there, was my introduction to it.
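
One way such a check could look, as a minimal sketch: sort the sizes, then 
fit a straight line to log(size) against log(rank). Under Zipf's law the slope 
comes out close to -1. The function name and the synthetic input below are 
assumptions for illustration only; real article counts per Wikipedia would 
have to be supplied in their place.

```python
import math


def rank_size_slope(sizes):
    """Least-squares slope of log(size) against log(rank), largest size first."""
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic sizes that follow Zipf's law exactly (size proportional to 1/rank).
# These are NOT real Wikipedia or city figures.
synthetic_sizes = [1_000_000 / rank for rank in range(1, 51)]
slope = rank_size_slope(synthetic_sizes)
print(round(slope, 6))
```

Comparing the slope fitted to Wikipedia sizes with the slope fitted to city 
sizes would be a first, rough test of whether the two distributions look 
alike.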

Like the choice of what city to move to, the relevant Wikipedias for a user 
will usually need to be “close” — geographically for a city, or to the 
languages the user knows for a Wikipedia.  There are other factors driving a 
user’s choice, if we think of the user as choosing.  If the user wishes to 
study an obscure academic subject, they may have to use a large wikipedia, and 
that drives them to also participate there.  If the user is focused on a 
geographically local subject, that drives the choice.  A larger wikipedia is 
more useful than a small one, therefore the distribution of wikipedia sizes 
would be more unequal than the distribution of personal languages.

It sounds like, based on Poland and Korea, you can show that Internet 
availability is not driving all the difference.  Good to know.  — peter meyer




Re: [Wiki-research-l] Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread James Salsman
> Why do you think different language Wikipedia's have different
> sizes, outside of the popularity of a given language?

Piotr, if you model organic editing production with a Poisson
distribution, which is reasonable for a first approximation, 3x+
disparities are just natural for the same population sizes:

https://en.wikipedia.org/wiki/Poisson_distribution

I'm not sure the images in that article capture the wide platykurtosis
of large Poisson distributions.
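
For anyone who wants to put numbers on this, here is a minimal sketch using 
only the textbook moments of a Poisson distribution: the variance equals the 
mean, so the coefficient of variation is 1/sqrt(mean) and the excess kurtosis 
is 1/mean. The function name and the example means are hypothetical, chosen 
just to show the trend.

```python
import math


def poisson_relative_spread(mean):
    """Return (coefficient of variation, excess kurtosis) for a Poisson mean."""
    coefficient_of_variation = 1 / math.sqrt(mean)  # sd / mean = sqrt(mean) / mean
    excess_kurtosis = 1 / mean
    return coefficient_of_variation, excess_kurtosis


# Hypothetical means, from a handful of events up to a million.
for mean in (10, 1_000, 1_000_000):
    cv, kurt = poisson_relative_spread(mean)
    print(f"mean={mean}: cv={cv:.4f}, excess kurtosis={kurt:.6f}")
```

Under these formulas the relative spread narrows as the mean grows, so whether 
a 3x gap is plausible depends heavily on the scale of the counts being 
modelled; an overdispersed alternative such as a negative binomial may be 
worth comparing.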

Best regards,
Jim



Re: [Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread Pierre-Carl Langlais
This is a very interesting project.

Just a short remark in line with Juliana’s observation: the hardest part would 
be to account for the specific "inner" culture developed by each Wikimedian 
community. Since most of them started on a relatively small scale, numerous 
norms and lasting social dynamics can be explained by the initial choices / 
tastes of a limited set of individuals. Of course, these may in turn result 
from a wider cultural background, but they may also be simply idiosyncratic.

I guess isolating this factor would be quite hard. Perhaps using contribution 
data (when they exist) in the dumps and the archives of mailing lists would 
help at least to get a general idea of the initial social environment.

Alexander Doria / PCL

> On 24 Jul 2018, at 12:04, Juliana Bastos Marques wrote:
> 
> One other thing to consider is the specifics of how a language
> group/culture deals with collaborative work. I have no idea how to tackle
> this, though I've seen some studies in that direction.
> 
> I'm sure some of you here have heard about the absolute mess and
> conflict-ridden Portuguese Wikipedia. It's packed with hard deletionists,
> very hostile to newcomers and split into groups constantly fighting for
> power. I'm sure that's part of why PT:WP isn't bigger.
> 
> Juliana
> 
> On Tue, Jul 24, 2018 at 10:53 AM, Amir E. Aharoni <
> amir.ahar...@mail.huji.ac.il> wrote:
> 
>> Very interesting and much-needed research. Thanks for doing this. I'd love
>> to see the results and even the process.
>> 
>> Some things to consider:
>> 1. How long is the tradition of having published encyclopedias in that
>> culture?
>> 2. Alphabet: Using a common alphabet may make it somewhat easier to
>> translate information between languages that use it, especially for things
>> like towns and biographies. The Korean alphabet is used only by one
>> language, but the Latin and the Cyrillic alphabets are used by many (with
>> variations).
>> 3. How long is the tradition of *actually* having public education for
>> everybody: rich and poor, cities and villages? By "actually" I mean "not
>> just by law, but in practice".
>> 4. How long is the tradition of mostly-universal literacy? ("Literacy" is
>> one of the most fuzzily defined concepts. Here I refer to something like
>> "being able to read a newspaper and to write a one-page letter in one's own
>> native language".)
>> 5. How long is the tradition of having public libraries in most towns and
>> villages?
>> 6. How common is it to know other languages?
>> 7. How isolated or open is the society that speaks this language in terms
>> of access to media from other countries, translation of literature from
>> other languages, travel to other countries?
>> 8. How widespread are basic computer literacy skills: using a web browser;
>> sending an email; copying, down/uploading, and deleting files.
>> 9. How long is the tradition of having language resources, such as
>> dictionaries, spelling standards, thesauri, style guides?
>> 10. Is the language used completely in public education for teaching,
>> textbooks, and homework? Or is the education mostly done in a foreign
>> language? (This, roughly, is the situation in the Philippines and in many
>> African countries.)
>> 11. When did the language become an official language of a country? (If at
>> all.)
>> 12. Are there political, cultural, or government-suported movements for
>> language development or preservation?
>> 13. When did it become universally possible to fully write this language on
>> a computer, with complete keyboards and fonts support? E.g., English has
>> been easy to use on any computer for as long as there are computers;
>> Polish, German, Russian and many other languages have been supported for a
>> long time, but still struggled with encodings and diacritics in the 1990s;
>> India and Burma are still struggling; I'm not sure about Korea.
>> 
>> These are the immediate things I can think about. There are probably many
>> more criteria that could be considered.
>> 
>> The economics around a country are probably very important (poverty, access
>> to infrastructure, healthcare, etc.), and you mentioned in your first email
>> that you accounted for it, although I don't know in how much detail, so I
>> trust you on that :)
>> 
>> 
>> בתאריך 24 ביולי 2018 12:04,‏ "Piotr Konieczny"  כתב:
>> 
>> Dear all,
>> 
>> I am working on a paper on why/whether people contribute (or not) to
>> collective intelligence differently projects in different countries. The
>> paper was inspired, partially, by several discussions I had with various
>> people on why different language Wikipedia's have different sizes,
>> besides (doh) the popularity of the language (and yes, English is
>> biggest because it is international; and yes, I am aware a few
>> Wikipedias are outliers because of bots creating machine translations or
>> auto-populating villages or such). But for example, Poland and South
>> Korea have roughly similar pop

Re: [Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread Juliana Bastos Marques
One other thing to consider is the specifics of how a language
group/culture deals with collaborative work. I have no idea how to tackle
this, though I've seen some studies in that direction.

I'm sure some of you here have heard about the absolute mess that is the
conflict-ridden Portuguese Wikipedia. It's packed with hard deletionists,
very hostile to newcomers and split into groups constantly fighting for
power. I'm sure that's part of why PT:WP isn't bigger.

Juliana


Re: [Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread Amir E. Aharoni
Very interesting and much-needed research. Thanks for doing this. I'd love
to see the results and even the process.

Some things to consider:
1. How long is the tradition of having published encyclopedias in that
culture?
2. Alphabet: Using a common alphabet may make it somewhat easier to
translate information between languages that use it, especially for things
like towns and biographies. The Korean alphabet is used only by one
language, but the Latin and the Cyrillic alphabets are used by many (with
variations).
3. How long is the tradition of *actually* having public education for
everybody: rich and poor, cities and villages? By "actually" I mean "not
just by law, but in practice".
4. How long is the tradition of mostly-universal literacy? ("Literacy" is
one of the most fuzzily defined concepts. Here I refer to something like
"being able to read a newspaper and to write a one-page letter in one's own
native language".)
5. How long is the tradition of having public libraries in most towns and
villages?
6. How common is it to know other languages?
7. How isolated or open is the society that speaks this language in terms
of access to media from other countries, translation of literature from
other languages, travel to other countries?
8. How widespread are basic computer literacy skills: using a web browser;
sending an email; copying, down/uploading, and deleting files?
9. How long is the tradition of having language resources, such as
dictionaries, spelling standards, thesauri, style guides?
10. Is the language used completely in public education for teaching,
textbooks, and homework? Or is the education mostly done in a foreign
language? (This, roughly, is the situation in the Philippines and in many
African countries.)
11. When did the language become an official language of a country? (If at
all.)
12. Are there political, cultural, or government-supported movements for
language development or preservation?
13. When did it become universally possible to fully write this language on
a computer, with complete keyboard and font support? E.g., English has
been easy to use on any computer for as long as there have been computers;
Polish, German, Russian and many other languages have been supported for a
long time, but still struggled with encodings and diacritics in the 1990s;
India and Burma are still struggling; I'm not sure about Korea.

These are the immediate things I can think of. There are probably many
more criteria that could be considered.

The economics around a country are probably very important (poverty, access
to infrastructure, healthcare, etc.), and you mentioned in your first email
that you accounted for it, although I don't know in how much detail, so I
trust you on that :)






Re: [Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread Lucie-Aimée Kaffee
Hi Piotr,

I would look into things such as distribution (is there one region of the
world where Wikipedia is used more in general?) and alternative projects (such
as Chinese Baidu) that might be more popular for people speaking the language.
And there might be some aspect to people living abroad editing their
language Wikipedia, but that's just speculation. Somewhat along the same lines,
if people from language one move to places where language two is spoken,
and language two has a big Wikipedia already, it might be a motivating
factor to edit the other language more as well.

Best,
Lucie

-- 
Lucie-Aimée Kaffee
Web and Internet Science Group
School of Electronics and Computer Science
University of Southampton


[Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects

2018-07-24 Thread Piotr Konieczny

Dear all,

I am working on a paper on why/whether people contribute (or not) to
collective intelligence projects differently in different countries. The
paper was inspired, partially, by several discussions I had with various
people on why different language Wikipedias have different sizes,
besides (doh) the popularity of the language (and yes, English is
biggest because it is international; and yes, I am aware a few
Wikipedias are outliers because of bots creating machine translations or
auto-populating villages or such). But for example, Poland and South
Korea have roughly similar population/speakers and development status,
yet Polish Wikipedia is over 3x the size of the SK one and no bot can
account for that. So, there's more to it. I am already feeding dozens
of parameters to a spreadsheet for some modelling, but I a) wonder what
I might have missed - before a reviewer asks 'why didn't you check for
xyz' and b) would like to have a few nice sentences about how things
that people expect to matter do not (or vice versa). Hence, my question
to you all, in the form of this open question mini survey:


Why do you think different language Wikipedias have different sizes,
outside of the popularity of a given language?


For reference, list of Wikipedias by size and language: 
https://meta.wikimedia.org/wiki/List_of_Wikipedias


TIA!

--
Piotr Konieczny, PhD
http://hanyang.academia.edu/PiotrKonieczny
http://scholar.google.com/citations?user=gdV8_AEJ
http://en.wikipedia.org/wiki/User:Piotrus

