Read the rest :P

On Jun 13, 2015 02:43, "Asaf Bartov" <abar...@wikimedia.org> wrote:
> (adding Analytics, as a relevant group for this discussion.)
>
> I think this is next to meaningless, because the differing bot policies and
> practices on different wikis skew the data into incoherence.
>
> The (already existing) metric of active-editors-per-million-speakers is, it
> seems to me, a far more robust metric. Erik Z.'s stats.wikimedia.org is
> offering that metric.
>
> A.
>
> On Sun, Jun 7, 2015 at 3:23 PM, Milos Rancic <mill...@gmail.com> wrote:
>
> > When you get data, at some point in time you start thinking about
> > quite fringe comparisons. But that could actually give some useful
> > conclusions, like this time it did [1].
> >
> > We did the following:
> > * Used the number of primary speakers from Ethnologue. (Erik Zachte is
> > using the approximate number of primary + secondary speakers; that could
> > be good for correcting this data.)
> > * Categorized languages according to the logarithmic number of
> > speakers: >=10k, >=100k, >=1M, >=10M, >=100M.
> > * Took the number of articles of the Wikipedia in a particular language
> > and created a ratio (number of articles / number of speakers).
> > * This list consists only of languages with Ethnologue status 1
> > (national), 2 (provincial) or 3 (wider communication). In fact, we
> > have a lot of projects (more than 100) with worse language status; a
> > number of them are actually threatened or even on the edge of
> > extinction.
> >
> > Those are the preliminary results and I will definitely have to go
> > through all the numbers. I manually fixed some serious errors, like
> > not having English Wikipedia itself in the data :D
> >
> > Putting the languages into the logarithmic categories proved to be
> > useful, as we are now able to compare the Wikipedias according to
> > their gross capacity (numbers of speakers).
> > I suppose somebody well
> > versed in statistics could even create a function which could
> > be used to check how well a project stands, regardless of those strict
> > categories.
> >
> > It's obvious that the more speakers a language has, the harder it is for
> > the community to keep up the ratio.
> >
> > So, the winners per category are:
> > 1) >= 1k: Hawaiian, ratio 0.96900
> > 2) >= 10k: Mirandese, ratio 0.18073
> > 3) >= 100k: Basque, ratio 0.38061
> > 4) >= 1M: Swedish, ratio 0.21381
> > 5) >= 10M: Dutch, ratio 0.08305
> > 6) >= 100M: English, ratio 0.01447
> >
> > However, keep in mind that we removed languages not inside categories
> > 1, 2 or 3. That affected >=10k languages, as, for example, Upper
> > Sorbian stands much better than Mirandese (0.67). (Will fix it while
> > creating the full report. Obviously, in this case the logarithmic
> > categories of numbers of speakers are much more important than
> > the state of the language.)
> >
> > It's obvious that we could draw the line from 1:1 for 1-10k
> > speakers to 10:1 for >=100M speakers. But, again, I would like to get
> > input from somebody more competent.
> >
> > One very important category is missing here and it's about the level
> > of development of the speakers. That could be added: GDP/PPP per
> > capita for the country or countries where the language is spoken would
> > be useful as a measurement. And I suppose somebody with statistical
> > knowledge would be able to give us the number which would have the
> > meaning "ability to create Wikipedia article".
> >
> > Completed in such a way, we'd be able to measure the success of
> > particular Wikimedia groups and organizations. OK. Articles per
> > speaker are not the only way to do so, but we could use other
> > parameters, as well: number of new/active/very active editors etc. And
> > we could put it on a time scale.
> >
> > I'll produce some other results.
> > And to remind: I'd like to have the
> > formula to calculate "ability to create Wikipedia article" and then to
> > produce "level of particular community success in creating Wikipedia
> > articles". And, of course, to implement it for editors.
> >
> > [1]
> > https://docs.google.com/spreadsheets/d/1TYyhETevEJ5MhfRheRn-aGc4cs_6k45Gwk_ic14TXY4/edit?usp=sharing
> >
> > _______________________________________________
> > Wikimedia-l mailing list, guidelines at:
> > https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
> > Wikimedia-l@lists.wikimedia.org
> > Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> > <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>
>
> --
> Asaf Bartov
> Wikimedia Foundation <http://www.wikimediafoundation.org>
>
> Imagine a world in which every single human being can freely share in the
> sum of all knowledge. Help us make it a reality!
> https://donate.wikimedia.org
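
For anyone who wants to play with the idea from the quoted message (group languages by order of magnitude of speakers, compute articles per speaker, pick the best ratio in each group), here is a rough Python sketch. It is not the method used for the spreadsheet; the function names, the cap at the >=100M category, and above all the sample rows are assumptions made up for illustration. The sample figures are NOT real speaker or article counts.

```python
from typing import Dict, Iterable, Tuple

# Top category named in the message (>=100M); capping here is an assumption.
TOP_BUCKET = 100_000_000


def speaker_bucket(speakers: int) -> int:
    """Return the power-of-ten lower bound for a speaker count.

    Example: 250_000 falls in the >=100k bucket, so 100_000 is returned.
    Digit counting is used instead of floating-point log10 to stay exact.
    """
    if speakers < 1:
        raise ValueError("speaker count must be a positive integer")
    bucket = 10 ** (len(str(speakers)) - 1)
    return min(bucket, TOP_BUCKET)


def best_per_bucket(
    rows: Iterable[Tuple[str, int, int]]
) -> Dict[int, Tuple[str, float]]:
    """Map each bucket to the (language, ratio) with the highest
    articles-per-speaker ratio. Each row is (language, speakers, articles)."""
    best: Dict[int, Tuple[str, float]] = {}
    for name, speakers, articles in rows:
        bucket = speaker_bucket(speakers)
        ratio = articles / speakers
        if bucket not in best or ratio > best[bucket][1]:
            best[bucket] = (name, ratio)
    return dict(sorted(best.items()))


# Hypothetical placeholder data: (language, speakers, articles).
SAMPLE = [
    ("lang_a", 2_000, 1_500),
    ("lang_b", 5_000, 1_000),
    ("lang_c", 300_000, 60_000),
    ("lang_d", 900_000, 90_000),
    ("lang_e", 20_000_000, 1_000_000),
]

if __name__ == "__main__":
    for bucket, (name, ratio) in best_per_bucket(SAMPLE).items():
        print(f">= {bucket:>11,}: {name}, ratio {ratio:.5f}")
```

Swapping the article count for an active-editor count would give the editors-per-speaker variant mentioned in the thread, with the same bucketing.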