Re: [Wiki-research-l] Data on editathons held in each Wikipedia Language?

Kerry Raymond Tue, 08 Dec 2015 15:43:10 -0800

I would have to say that it might be realistic to gather statistics for a small 
event held in one location but for any large event, with multiple physical 
locations and/or online participation, it would be very difficult. Having been 
involved in supporting real-world events, I am well aware that many people 
believe the organisers have nothing else to do but gather statistics. In fact, 
you are running around like a headless chook the whole day because there are so 
many things to be done to run the event at all and there are usually too few 
organisers/helpers relative to the number of mostly newbie participants, so 
statistics gathering is the last thing on your mind.

Also some people are contributing because they are participants of the 
editathon but there are also contributions (both helpful and unhelpful) from 
other members of the community who are just reacting in their usual way to 
Wikipedia contributions and may not regard themselves as part of the editathon 
and/or may be completely unaware of it. 

As a concrete example, Wikibomb 2014 (an editathon aimed at creating articles 
for Australian female scientists selected by the Australian Academy of Science) 
had multiple physical sites in different cities plus on-line participants and 
took place on a single day. (I was at one of the physical locations and too 
busy to count the participants, but there were around 30 people.) We asked that 
participants add the category

https://en.wikipedia.org/wiki/Category:Wikibomb2014

to their articles (but obviously we cannot be sure that they did; most were new 
to Wikipedia editing and may not have even understood what we were asking them 
to do). However, using the category, we do have a set of 118 articles that we 
know were created as part of the event. Some may have been created in advance 
or after the event but still used the category, though presumably they were 
part of the event in terms of intent. Some others may have been deleted 
subsequently: we had issues with sources being the university or research 
institute employing the scientist (so their independence was questionable), 
plus we had copyvios where bios from university websites were copy-and-pasted.

From that set of 118 articles, you can probably analyse their edit histories 
and find the list of contributors in the first day or so, which should pick up 
most of the event participants (but also some others). You cannot rely on the 
first edit being made by the original participant. I often did the first edit 
to create the article if people were being diverted into Articles for Creation 
(tip: never use AfC at an event; the success of an event needs immediately 
visible articles at the end of the day, which is not possible with AfC), so 
the first edit may be done by experienced editors as a matter of practicality. 
But with a certain amount of visual inspection, you would probably be able to 
identify the person who contributed the most article text on that day, and that 
person would probably be a participant in the event. You might be able to 
automate that.
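
A minimal sketch of how that automation might look, assuming you have already 
fetched each article's revision history as a list of records with "user", 
"timestamp" and "size" (page size in bytes after the edit). The function name, 
the sample users and the event date below are all made up for illustration, 
and growth in page size is only a rough proxy for "article text contributed":

```python
from collections import defaultdict
from datetime import date, datetime


def top_contributor_on_day(revisions, event_day):
    """Return the user who added the most bytes on event_day, or None."""
    added = defaultdict(int)
    prev_size = 0
    # ISO 8601 timestamps sort chronologically as plain strings.
    for rev in sorted(revisions, key=lambda r: r["timestamp"]):
        delta = rev["size"] - prev_size
        prev_size = rev["size"]
        when = datetime.strptime(rev["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        # Only count edits made on the event day that grew the page.
        if when.date() == event_day and delta > 0:
            added[rev["user"]] += delta
    if not added:
        return None
    return max(added, key=added.get)


# Hypothetical history: an experienced editor creates a stub, a newcomer
# writes the bulk of the text, a patroller trims it, someone expands it later.
sample = [
    {"user": "ExperiencedEditor", "timestamp": "2014-08-14T01:00:00Z", "size": 120},
    {"user": "NewParticipant", "timestamp": "2014-08-14T02:30:00Z", "size": 2400},
    {"user": "PatrollingEditor", "timestamp": "2014-08-14T05:00:00Z", "size": 2350},
    {"user": "LaterEditor", "timestamp": "2014-09-01T10:00:00Z", "size": 5000},
]

likely_participant = top_contributor_on_day(sample, date(2014, 8, 14))
```

This picks the newcomer rather than the editor who made the first edit, which 
is the case described above. It would still need a manual check, since a 
copy-and-paste or a large revert can distort byte counts.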

Kerry

From: Wiki-research-l [mailto:wiki-research-l-boun...@lists.wikimedia.org] On 
Behalf Of Jonathan Morgan
Sent: Wednesday, 9 December 2015 3:46 AM
To: Research into Wikimedia content and communities 
<wiki-research-l@lists.wikimedia.org>
Cc: Harsh Gupta <gupta.hars...@gmail.com>
Subject: Re: [Wiki-research-l] Data on editathons held in each Wikipedia 
Language?

I don't personally know of any central repository for data on past 
edit-a-thons. 

There might be something out there. You could probably get some information 
from pinging folks in CE who've worked on Project & Event Grants (Asaf Bartov, 
Kacie Harold) or Program Evaluation (Amanda Bittaker, Edward Galvez), or search 
through past grant reports... but I'm guessing the data will be sparse and 
inconsistent, as it is still collected in a somewhat ad-hoc fashion.

If WMF were to support the development and maintenance of standardized 
infrastructure for edit-a-thon tracking--something like Harsh Kothari and Jeph 
Paul's platform for the Indian Wikiwomen edit-a-thons (site 
<http://2015.wikiwomen.in/>, code 
<https://github.com/cosmiclattes/wikiwomen/tree/master>)--this would be 
easier. But AFAIK that hasn't happened. If someone takes up that cause I will 
voice my support. 

J

On Mon, Dec 7, 2015 at 7:34 PM, Maximilian Klein <isa...@gmail.com> wrote:

Researchians,

I have been collecting data on the gendered biographies of different 
Wikipedia Languages from Wikidata dumps, with the question of trying to 
understand the gender gap in content. After reading about Propensity Score 
Matching[1] today, I see it would be possible to test a (close to) causal link 
between the genders of Wikipedia Biographies being added to a language, and 
Editathon activity. Yet we'd need the data for editathon activity. Is it 
compiled somewhere, or can you think of how it could be compiled?

[1] https://en.wikipedia.org/wiki/Propensity_score_matching The idea in 
propensity score matching is to pretend a randomized experiment is being 
conducted, and to find a "control group" - a similar but untreated language, 
for each "treated group".
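
As a rough illustration of the matching step only, here is a sketch that 
assumes a propensity score has already been estimated for each language (for 
example with a logistic regression on language-level covariates, which is not 
shown). The language names, scores and the caliper value are invented, and 
greedy 1:1 nearest-neighbour matching is just one of several possible matching 
schemes:

```python
def match_nearest(treated, controls, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching on propensity score.

    treated and controls map a language name to its estimated score.
    Each control is used at most once; a treated language is left
    unmatched if no remaining control lies within the caliper.
    """
    available = dict(controls)
    pairs = {}
    for name, score in sorted(treated.items()):
        if not available:
            break
        best = min(available, key=lambda c: abs(available[c] - score))
        if abs(available[best] - score) <= caliper:
            pairs[name] = best
            del available[best]
    return pairs


# Hypothetical scores: languages with editathons ("treated") and without.
treated = {"lang_A": 0.62, "lang_B": 0.35}
controls = {"lang_X": 0.60, "lang_Y": 0.30, "lang_Z": 0.90}

pairs = match_nearest(treated, controls)
```

Each matched pair could then be compared on the outcome of interest, such as 
the share of new biographies about women.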

Make a great day,
Max Klein ‽ http://notconfusing.com/ 

_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org 
<mailto:Wiki-research-l@lists.wikimedia.org> 
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

-- 

Jonathan T. Morgan

Senior Design Researcher

Wikimedia Foundation

User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
