[Wiki-research-l] Re: [Announcement] A new formal collaboration in Research

2023-02-15 Thread Su-Laine Brodsky
Hi Martin,

Regarding the concept of readability, the Knowledge Gap Taxonomy [1] uses the
term very broadly. The Taxonomy treats readability as one of three components
of “Accessibility”, and says that readability is about “Barriers for accessing
or consuming information originating from content.” The gap addresses the
important issue that some Wikipedia articles are difficult for their target
audience to understand.

I’m not super-familiar with the scholarship around readability, but the concept
has come up in some discussions that I’ve been in recently. It seems that
scholars tend to use a narrower definition of readability, e.g. “Readability
is the extent to which each sentence reads naturally, while comprehensibility
is the extent to which the text as a whole is easy to understand.”[2]

I’m not here to criticize the Taxonomy, but what it labels readability is what
some researchers might call either text comprehensibility or understandability.
Readability is one of several factors that influence whether a reader will
understand a piece of text.[3] To quantify progress in filling the relevant
knowledge gap, we would need research that looks at understandability
holistically.
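
To make the narrower sense concrete: classic readability formulas score only
surface features such as sentence length and word length. Below is a minimal
Python sketch of the Flesch reading ease score; the sentence splitter and
syllable counter are crude heuristics of my own, not anything used in the
research above, but they show the kind of signal that captures "readability"
without capturing comprehensibility as a whole.

import re

def count_syllables(word):
    # Very rough heuristic: count groups of consecutive vowels, minimum 1.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    # Naive sentence/word splitting; good enough for a back-of-the-envelope score.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return None
    syllables = sum(count_syllables(w) for w in words)
    # Flesch reading ease; higher scores mean easier text.
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

print(flesch_reading_ease(
    "Wikipedia is a free online encyclopedia written by volunteers."))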

References:
1) https://upload.wikimedia.org/wikipedia/commons/9/9e/The_Knowledge_Gaps_Taxonomy_Summary-and-Motivation.pdf
   (p. 4)
2) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998532/#bibr11-8755122517706978
3) https://www.ajol.info/index.php/spl/article/view/151787/141398

Cheers,
Su-Laine  (Wikipedia volunteer)


> On Feb 15, 2023, at 6:57 AM, Martin Gerlach  wrote:
> 
> Hi Samuel,
> thanks for your interest in this project.
> Following up on your question, I want to share some additional background:
> This work is part of our updated research roadmap to address knowledge gaps
> [1], specifically, developing methods to measure different knowledge gaps
> [2]. We have identified readability as one of the gaps in the taxonomy of
> knowledge gaps [3]. However, we currently do not have the tools to
> systematically measure readability of Wikipedia articles across languages.
> Therefore, we would like to develop and validate a multilingual approach to
> measuring readability. Furthermore, the community wishlist from the
> previous year contained a proposal for a tool to surface readability scores
> [4], while acknowledging that this is a difficult task to scale to all
> languages in Wikipedia.
> Let me know if you have further comments, suggestions, or questions --
> happy to discuss in more detail.
> Best,
> Martin
> 
> 
> [1]
> https://diff.wikimedia.org/2022/04/21/a-new-research-roadmap-for-addressing-knowledge-gaps/
> [2]
> https://meta.wikimedia.org/wiki/Research:Knowledge_Gaps_3_Years_On#Measure_Knowledge_Gaps
> [3] https://meta.wikimedia.org/wiki/Research:Knowledge_Gaps_Index/Taxonomy
> [4] 
> https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2022/Bots_and_gadgets/Readability_scores_gadget
> 
> 
> On Tue, Feb 14, 2023 at 10:50 PM Samuel Klein  wrote:
> 
>> Fantastic.  What a great team to work with.
>> 
>> We definitely need multiple reading-levels for articles, which involves
>> some namespace & interface magic, and new norm settings around what is
>> possible.  Only a few language projects have managed to bolt this onto the
>> side of MediaWiki (though they include some excellent successes imo).
>> Where does that fit into the research-practice-MW-WP roadmap?
>> 
>> SJ
>> 
>> On Tue, Feb 14, 2023 at 12:13 PM Martin Gerlach 
>> wrote:
>> 
>>> Hi all,
>>> 
>>> The Research team at the Wikimedia Foundation has officially started a
>> new
>>> Formal Collaboration [1] with Indira Sen, Katrin Weller, and Mareike
>>> Wieland from GESIS – Leibniz Institute for the Social Sciences to work
>>> collaboratively on understanding perception of readability in Wikipedia
>> [2]
>>> as part of the Addressing Knowledge Gaps Program [3]. We are thankful to
>>> them for agreeing to spend their time and expertise on this project in
>> the
>>> coming year.
>>> 
>>> Here are a few pieces of information about this collaboration that we
>> would
>>> like to share with you:
>>> * We aim to keep the research documentation for this project in the
>>> corresponding research page on meta [2].
>>> * Research tasks are hard to break down and track in task-tracking
>> systems.
>>> This being said, the page on meta is linked to an Epic level Phabricator
>>> task and all tasks related to this project that can be captured on
>>> Phabricator will be captured under it [4].
>>> * I act as the point of contact for this research in the Wikimedia
>>> Foundation. Please feel free to reach out to me (directly, if it cannot
>> be
>>> shared publicly) if you have comments or questions about the project.
>>> 
>>> Best,
>>> Martin
>>> 
>>> [1]
>>> https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations
>>> [2]
>>> 
>>> 
>> https://meta.wikimedia.org/wiki/Research:Understanding_perception_of_readability_in_Wikipedia

[Wiki-research-l] Re: Best practices for researchers soliciting off-wiki interviews

2023-01-12 Thread Su-Laine Brodsky
Hi Jodi,

In terms of etiquette, it’s totally fine to post these kinds of interview 
requests on WikiProject Talk pages.  I’d expect though that the people who read 
WikiProject talk pages are skewed towards more highly-engaged participants. 
Their habits for assessing scientific and technical information may not be 
representative of Wikipedia contributors overall in the topic area. 

As a frequent contributor to climate change articles, I’ll also mention that if 
you want to know how Wikipedia contributors assess information about climate 
change, I suspect you’ll get different answers if you ask at WikiProject 
Climate Change than if you ask at WikiProject Tropical Cyclones. 

Cheers,
Su-Laine (Wikipedia volunteer)


> On Jan 12, 2023, at 12:34 PM, Isaac Johnson  wrote:
> 
> Hey Jodi -- thanks for asking the question. Some of my thoughts about how
> researchers can solicit off-wiki interviews:
> 
>   - If you have not already created one, I suggest creating a project page
>   on Meta  and linking
>   to it in any posts. This gives interested editors a single page on wiki
>   where they can find relevant information on the project if they're curious.
>   The benefit of Meta in particular is that it also provides a consistent
>   format, has privacy/transparency guarantees, has a place for discussion
>   (talk page), and is discoverable by other researchers.
>   - If the research is extractive in some way (i.e. not just passive data
>   analysis but asking for editors' time, as with interviews), you want to make
>   sure it also provides clear benefits for those Wikimedian
>   individuals/communities. When soliciting interviews, it is quite helpful
>   to communicate these benefits to editors so they can judge whether it's
>   worthwhile to participate.
>   - Your inclination to post on talk pages for topic-specific WikiProjects
>   (collaborative spaces) is spot on. This helps a lot with reducing
>   interview-request spam for editors and if your research leads to actionable
>   findings / tools, then you have a community of folks who know the project
>   and whom you can hopefully work with to disseminate them.
>   - Start small (maybe posting to one group to begin with). This will help
>   you gather feedback -- e.g., address questions/concerns from editors --
>   before posting in more places.
>   - Also consider looking for local events to attend -- e.g., an
>   edit-a-thon or Wikimedian conference
>   . This is a great way to find
>   editors for interviews in more relaxed spaces and potentially get to
>   observe and ask questions about their editing processes first-hand. For
>   instance, I saw you're at UIUC: maybe the Wikimedians of Chicago User
>   Group
>   
> 
>   has events that could be attended? Sometimes there are nearby
>   Wikimedians-in-Residence
>   
> 
>   who could potentially help you connect with local communities as well.
> 
> Hope that helps and curious to hear thoughts from others.
> 
> Best,
> Isaac
> 
> On Wed, Jan 11, 2023 at 5:42 PM Jodi Schneider  wrote:
> 
>> Hi wiki-research-l folks,
>> 
>> Can the list point me in the right direction about how researchers should
>> solicit off-wiki interviews? I'm seeking to interview editors of English
>> Wikipedia who have provided information about scientific and technical
>> topics. I'm struggling to find up-to-date documentation about expectations
>> for researchers...
>> 
>> Currently the focus is COVID-19; in future years the focus will shift to
>> climate change; and AI and labor. Overall the project seeks to understand
>> how knowledge brokers (including Wikipedia editors) assess the quality of
>> technical and scientific information. This is part of my 3-year, US-based,
>> IRB-approved research study:
>> https://infoqualitylab.org/projects/knowledgebrokers/participate-y1
>> 
>> My inclination (in the absence of specific best practice directions) would
>> be to post a message on the Talk pages of the most obvious WikiProjects, with
>> information about the project and how to reach me:
>> WikiProject COVID-19
>> WikiProject Medicine / Pulmonology
>> WikiProject Viruses
>> WikiProject Disaster management
>> Is that appropriate? I'd welcome a pointer to specific requirements or best
>> practices. Offline advice also welcome!
>> 
>> -Jodi
>> User:Jodi.a.schneider
>> jschnei...@pobox.com
>> https://jodischneider.com/jodi.html
>> ___
>> Wiki-research-l mailing list -- wiki-research-l@lists.wikimedia.org
>> To unsubscribe send an email to wiki-research-l-le...@lists.wikimedia.org
>> 
> 
> 
> -- 
> Isaac Johnson (he/him/his) -- Senior Research Scientist -- Wikimedia
> Foundation
> ___
> Wiki-research-l 

Re: [Wiki-research-l] How to quantifying "effort" or "time spent" put into articles?

2020-10-20 Thread Su-Laine Brodsky
Further to Joan’s comment, there are some other ways to stratify edits:

- Whether an edit is vandalism, a vandalism revert, or an “actual” change. Vandal 
edits and reverts are both quick compared to good-faith additions and changes. 
Heavily vandalized articles will have long edit histories, even though 
sometimes not much effort was put into them. 

- Whether the edit was made by a human or bot.

- Whether a human edit was made with a tool such as AWB or HotCat. AWB in 
particular can be used to make very fast edits.

Another thought is that if you’re trying to measure contributor effort, why not 
look at article Talk pages as well? For controversial articles, a large 
proportion of editor time is spent on discussion. 
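
In case it's useful, here is a rough sketch of how one might pull revision
metadata from the MediaWiki API and bucket edits along the lines above. The
change-tag names (mw-rollback, mw-undo, mw-manual-revert, mw-reverted) and the
edit-summary and username heuristics for AWB, HotCat, and bots are assumptions
on my part, so treat it as a starting point rather than a validated classifier.

import requests
from collections import Counter

API = "https://en.wikipedia.org/w/api.php"

def fetch_revisions(title, limit=200):
    # Recent revisions (newest first) with user, edit summary, and change tags.
    params = {
        "action": "query", "format": "json", "formatversion": 2,
        "prop": "revisions", "titles": title, "rvlimit": limit,
        "rvprop": "ids|timestamp|user|comment|tags",
    }
    data = requests.get(API, params=params, timeout=30).json()
    return data["query"]["pages"][0].get("revisions", [])

def bucket(rev):
    # Heuristic stratification; the tag names and string matches are guesses.
    tags = set(rev.get("tags", []))
    comment = rev.get("comment", "").lower()
    user = rev.get("user", "").lower()
    if tags & {"mw-rollback", "mw-undo", "mw-manual-revert"}:
        return "revert"
    if "mw-reverted" in tags:
        return "later reverted (often vandalism or a disputed change)"
    if "awb" in comment or "hotcat" in comment:
        return "semi-automated tool edit (AWB/HotCat, judged by edit summary)"
    if user.endswith("bot"):
        return "bot (judged by username convention)"
    return "other human edit"

revs = fetch_revisions("Climate change")
print(Counter(bucket(r) for r in revs))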

Cheers,
Su-Laine (longtime Wikipedia contributor)


> On Oct 20, 2020, at 12:37 PM, Johan Jönsson  wrote:
> 
> A few comments from an editing perspective, in case anything here is useful:
> 
> I think Levenshtein distance might be a useful concept here, given the
> indication that I've read through and made some sort of decision around a
> whole article or a significant part of an article – both for additions and
> subtractions.
> 
> When it comes to article content, the most important signifier of effort
> spent on an edit beyond text length that comes to mind is whether a new ref
> tag is added. If I'm referencing something, there's a fair chance that I've
> not only identified a shortage or deficiency, but potentially spent time
> both finding a source and reading through it to be able to reference it,
> even if it results in a short sentence.
> 
> In some languages, translations of other Wikipedia articles are common;
> there might be a big difference between adding the same type of content
> translated from another language version and writing it from scratch.
> 
> //Johan Jönsson
> --
> 
> On Tue, Oct 20, 2020 at 20:32, Nate E TeBlunthuis  wrote:
> 
>> Greetings!
>> 
>> Quantifying effort is obviously a fraught prospect, but Geiger and
>> Halfaker [1] used edit sessions defined as consecutive edits by an editor
>> without a gap longer than an hour to quantify the total number of labor
>> hours spent on Wikipedia.  I'm familiar with other papers that use this
>> approach to measure things like editor experience.
>> 
>> I'm curious about the amount of effort put into each particular article.
>> Edit sessions seem like a good approach, but there are some problems:
>> 
>>  *   How much time does an edit session of length 1 take?
>>  *   Should article edit sessions be consecutive in the same article?
>>  *   What if someone makes an edit to related article in the middle of
>> their session?
>> 
>> I wonder what folks here think about alternatives for quantifying effort
>> to an article like
>> 
>>  1.  Number of wikitext characters added/removed
>>  2.  Levenshtein (edit) distance (of characters or tokens)
>>  3.  Simply the number of edits
>> 
>> Thanks for your help!
>> 
>> [1] Geiger, R. S., & Halfaker, A. (2013). Using edit sessions to measure
>> participation in Wikipedia. Proceedings of the 2013 Conference on Computer
>> Supported Cooperative Work, 861–870.
>> http://dl.acm.org/citation.cfm?id=2441873
>> 
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Editor surveys on race/ethnicity/religion

2020-09-21 Thread Su-Laine Brodsky
 places where there is a larger
>> editor
>>>>> presence and local laws and norms allow such questions. We have not
>> yet
>>>>> discussed asking about religion in the Community Insights survey.
>>>>> 
>>>>> On Mon, Sep 21, 2020 at 9:20 AM Isaac Johnson 
>>>> wrote:
>>>>> 
>>>>>> As pointed out by others, the highly contextualized nature of
>> religion,
>>>>>> race, and ethnicity between countries makes it very difficult to
>>>> impossible
>>>>>> to craft questions that are not overly reductive but still somewhat
>>>>>> universal. Despite this challenge, understanding diversity in a way
>>>> that
>>>>>> captures these aspects is obviously quite important as they often
>>>> figure
>>>>>> very strongly into power and representation within history, media,
>> etc.
>>>>>> 
>>>>>> In general, if you're looking for large-scale surveys of editors,
>> the
>>>> Meta
>>>>>> category (Category:Editor surveys
>>>>>> <https://meta.wikimedia.org/wiki/Category:Editor_surveys>) is
>> actually
>>>>>> quite complete (same for readers
>>>>>> <https://meta.wikimedia.org/wiki/Category:Reader_surveys>). In
>>>>>> particular, I wrote what little I could find about these topics
>> into
>>>> this
>>>>>> section of our recently published knowledge gaps taxonomy:
>>>>>> https://arxiv.org/pdf/2008.12314.pdf#subsubsection.3.1.7
>>>>>> 
>>>>>> The April 2011 editor survey took the approach of just asking
>> people
>>>> how
>>>>>> they felt they were different from others in the community -- this
>>>> specific
>>>>>> question is not one that I would advocate today (asking people to
>>>> identify
>>>>>> all the ways in which they may be "outsiders" is not particularly
>>>>>> welcoming) but this is also probably the style of approach (asking
>>>> people
>>>>>> how well they feel represented within Wikipedia content or editor
>>>>>> community) that you'd have to take to get information on ethnicity
>> /
>>>> race /
>>>>>> religion without writing country-specific questions:
>>>>>> 
>>>> 
>> https://upload.wikimedia.org/wikipedia/commons/7/76/Editor_Survey_Report_-_April_2011.pdf#page=65
>>>>>> 
>>>>>> On Mon, Sep 21, 2020 at 6:12 AM Stuart A. Yeates <
>> syea...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> The ethnicity / race question is an incredibly hard question to
>>>>>>> compose in an internationalised way.
>>>>>>> 
>>>>>>> Pretty much every country in the world uses different terms and
>> there
>>>>>>> are some very confusing cases where the same term is used in
>> different
>>>>>>> countries to mean very different things (e.g. "Asian" in UK
>> English vs
>>>>>>> New Zealand English). This is derived from varying legal
>> definitions
>>>>>>> (for example blood quantum vs one-drop laws); the history of
>>>>>>> colonisation and waves of immigration to the country; along with
>>>>>>> cultural differences.
>>>>>>> 
>>>>>>> cheers
>>>>>>> stuart
>>>>>>> --
>>>>>>> ...let us be heard from red core to black sky
>>>>>>> 
>>>>>>> On Mon, 21 Sep 2020 at 21:55, Federico Leva (Nemo) <
>>>> nemow...@gmail.com>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Su-Laine Brodsky, 21/09/20 08:19:
>>>>>>>>> I’m wondering if any large-scale surveys have been done that
>> ask
>>>>>>> Wikipedia editors about their race, ethnicity, or religion?
>>>>>>>> 
>>>>>>>> What international standards exist to phrase such questions?
>>>>>>>> Denominations commonly used in surveys in one country may be
>>>> considered
>>>>>>>> horrific or even illegal in others.
>>>>>>>> 
>>>>>>>> I see OECD cons

[Wiki-research-l] Editor surveys on race/ethnicity/religion

2020-09-20 Thread Su-Laine Brodsky
Hi everyone,

I’m wondering if any large-scale surveys have been done that ask Wikipedia 
editors about their race, ethnicity, or religion? 
Also, have any researchers considered asking these questions in editor surveys, 
but chosen not to ask them for particular reasons?

Best wishes,
Su-Laine

Su-Laine Brodsky
Vancouver, BC
___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Feedback about Wikipedia-related project.

2020-09-19 Thread Su-Laine Brodsky
Hi,

A good place to get feedback from the English Wikipedia community would be the 
Village Pump Idea Lab: 
https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(idea_lab) . 

It’s not clear to me whether the tool would be suggesting inline links for the 
text that’s already in the article, or “See also” links. A description of what 
problem the tool would solve would be really helpful.  It would also be helpful 
to see “before and after” mockups showing a specific stub article as it exists 
today and what the article would look like after the tool’s suggestions have 
been applied. 

Cheers,
Su-Laine
Wikipedia contributor


> On Sep 18, 2020, at 3:30 AM, Garcia Duran Alberto  
> wrote:
> 
> Hi all
> 
> We are researchers from the dlab at EPFL working with Bob West.
> 
> We have plans to build a graph-based ML algorithm, which will further 
> facilitate development of a tool to assist Wikipedia editors by providing 
> recommendations on two novel use-cases. One consists of suggesting hyperlinks 
> (Wikipedia articles) to be inserted within a section of an article. Note that 
> this is different from "classical link prediction".
> 
> We feel the tool could be of great value, as it can work with newly created 
> sections that do not have any content yet. What's more, the editor can type 
> *any* section name (either non-existent in that article or even in the whole 
> Wiki project) and the tool would have the power to suggest hyperlinks that 
> are likely to be of interest for that section in the article. We think that 
> (especially) stub articles can benefit from this tool.
> 
> However, we have one assumption. In addition to the section name, the editor 
> must provide the "entity type" (Place, People, Date, Organization...) of the 
> Wikipedia articles she would like to insert in the section. The reason is 
> that within a section you can find links to articles of diverse types.
> 
> The reason we are reaching out to you is twofold:
> (1) To check whether such a tool would be of interest and likely to be used 
> by the editors.
> (2) How limiting is the assumption that the editor needs to specify the 
> entity type of the Wikipedia articles for which she needs recommendations 
> from the tool?
> 
> On the one hand, some of us think this is not a problem, as the number of entity
> types is relatively small (between 10 and 20) and they can be easily and
> visually presented to the editor with a dropdown list. On the other hand,
> others think this requirement is limiting.
> 
> We would like to know your opinion to decide whether we should move forward 
> with this project.
> 
> Thanks!
> dlab
> 
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Number of registered editors per country

2020-08-23 Thread Su-Laine Brodsky
Hi Thomas,

This isn’t quite what you asked for, but the editor survey from 2018 might be 
helpful: 
https://meta.wikimedia.org/wiki/Community_Insights/2018_Report#Looking_at_diversity_across_community_audiences
 


The survey responses will be skewed towards more active editors, but depending 
on how you want to use the information that might be OK. 

Best wishes,
Su-Laine


> On Aug 22, 2020, at 1:59 PM, Thomas Stieve  
> wrote:
> 
> Dear all,
> 
> Hope all is well. Does anyone know of published statistics for the number
> of registered editors per country? I need it for 2016, but any year close
> to that would suffice.
> 
> Your help is greatly appreciated,
> Tom
> 
> -- 
> Thomas Stieve
> Ph.D. Candidate
> School of Geography and Development
> University of Arizona
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Statistics on reverted edits

2020-02-06 Thread Su-Laine Brodsky
Hi everyone,

Many thanks for the responses so far. I’m going through the links that Tilman 
and Isaac provided. 

Here is some more background on what I’m trying to accomplish (I’m realizing 
that more background usually helps). I have two projects going on: One is that 
later this month I’ll be doing a short presentation at the Misinfocon 
conference, as part of a panel discussion on quality at Wikipedia. The other 
project is that I’m writing a book for a general audience about how the English 
Wikipedia works in its processes and culture. I’ll be happy to talk more about 
this offline if anyone is interested.

Both of these projects are very general in scope so I’m trying to rely on 
existing research as much as possible rather than conducting new primary 
research. I’d like to give a sense of *approximately* how many good, bad, and 
controversial edits the English Wikipedia gets. I’m not looking for perfect 
metrics, just ones that I can explain. E.g. the percentage of edits that 
machines can classify as being reverted is one possible metric of how many 
edits are considered to be bad by someone. I can explain that this might 
undercount the actual figure because humans might partially revert or fully 
revert an edit in a way that’s not machine-detectable. 

I found the answer to my question #5 through a Quarry query (I love that
site!). In 2019, edit filters disallowed 581,120 attempted edits to the English
Wikipedia, which works out to around one disallow per minute and nearly 1% of
all enwiki edits. If we assume that all disallowed edits are vandalism, and
that 2.5% of successful edits are vandalism, then around 3.5% of all attempted
edits are vandalism, and roughly 29% of vandalism attempts are disallowed by
the edit filters.
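
For anyone who wants to check the arithmetic, here is the back-of-the-envelope
calculation in Python. The 58 million successful enwiki edits for 2019 is an
assumed round number implied by the "nearly 1%" figure, not something I took
directly from the query, and the calculation assumes every disallowed edit
would have been vandalism.

successful = 58_000_000   # assumed total successful enwiki edits in 2019
disallowed = 581_120      # edits disallowed by edit filters (Quarry query)
vandal_rate = 0.025       # share of successful edits that are vandalism (quoted above)

per_minute = disallowed / (365 * 24 * 60)
vandalism_successful = vandal_rate * successful
attempted = successful + disallowed
vandalism_attempted = vandalism_successful + disallowed

print(f"disallows per minute: {per_minute:.1f}")                                       # ~1.1
print(f"vandalism share of attempted edits: {vandalism_attempted / attempted:.1%}")    # ~3.5%
print(f"vandalism attempts stopped by filters: {disallowed / vandalism_attempted:.1%}")  # ~28.6%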

Cheers,
Su-Laine


> On Feb 4, 2020, at 1:47 PM, Ziko van Dijk  wrote:
> 
> Hello Su-Laine,
> 
> Interesting, I am very much looking forward to your results/paper.
> 
> Allow me a note on „reverts“. I am not sure exactly which methodology
> you want to use, and what your approach / field is in general. It comes to
> my mind that a good definition of revert is needed. Technically, a revert
> means that you re-install a previous page version (I guess). But sometimes,
> also in the technical dimension, this is done by the „revert“ function (or
> the revert function that enables a comment), and sometimes „manually“ by
> creating a new version with old content.
> 
> Sometimes, the revert is a full revert, sometimes a partial revert.
> Sometimes, the old version is text A, the new version is text B, and then
> the „revert“ actually is a version with text A‘ or B‘ or C (the apostrophe
> in my writing means: similar to).
> 
> Also, what about reverting yourself? With what motive exactly?
> 
> If I am correct you have mentioned some examples dealing with the reason
> for deletion. That is an important approach too, of course. It would be
> another step to consider the consequences of a revert in the social
> dimension. So how does a revert affect the social relationship between the
> editors involved? And how is the general atmosphere on the wiki affected?
> 
> Here are some thoughts, maybe useful or not. :-)
> 
> Kind regards
> Ziko
> 
> 
> 
> 
> Tilman Bayer  wrote on Sat, Feb 1, 2020 at 03:25:
> 
>> Concerning 1) and about analyzing reverts in general, see
>> https://meta.wikimedia.org/wiki/Research:Revert .
>> 
>> To explore 5), https://meta.wikimedia.org/wiki/AbuseFilter and
>> https://tools.wmflabs.org/ptwikis/Filters:enwiki may be of interest.
>> 
>> Regards, HaeB
>> 
>> On Wed, Jan 29, 2020 at 12:01 PM Su-Laine Brodsky 
>> wrote:
>> 
>>> Hi everyone,
>>> 
>>> I’m looking for statistics about the edits that are reverted on the
>>> English Wikipedia. This is for purposes of explaining to the public what
>>> Wikipedia’s quality control processes are like. If hard numbers aren’t
>>> available, I’m also interested in educated guesstimates.
>>> 
>>> 1) An often-quoted statistic is that 7% of edits are reverted. Is this
>>> still believed to be true?
>>> 
>>> 2) According to
>>> https://blog.wikimedia.org/2017/07/19/scoring-platform-team/, 2.5% of
>>> edits are vandalism. There are other common reasons for reverting, and
>> I’m
>>> wondering if anyone has studied their frequency. Does anyone know what
>>> percentage of all edits are reverted for being:
>>> a) Spam (as perceived by the reverter)
>>> b) Copyright violation
>>> c) Violations of the Biographies of Living Persons policy
>>> 
>>> 3) Do statistics on the number of edits per day on the English Wikipedia
>>> (i.e. 164,000 edits per day) include edits that are blocke

[Wiki-research-l] Statistics on reverted edits

2020-01-29 Thread Su-Laine Brodsky
Hi everyone,

I’m looking for statistics about the edits that are reverted on the English 
Wikipedia. This is for purposes of explaining to the public what Wikipedia’s 
quality control processes are like. If hard numbers aren’t available, I’m also 
interested in educated guesstimates.

1) An often-quoted statistic is that 7% of edits are reverted. Is this still 
believed to be true?

2) According to https://blog.wikimedia.org/2017/07/19/scoring-platform-team/, 
2.5% of edits are vandalism. There are other common reasons for reverting, and 
I’m wondering if anyone has studied their frequency. Does anyone know what 
percentage of all edits are reverted for being:
a) Spam (as perceived by the reverter)
b) Copyright violation
c) Violations of the Biographies of Living Persons policy

3) Do statistics on the number of edits per day on the English Wikipedia (i.e. 
164,000 edits per day) include edits that are blocked by the spam blacklists or 
by edit filters? 

4) How many edits per day on the English Wikipedia are prevented (blocked) by 
the spam blacklists? 

5) How many edits per day on the English Wikipedia are prevented by the edit 
filters? 

6) What percentage of all reverts are made by users of Huggle and Stiki? 

7) What proportion of vandalism is quickly reverted? A 2007 study (Priedhorsky 
et al.) found that 42% of vandalistic contributions are repaired within one view 
and 70% within ten views - have any newer studies been done on this? 
 
Thanks in advance!

Su-Laine
Vancouver, BC


___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Question on article creation policy

2019-08-13 Thread Su-Laine Brodsky
Hi Haifeng,

The history of the following page might indicate when policies changed around 
the access levels required to create articles: 
https://en.wikipedia.org/wiki/Wikipedia:User_access_levels . 


Cheers,
Su-Laine


> On Aug 11, 2019, at 9:25 AM, Haifeng Zhang  wrote:
> 
> Thanks a lot for providing all these information!
> 
> Was there a major change in article creation policy in early 2007?
> 
> Can anonymous users create new pages before then?
> 
> 
> Best,
> 
> Haifeng Zhang
> 
> From: Wiki-research-l  on behalf 
> of Su-Laine Brodsky 
> Sent: Saturday, August 10, 2019 2:44:24 AM
> To: Research into Wikimedia content and communities
> Subject: Re: [Wiki-research-l] Question on article creation policy
> 
> Hi Haifeng,
> 
> Re: A more general question is: where to find information about policy 
> changes, e.g., article creation, in Wikipedia?
> 
> The Wikipedia Signpost usually covers major policy changes like this one 
> (https://en.m.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost)
> 
> As Kerry pointed out though, more subtle policy changes happen without much 
> publicity. If changes are contentious enough, they might appear in an RfC or 
> The Village Pump, so those are some other areas to look.
> 
> Cheers,
> Su-Laine
> 
> Sent from my iPhone
> 
>> On Aug 9, 2019, at 11:48 AM, Haifeng Zhang  wrote:
>> 
>> Dear folks,
>> 
>> I'm checking the Article Creation page 
>> (https://en.wikipedia.org/wiki/Wikipedia:Article_creation), and it says:
>> 
>> 
>> The ability to create articles directly in mainspace is restricted
>> (https://en.wikipedia.org/wiki/Wikipedia:ACPERM) to autoconfirmed users,
>> though non-confirmed users and non-registered users can submit a proposed
>> article through the Articles for Creation
>> (https://en.wikipedia.org/wiki/Wikipedia:Articles_for_creation) process,
>> where it will be reviewed and considered for publication.
>> 
>> 
>> Does anyone know when the restriction (e.g., registered and auto-confirmed)
>> became effective? I tracked the past revisions of the page but found no 
>> clue. A more general question is: where to find information about policy 
>> changes, e.g., article creation, in Wikipedia?
>> 
>> 
>> Thanks,
>> 
>> Haifeng
>> 
>> ___
>> Wiki-research-l mailing list
>> Wiki-research-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Question on article creation policy

2019-08-10 Thread Su-Laine Brodsky
Hi Haifeng,

Re: A more general question is: where to find information about policy 
changes, e.g., article creation, in Wikipedia?

The Wikipedia Signpost usually covers major policy changes like this one 
(https://en.m.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost)

As Kerry pointed out though, more subtle policy changes happen without much 
publicity. If changes are contentious enough, they might appear in an RfC or 
The Village Pump, so those are some other areas to look.

Cheers,
Su-Laine

Sent from my iPhone

> On Aug 9, 2019, at 11:48 AM, Haifeng Zhang  wrote:
> 
> Dear folks,
> 
> I'm checking the Article Creation page 
> (https://en.wikipedia.org/wiki/Wikipedia:Article_creation), and it says:
> 
> 
> The ability to create articles directly in mainspace is 
> restricted to autoconfirmed 
> users, though non-confirmed users and non-registered users can submit a 
> proposed article through the Articles for 
> Creation 
> process, where it will be reviewed and considered for publication.
> 
> 
> Does anyone know when the restriction (e.g., registered and auto-confirmed)
> became effective? I tracked the past revisions of the page but found no clue. 
> A more general question is: where to find information about policy changes, 
> e.g., article creation, in Wikipedia?
> 
> 
> Thanks,
> 
> Haifeng
> 
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l