Re: [Wiki-research-l] Statistics on reverted edits

2020-02-06 Thread Su-Laine Brodsky
Hi everyone,

Many thanks for the responses so far. I’m going through the links that Tilman 
and Isaac provided. 

Here is some more background on what I’m trying to accomplish (I’m realizing 
that more background usually helps). I have two projects going on: One is that 
later this month I’ll be doing a short presentation at the Misinfocon 
conference, as part of a panel discussion on quality at Wikipedia. The other 
project is a book I’m writing for a general audience about the processes and 
culture of the English Wikipedia. I’ll be happy to talk more about this 
offline if anyone is interested.

Both of these projects are very general in scope so I’m trying to rely on 
existing research as much as possible rather than conducting new primary 
research. I’d like to give a sense of *approximately* how many good, bad, and 
controversial edits the English Wikipedia gets. I’m not looking for perfect 
metrics, just ones that I can explain. E.g. the percentage of edits that 
machines can classify as being reverted is one possible metric of how many 
edits are considered to be bad by someone. I can explain that this might 
undercount the actual figure because humans might partially revert or fully 
revert an edit in a way that’s not machine-detectable. 

I found the answer to my question #5 through a Quarry query (I love that 
site!). In 2019, edit filters disallowed 581,120 attempted edits to the English 
Wikipedia, which is around one disallowed edit per minute and amounts to nearly 
1% of all enwiki edits. If we assume all disallowed edits are vandalism, and 
2.5% of successful edits are vandalism, then around 3.5% of all attempted edits 
are vandalism and about 29% of those vandalism attempts are disallowed by edit 
filters.
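
For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes the figure of 164,000 edits per day quoted elsewhere in this thread as the number of successful edits (the exact 2019 total behind the estimate above is not stated), so the results only approximately match the rounded figures. The function name and its parameters are made up for illustration.

```python
def edit_filter_estimates(disallowed, successful_per_day, vandalism_rate, days=365):
    """Rough estimates relating disallowed edit attempts to successful edits."""
    # Assumption: successful edits = daily average times number of days.
    successful = successful_per_day * days
    # Attempted edits = saved edits plus attempts stopped by edit filters.
    attempted = successful + disallowed
    # Assumption from the message: every disallowed attempt is vandalism.
    vandalism_attempts = vandalism_rate * successful + disallowed
    return {
        "disallowed_per_minute": disallowed / (days * 24 * 60),
        "disallowed_vs_successful": disallowed / successful,
        "vandalism_share_of_attempts": vandalism_attempts / attempted,
        "disallowed_share_of_vandalism": disallowed / vandalism_attempts,
    }


estimates = edit_filter_estimates(581_120, 164_000, 0.025)
for name, value in estimates.items():
    print(f"{name}: {value:.4f}")
```

With these inputs the sketch gives about 1.1 disallowed edits per minute, disallowed edits equal to about 0.97% of successful edits, about 3.4% of attempts being vandalism, and about 28% of vandalism attempts stopped by edit filters.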

Cheers,
Su-Laine


Re: [Wiki-research-l] Statistics on reverted edits

2020-02-04 Thread Ziko van Dijk
Hello Su-Laine,

Interesting, I am very much looking forward to your results/paper.

Allow me a note on “reverts”. I am not sure which exact methodology you
want to use, or what your approach / field is in general. It occurs to me
that a good definition of revert is needed. Technically, a revert means
that you restore a previous page version (I guess). But even in the
technical dimension, this is sometimes done with the “revert” function (or
the revert function that allows a comment), and sometimes “manually” by
creating a new version with the old content.

Sometimes the revert is a full revert, sometimes a partial revert.
Sometimes the old version is text A, the new version is text B, and then
the “revert” actually is a version with text A' or B' or C (the apostrophe
in my notation means: similar to).
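
One machine-detectable definition, which I believe corresponds to what the Research:Revert page calls an identity revert, treats an edit as a revert when it restores the page to exactly the content of an earlier version. Below is a minimal Python sketch of that idea, assuming the revisions of one page are available as (id, text) pairs in chronological order. The function name, the window of 15 revisions, and the sample data are illustrative assumptions, not the API of any existing tool.

```python
import hashlib


def find_identity_reverts(revisions, window=15):
    """Find edits that restore the exact text of an earlier revision.

    revisions: list of (rev_id, text) tuples in chronological order.
    Returns a list of (reverting_id, restored_id, [reverted_ids]).
    """
    digests = [hashlib.sha1(text.encode("utf-8")).hexdigest()
               for _, text in revisions]
    reverts = []
    for i in range(len(revisions)):
        # An edit that leaves the text unchanged is not a revert.
        if i > 0 and digests[i] == digests[i - 1]:
            continue
        oldest = max(0, i - window)
        # Search backwards for the most recent identical earlier version,
        # requiring at least one intervening revision.
        for j in range(i - 2, oldest - 1, -1):
            if digests[j] == digests[i]:
                reverted = [revisions[k][0] for k in range(j + 1, i)]
                reverts.append((revisions[i][0], revisions[j][0], reverted))
                break
    return reverts


history = [(1, "text A"), (2, "text B"), (3, "text A"),
           (4, "text A, slightly changed"), (5, "text C")]
print(find_identity_reverts(history))
```

In the sample data, revision 3 restores revision 1 and therefore reverts revision 2. The edit whose text is only similar to A (the A' case) is not detected, which is why statistics based on this kind of detection can undercount partial or manual reverts.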

Also, what about reverting yourself? With what motive exactly?

If I am correct, you have mentioned some examples dealing with the reason
for deletion. That is an important approach too, of course. It would be
another step to consider the consequences of a revert in the social
dimension: how does a revert affect the social relationship between the
editors involved, and how is the general atmosphere on the wiki affected?

Just some thoughts, maybe useful or not. :-)

Kind regards
Ziko

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Statistics on reverted edits

2020-02-04 Thread Isaac Johnson
+Analytics who might be able to help with how reverts / Abuse Filter / etc.
figure into edit counts

In addition to the links from HaeB, I would also suggest reading the recent
report on content moderation on wikis, which on top of interviews has some
quantitative analyses and additional methods for understanding reverts:
https://meta.wikimedia.org/wiki/Research:Understanding_content_moderation_on_English_Wikipedia

Best,
Isaac

-- 
Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation


Re: [Wiki-research-l] Statistics on reverted edits

2020-01-31 Thread Tilman Bayer
Concerning 1) and about analyzing reverts in general, see
https://meta.wikimedia.org/wiki/Research:Revert .

To explore 5), https://meta.wikimedia.org/wiki/AbuseFilter and
https://tools.wmflabs.org/ptwikis/Filters:enwiki may be of interest.

Regards, HaeB

On Wed, Jan 29, 2020 at 12:01 PM Su-Laine Brodsky 
wrote:

> Hi everyone,
>
> I’m looking for statistics about the edits that are reverted on the
> English Wikipedia. This is for purposes of explaining to the public what
> Wikipedia’s quality control processes are like. If hard numbers aren’t
> available, I’m also interested in educated guesstimates.
>
> 1) An often-quoted statistic is that 7% of edits are reverted. Is this
> still believed to be true?
>
> 2) According to
> https://blog.wikimedia.org/2017/07/19/scoring-platform-team/, 2.5% of
> edits are vandalism. There are other common reasons for reverting, and I’m
> wondering if anyone has studied their frequency. Does anyone know what
> percentage of all edits are reverted for being:
> a) Spam (as perceived by the reverter)
> b) Copyright violation
> c) Violations of the Biographies of Living Persons policy
>
> 3) Do statistics on the number of edits per day on the English Wikipedia
> (i.e. 164,000 edits per day) include edits that are blocked by the spam
> blacklists or by edit filters?
>
> 4) How many edits per day on the English Wikipedia are prevented (blocked)
> by the spam blacklists?
>
> 5) How many edits per day on the English Wikipedia are prevented by the
> edit filters?
>
> 6) What percentage of all reverts are made by users of Huggle and Stiki?
>
> 7) What proportion of vandalism is quickly reverted? A 2007 study
> (Priedhorsky et al.) found that 42% of vandalistic contributions are
> repaired within one view and 70% within ten views - have any newer studies
> been done on this?
>
> Thanks in advance!
>
> Su-Laine
> Vancouver, BC