Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-28 Thread Anthony
On Fri, Aug 28, 2009 at 12:43 AM, Brion Vibber br...@wikimedia.org wrote:

 On 8/27/09 9:39 PM, Thomas Dalton wrote:
  2009/8/28 Gregory Maxwell gmaxw...@gmail.com:
  If the results of this kind of study have good agreement with
  mechanical proxy metrics (such as machine detected vandalism) our
  confidence in those proxies will increase, if they disagree it will
  provide an opportunity to improve the proxies.
 
  This kind of intensive study on a few small samples with a more
  automated method used on the same sample to compare would be more
  achievable. If the automated method gets similar results, we can use
  that method for larger samples.

 I would certainly be interested in seeing such a result.


Can you get us 5000 random article views from the http log made during the
first half of 2009?  All we need is URL/date/time.  Everything else can be
blanked for anonymizing.  It can be from a 1/10th log or whatever.  The list
should consist solely of *views*, not edits, and only of articles.

All the rest of the data is out there, unless we happen to hit on a
deleted/oversighted revision.  But using http://dammit.lt/wikistats/ to
estimate the hits is less accurate.  Many popular pages get popular
suddenly, and then quickly fade away.  There is most likely a strong
correlation between the amount of vandalism that takes place while they
are popular and the amount that takes place while they are not popular,
so I'd much prefer a sample from the actual http log.

If we can't get the real thing, I'll start downloading from
http://dammit.lt/wikistats/ and generate an estimated one, though.

Once we have the list, anyone is free to examine it any way they want, and
show their results.  But we're talking about probably less than 200
instances of vandalism here, so it'll be quite easy (and fun) to lambaste
anyone whose methods produce false positives.

If you're going to do it, maybe we should work on a rough-consensus
objective definition of vandalism before you release the file, though...
___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-28 Thread Anthony
On Fri, Aug 28, 2009 at 10:08 AM, Thomas Dalton thomas.dal...@gmail.com wrote:

 2009/8/28 Anthony wikim...@inbox.org:
  If you're going to do it, maybe we should work on a rough-consensus
  objective definition of vandalism before you release the file,
 though...

 Don't we have a consensus definition already? Vandalism is bad faith
 editing. You may also want to include test edits since they are
 treated in the same way (just with different warning messages). That
 isn't objective, but it should be close enough. We can argue over a
 few borderline cases.


Well, it relies on information (intent) which we can't determine simply from
the content of the edit (sometimes it is implied if you look at the entire
behavior of the user, but that's too messy).  Is a POV edit vandalism?  I
think it has to be treated as such, at least some of the time ("Windows is
the worst operating system ever"), but there are certainly edits which are
clearly POV where the intent is unclear (many people don't know the rules).
We need to remove intent from the definition, and I suppose call it
"degraded articles".  But simply saying that anything POV is vandalism would
potentially include just about any large article.

I suppose we can just list everything that's arguably vandalism and then
categorize it later though.  I expect we'll come up with several different
final numbers, which I guess is okay (the only part that really needs to be
pristinely unbiased is the selection of pageviews), though I do expect some
people will adapt their definition of vandalism to fit the data.

I support the request for 5000 random pageviews (uniform distribution
 by pageview over the last 6 months) from the logs.


Seems like it could be reused for a lot of different types of studies, so
long as the researcher isn't exposed to the details of the urls before
coming up with his/her methodology.  And I think the analysis of those 5000
pageviews in all sorts of ways would crowdsource well.  I'd love to see a
Nature Study equivalent, analyzing the more subjective aspects of the
articles in addition to just plain old vandalized/not-vandalized.

If we can't get the 5000 random pageviews (do the logs even still exist?), I
suppose wikistats will do.  They have pageviews broken down by hour, so the
non-uniformity within a single hour is probably fairly small for the popular
pages most likely to be selected.  Worst part is that it's a whole lot of
data to download, and I'm not sure any shortcuts can be taken without
screwing up the non-uniformity.  I considered just downloading the
projectcounts and then selecting the date-hours weighted accordingly, then
downloading only the date-hour files needed, but that does potentially
introduce error if the non-article traffic isn't well correlated to the
article traffic, so I dunno.  Probably a safe assumption that they are well
correlated, but I'd rather not guess.  Maybe talk-page traffic is highly
correlated to increased vandalism, or decreased vandalism.  It's possible,
so I'd rather be safe.
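The two-stage shortcut described above (pick date-hours weighted by their projectcounts totals, then pick an article within the chosen hour weighted by its pagecounts hits) can be sketched roughly as follows. Purely illustrative: the dictionaries stand in for the real projectcounts/pagecounts files, and all numbers are invented.

```python
import random

# Illustrative sketch of two-stage traffic-weighted sampling (all data
# invented; in practice these dicts would be parsed from the wikistats
# projectcounts and pagecounts files for each date-hour).
hourly_totals = {            # date-hour -> total pageviews (projectcounts)
    "20090301-00": 4_000_000,
    "20090301-01": 3_500_000,
}
hourly_counts = {            # date-hour -> per-article hits (pagecounts)
    "20090301-00": {"Barack_Obama": 9000, "Smallpox": 400},
    "20090301-01": {"Barack_Obama": 8500, "Kinetic_energy": 300},
}

def sample_pageview(rng):
    # Stage 1: pick a date-hour in proportion to its total traffic.
    hours, weights = zip(*hourly_totals.items())
    hour = rng.choices(hours, weights=weights)[0]
    # Stage 2: pick an article within that hour in proportion to its hits.
    articles, hits = zip(*hourly_counts[hour].items())
    return hour, rng.choices(articles, weights=hits)[0]

rng = random.Random(0)
sample = [sample_pageview(rng) for _ in range(5)]
print(sample)
```

Note that stage 1 weights by total traffic, which is exactly where the non-article-traffic correlation assumption sneaks in.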


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-28 Thread Robert Rohde
On Thu, Aug 27, 2009 at 9:43 PM, Brion Vibber br...@wikimedia.org wrote:
snip

 Robert, is it possible to share the source for generating the
 revert-based stats with other folks who may be interested in pursuing
 further work on the subject? Thanks!

Not as a complete stand-alone entity.  The analysis framework I
threw together for this has closed-source dependencies.  I may help
with partial code or pseudocode though.

-Robert



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-28 Thread Robert Rohde
On Fri, Aug 28, 2009 at 3:55 AM, Anthonywikim...@inbox.org wrote:
snip

 Once we have the list, anyone is free to examine it any way they want, and
 show their results.  But we're talking about probably less than 200
 instances of vandalism here, so it'll be quite easy (and fun) to lambaste
 anyone whose methods produce false positives.

Comments like this discourage people like me from putting in the time
and effort to do this sort of work.  Offering constructive criticism
is one thing, but looking forward to the fun of "lambast[ing]" the
good faith efforts of others is offensive and not in keeping with the
collaborative spirit necessary to run WMF projects.

-Robert Rohde



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-28 Thread Lars Aronsson
Anthony wrote:

 Umm...you would count the number of instances of vandalism?
 
 Is the question how to objectively *define* vandalism?

On one hand, we have a perception, as expressed by media (and by 
CEO Sue Gardner, I believe), that vandalism (especially of 
biographies of living people, BLP) is an increasing problem.  On 
the other hand, we have the habit of always asking for proofs and 
measurements: "Citation needed!"

We can try to find out which edits are reverts, assuming that the 
previous edit was an act of vandalism.  That way we can conclude 
which articles were vandalized and how long it took to revert 
them.  Add to that: How many readers viewed the vandalized 
version?  Vandalism is harmless if nobody watches it. It is mostly 
harmless if it is obvious and childish (e.g. "Barack Obama was born 
on Mars, he's a space alien").  When it does harm (and becomes a 
problem, allegedly an increasing problem) is when it is viewed and 
taken for the truth (e.g. a statement that Barack Obama was not 
born in the U.S. and thus would not be a legitimate president).

Especially, it becomes a very real problem if the living person 
being biographed takes offense and takes legal action against the 
WMF.  Now, that's very easy to measure: How much money did WMF 
need to spend, month by month, to resolve such conflicts, 
including time to explain the process to media?  That is money 
that could be used to buy servers instead.  A more efficient BLP 
policy might leave the WMF more money for servers. Very real. 
Now, we only need to insert real numbers into this equation.


-- 
  Lars Aronsson (l...@aronsson.se)
  Aronsson Datateknik - http://aronsson.se



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-28 Thread Anthony
On Fri, Aug 28, 2009 at 3:44 PM, Lars Aronsson l...@aronsson.se wrote:

 We can try to find out which edits are reverts, assuming that the
 previous edit was an act of vandalism.


But that's a bad assumption.  It gives both false positives and false
negatives, and it gives a significant number of each.  I gave examples of
each above.  My samples were tiny, but 38% of reverts were not reverts of
vandalism, and 40% of vandalism was not reverted by a means detected by this
strategy.  And there is no reason to believe that the error is consistent
over time, so these numbers are useless when it comes to determining whether
or not the problem is increasing.
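For a sense of scale, the two error rates quoted here imply a sizeable correction factor for any revert-based count. A toy calculation (the 1,000 detected reverts are hypothetical, and the rates come from the admittedly tiny samples above):

```python
# Hypothetical correction of a revert-based vandalism count using the
# two error rates above (both taken from tiny samples, so illustrative only).
false_positive_rate = 0.38   # detected reverts that were not vandalism reverts
miss_rate = 0.40             # vandalism the revert detector never saw

detected_reverts = 1000                                           # hypothetical
vandalism_reverts = detected_reverts * (1 - false_positive_rate)  # 620
# Those reverts cover only 60% of the vandalism that actually occurred:
implied_vandalism = vandalism_reverts / (1 - miss_rate)
print(round(implied_vandalism))  # 1033
```

The two errors happen to nearly cancel in this toy example, which is precisely why the proxy can look plausible in aggregate while being unreliable for trends if either rate drifts over time.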

That way we can conclude
 which articles were vandalized and how long it took to revert
 them.


Your simplistic version of "assuming that the previous edit was an act of
vandalism" makes the conclusion of how long it took to revert pretty
obviously flawed, doesn't it?  Under your simplistic assumption (which is even
worse than the one used by Robert), you're simply measuring the average time
between edits.  Any acts of vandalism which take more than one edit to find
and fix are excluded.

Now Robert's methodology wasn't quite that bad.  It allowed for reverts
separated by one or more other edits.   But it had no way to detect an act
of vandalism which lasted for hundreds of edits, was discovered by someone
reading the text, and was removed without reference to the original edit
with an edit summary such as "Barack Obama was born in Hawaii".  And these
acts of vandalism are the worst.  They last the longest, they do the most
harm when they are read, they get the most views, etc.  Any methodology
which excludes them is systemically biased.


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Andrew Turvey
Very interesting research - many thanks for sharing that. 

- Robert Rohde raro...@gmail.com wrote: 
 From: Robert Rohde raro...@gmail.com 
 To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org 
 Sent: Thursday, 27 August, 2009 17:41:29 GMT +00:00 GMT Britain, Ireland, 
 Portugal 
 Subject: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic 
 data 
 
 Recently, I reported on a simple study of how likely one was to 
 encounter recent vandalism in Wikipedia based on selecting articles at 
 random and using revert behavior as a proxy for recent vandalism. 
 
 http://lists.wikimedia.org/pipermail/foundation-l/2009-August/054171.html 
 
 One of the key limitations of that work was that it was looking at 
 articles selected at random from the pool of all existing page titles. 
 That approach was of the most immediate interest to me, but it didn't 
 directly address the likelihood of encountering vandalism based on the 
 way that Wikipedia is actually used because the selection of articles 
 that people choose to visit is highly non-random. 
 
 I've now redone that analysis with a crude traffic based weighting. 
 For traffic information I used the same data stream used by 
 http://stats.grok.se. That data is recorded hourly. For simplicity I 
 chose 20 hours at random from the last eight months and averaged those 
 together to get a rough picture of the relative prominence of pages. 
 I then chose a selection of 3 articles at random with their 
 probability of selection proportional to the traffic they received, 
 and repeated the prior analysis previously described. (Note that this 
 has the effect of treating the prominence of each page as a constant 
 over time. In practice we know some pages rise to prominence while 
 others fall down, but I am assuming the average pattern is still a good 
 enough approximation to be useful.) 
 
 From this sample I found 5,955,236 revert events in 38,096,653 edits. 
 This is an increase of 29 times in edit frequency and 58 times the 
 number of revert events that were found from a uniform sampling of 
 pages. I suspect it surprises no one that highly trafficked pages are 
 edited more often and subject to more vandalism than the average page, 
 though it might not have been obvious that the ratio of reverts to 
 normal edits is also increased over more obscure pages. 
 
 As before, the revert time distribution has a very long tail, though 
 as predicted the times are generally reduced when traffic weighting is 
 applied. In the traffic weighted sample, the median time to revert is 
 3.4 minutes and the mean time is 2.2 hours (compared to 6.7 minutes 
 and 18.2 hours with uniform weighting). Again, I think it is worth 
 acknowledging that having a majority of reverts occur within only a 
 few minutes is a strong testament to the efficiency and dedication 
 with which new edits are usually reviewed by the community. We could 
 be much worse off if most things weren't caught so quickly. 
 
 Unfortunately, in comparing the current analysis to the previous one, 
 the faster response time is essentially being overwhelmed by the much 
 larger number of vandalism occurrences. The net result is that 
 averaged over the whole history of Wikipedia a visitor would be 
 expected to receive a recently degraded article version during about 
 1.1% of requests (compared to ~0.37% in the uniform weighting 
 estimate). The last six months averaged a slightly higher 1.3% (1 in 
 80 requests). As before, most of the degraded content that people are 
 likely to actually encounter is coming from the subset of things that 
 get by the initial monitors and survive for a long time. Among edits 
 that are eventually reverted the longest lasting 5% of bad content 
 (those edits taking > 7.2 hours to revert) is responsible for 78% of 
 the expected encounters with recently degraded material. One might 
 speculate that such long-lived material is more likely to reflect 
 subtle damage to a page rather than more obvious problems like page 
 blanking. I did not try to investigate this. 
 
 In my sample, the number of reverts being made to articles has 
 declined ~40% since a peak in late 2006. However, the mean and median 
 time to revert is little changed over the last two years. What little 
 trend exists points in the direction of slightly slower responses. 
 
 
 So to summarize, the results here are qualitatively similar to those 
 found in the previous work. However with traffic weighting we find 
 quantitative differences such that reverts occur much more often but 
 take less time to be executed. The net effect of these competing 
 factors is such that the bad content is more likely to be seen than 
 suggested by the uniform weighting. 
 
 -Robert Rohde 
 
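Robert's point that the longest-lasting 5% of bad content produces 78% of expected encounters follows from the long tail: under roughly constant per-page traffic, a bad version's expected views scale with its survival time. A toy illustration with invented numbers:

```python
# Toy long-tail illustration (numbers invented): 95 quickly-reverted bad
# versions lasting 1 minute each, plus 5 that survive 10 hours each.
revert_minutes = [1] * 95 + [600] * 5

# Under constant traffic, the expected views of a bad version are
# proportional to how long it survives, so exposure shares reduce to
# shares of total degraded time.
total_exposure = sum(revert_minutes)
tail_exposure = sum(sorted(revert_minutes)[-5:])  # the slowest 5%
print(tail_exposure / total_exposure)  # ~0.97: the tail dominates
```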
 

Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Robert Rohde
I've just read two different news stories on Flagged Revisions that
described vandalism as "a growing problem" for Wikipedia.

With that in mind, I would like to highlight one specific point in the
analysis I just did.

The frequency of reverts to articles -- as a fraction of total edits
-- has remained virtually constant for almost three years now.  There
is no evidence that the community is making reverts more often today
(relative to total edits) than we were in 2007.

Hence, I would suggest that describing vandalism as a "growing
problem" is probably erroneous with respect to actual editing
behaviors.  Maybe our concern for ensuring accuracy and addressing
vandalism has grown, but the scale of the underlying problem of
incoming vandalism appears to be more or less constant.

-Robert Rohde



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
1:00 edit
1:02 revert
1:06 revert
1:14 revert
1:30 revert
2:02 revert

How many instances of vandalism does your program count there, and what is
the mean and median time to revert?
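To make the ambiguity concrete, here is the timeline above run through the naive rule that every revert undoes the immediately preceding edit. This sketch is illustrative only (Robert's actual code allowed reverts separated by intervening edits); the point is that one plausible incident can be counted as five, with artificially short revert times.

```python
from datetime import timedelta

# The example history above: one edit followed by a chain of reverts.
history = [
    ("1:00", "edit"),
    ("1:02", "revert"),
    ("1:06", "revert"),
    ("1:14", "revert"),
    ("1:30", "revert"),
    ("2:02", "revert"),
]

def parse(t):
    h, m = t.split(":")
    return timedelta(hours=int(h), minutes=int(m))

# Naive rule: each revert undoes the edit immediately before it, so every
# gap between consecutive edits ending in a revert is one "incident".
gaps = [
    (parse(t1) - parse(t0)).total_seconds() / 60
    for (t0, _), (t1, kind) in zip(history, history[1:])
    if kind == "revert"
]
print(len(gaps))                     # 5 "incidents" under the naive rule
print(sorted(gaps)[len(gaps) // 2])  # median revert time: 8.0 minutes
```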


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
On Thu, Aug 27, 2009 at 2:40 PM, Robert Rohde raro...@gmail.com wrote:

 I've just read two different news stories on Flagged Revisions that
 described vandalism as "a growing problem" for Wikipedia.

 With that in mind, I would like to highlight one specific point in the
 analysis I just did.

 The frequency of reverts to articles -- as a fraction of total edits
 -- has remained virtually constant for almost three years now.  There
 is no evidence that the community is making reverts more often today
 (relative to total edits) than we were in 2007.

 Hence, I would suggest that describing vandalism as a "growing
 problem" is probably erroneous with respect to actual editing
 behaviors.  Maybe our concern for ensuring accuracy and addressing
 vandalism has grown, but the scale of the underlying problem of
 incoming vandalism appears to be more or less constant.


Why do you assume that number of reverts has any correlation with amount of
vandalism?  Has this been studied?


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Thomas Dalton
2009/8/27 Anthony wikim...@inbox.org:
 Why do you assume that number of reverts has any correlation with amount of
 vandalism?  Has this been studied?

It seems to be a sensible assumption, although checking it would be
wise. I would put money on a significant majority of reverts being
reverts of vandalism rather than BRD reverts, though it may not be an
overwhelming majority.



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
On Thu, Aug 27, 2009 at 2:50 PM, Thomas Dalton thomas.dal...@gmail.com wrote:

 2009/8/27 Anthony wikim...@inbox.org:
  Why do you assume that number of reverts has any correlation with amount
 of
  vandalism?  Has this been studied?

 It seems to be a sensible assumption, although checking it would be
 wise.


It seems to me to be begging the question.  You don't answer the question
"how bad is vandalism" by assuming that vandalism is generally reverted.

I would put money on a significant majority of reverts being
 reverts of vandalism rather than BRD reverts, it may not be an
 overwhelming majority, though.


I don't know about that, though I won't take the other end of the bet.  Have
you done much editing while not logged in?  If so, I think you have to admit
that it's quite common to find yourself reverted for things which are not
properly classified as vandalism.

However, that's only one half of the equation.  The other half is how many
instances of vandalism are not reverted, and how many are not reverted in
a way that is detected by this program.


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
On Thu, Aug 27, 2009 at 2:58 PM, Anthony wikim...@inbox.org wrote:

 On Thu, Aug 27, 2009 at 2:50 PM, Thomas Dalton thomas.dal...@gmail.com wrote:

 I would put money on a significant majority of reverts being
 reverts of vandalism rather than BRD reverts, it may not be an
 overwhelming majority, though.


 I don't know about that, though I won't take the other end of the bet.
  Have you done much editing while not logged in?  If so, I think you have to
 admit that it's quite common to find yourself reverted for things which are
 not properly classified as vandalism.


Just going through recent changes looking for "rv" (which is not the only
thing detected by Robert's software, and is probably the most likely to be
actual vandalism)...

http://en.wikipedia.org/w/index.php?title=Smallpox&curid=16829895&diff=310413006&oldid=310405829
(content dispute)
http://en.wikipedia.org/w/index.php?title=View_Askewniverse&curid=2163851&diff=310412615&oldid=310412247
(blanking vandalism)
http://en.wikipedia.org/w/index.php?title=Barbecue&curid=37135&diff=310412401&oldid=310410035
(spelling dispute)
http://en.wikipedia.org/w/index.php?title=Sino-American_relations&curid=277880&diff=310412381&oldid=310329859
(revert of POV edits, I guess that counts as vandalism by Robert's definition)
http://en.wikipedia.org/w/index.php?title=Secession&curid=144732&diff=310412005&oldid=310406662
(I have no idea, I guess this one qualifies)
http://en.wikipedia.org/w/index.php?title=The_Underdog_Project&curid=1436277&diff=310412002&oldid=308833810
(test edit, qualifies)
http://en.wikipedia.org/w/index.php?title=Visual_communication&curid=669120&diff=310411952&oldid=310411398
(I'm going to call this a content dispute though you may disagree)
http://en.wikipedia.org/w/index.php?title=Technical_communication&curid=1219401&diff=310411937&oldid=310410621
(ditto)
http://en.wikipedia.org/w/index.php?title=Caroline_Aherne&curid=514223&diff=310411860&oldid=310328710
(removal of POV, qualifies)
http://en.wikipedia.org/w/index.php?title=Mario_Kart_Wii&curid=12205924&diff=310411680&oldid=310401913
(vandalism, I think)
http://en.wikipedia.org/w/index.php?title=Hephaestus&curid=14388&diff=310411384&oldid=310396007
(vandalism)
http://en.wikipedia.org/w/index.php?title=List_of_pop_punk_bands&curid=4770362&diff=310410857&oldid=310410740
(looks like a content dispute)
http://en.wikipedia.org/w/index.php?title=Korn's_ninth_studio_album&curid=21855821&diff=310410677&oldid=310381982
(content dispute)
http://en.wikipedia.org/w/index.php?title=Kinetic_energy&curid=17327&diff=310410573&oldid=310391734
(vandalism)
http://en.wikipedia.org/w/index.php?title=List_of_best-selling_Wii_video_games&curid=21469202&diff=310410431&oldid=310395902
(seems to be reversion of a legitimate edit)
http://en.wikipedia.org/w/index.php?title=Teleological_argument&curid=30731&diff=310410174&oldid=310399980
(content dispute)
http://en.wikipedia.org/w/index.php?title=Nick_Swardson&curid=3630190&diff=310410089&oldid=310410013
(vandalism)
http://en.wikipedia.org/w/index.php?title=Jose_Canseco&curid=175552&diff=310409931&oldid=310408069
(vandalism, I guess)
http://en.wikipedia.org/w/index.php?title=Ola_Moum&curid=8083232&diff=310409846&oldid=310396138
(content dispute)
http://en.wikipedia.org/w/index.php?title=Kareli,_Georgia&curid=18661674&diff=310409393&oldid=310348062
(vandalism, I think)
http://en.wikipedia.org/w/index.php?title=Victoria_Justice&curid=2662543&diff=310412751&oldid=310411603
(I guess it's technically a BLP violation, so qualifies)

13/21=62% actual vandalism, though I'm sure 80 people will now proceed to
dispute my categorizations.

Robert, let's get a random sample of the actual reverts your program
found...


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Chad
On Thu, Aug 27, 2009 at 3:33 PM, Anthony wikim...@inbox.org wrote:
 On Thu, Aug 27, 2009 at 2:58 PM, Anthony wikim...@inbox.org wrote:
snip
 Just going through recent changes looking for "rv" (which is not the only
 thing detected by Robert's software, and is probably the most likely to be
 actual vandalism)...
snip
 13/21=62% actual vandalism, though I'm sure 80 people will now proceed to
 dispute my categorizations.

 Robert, let's get a random sample of the actual reverts your program
 found...


/rvv?|revert(ing)?[ ]*(vandal(ism)?)?/

Might give you a slightly wider sample.

-Chad
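For anyone who wants to try Chad's pattern, the sketch below runs it over a few invented edit summaries. Note that everything after the required rv/revert stem is optional, so the pattern also matches plain reverts that say nothing about vandalism:

```python
import re

# Chad's suggested pattern, applied case-insensitively to invented summaries.
pattern = re.compile(r"rvv?|revert(ing)?[ ]*(vandal(ism)?)?", re.IGNORECASE)

summaries = [
    "rv linkspam",                # matches via "rv"
    "Reverted edits by 1.2.3.4",  # matches via "revert" alone
    "reverting vandalism",        # matches the full form
    "copyedit",                   # no match
]
hits = [s for s in summaries if pattern.search(s)]
print(hits)  # the first three summaries
```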



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Alex
Anthony wrote:
 On Thu, Aug 27, 2009 at 2:58 PM, Anthony wikim...@inbox.org wrote:
 
 On Thu, Aug 27, 2009 at 2:50 PM, Thomas Dalton 
 thomas.dal...@gmail.com wrote:

 I would put money on a significant majority of reverts being
 reverts of vandalism rather than BRD reverts, it may not be an
 overwhelming majority, though.

 I don't know about that, though I won't take the other end of the bet.
  Have you done much editing while not logged in?  If so, I think you have to
 admit that it's quite common to find yourself reverted for things which are
 not properly classified as vandalism.

 
  Just going through recent changes looking for "rv" (which is not the only
 thing detected by Robert's software, and is probably the most likely to be
 actual vandalism)...
 

Most vandalism reversion on enwiki (I believe) is done with automated
tools and/or rollback rather than manual reversion.

They typically leave more detailed summaries:
"Reverted N edits by X identified as vandalism to last revision by Y"
"Reverted edits by X (talk) to last version by Y"

-- 
Alex (wikipedia:en:User:Mr.Z-man)



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
On Thu, Aug 27, 2009 at 3:45 PM, Chad innocentkil...@gmail.com wrote:

 /rvv?|revert(ing)?[ ]*(vandal(ism)?)?/

 Might give you a slightly wider sample.


I'll wait for Robert to release a random sample of edits he actually
identified as reverts and/or the actual scripts and data dump he used.


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Stephen Bain
On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:

 It seems to me to be begging the question.  You don't answer the question
  "how bad is vandalism" by assuming that vandalism is generally reverted.

Can you suggest a better metric then?

-- 
Stephen Bain
stephen.b...@gmail.com



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com wrote:

 On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:
 
  It seems to me to be begging the question.  You don't answer the question
  how bad is vandalism by assuming that vandalism is generally reverted.

 Can you suggest a better metric then?


I must admit I don't understand the question.


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Thomas Dalton
2009/8/28 Anthony wikim...@inbox.org:
 On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com wrote:

 On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:
 
  It seems to me to be begging the question.  You don't answer the question
  "how bad is vandalism" by assuming that vandalism is generally reverted.

 Can you suggest a better metric then?


 I must admit I don't understand the question.

He means what would you measure in order to draw conclusions about the
severity of vandalism.



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
On Thu, Aug 27, 2009 at 8:24 PM, Thomas Dalton thomas.dal...@gmail.comwrote:

 2009/8/28 Anthony wikim...@inbox.org:
  On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com
 wrote:
 
  On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:
  
   It seems to me to be begging the question.  You don't answer the
 question
   "how bad is vandalism" by assuming that vandalism is generally
 reverted.
 
  Can you suggest a better metric then?
 
 
  I must admit I don't understand the question.

 He means what would you measure in order to draw conclusions about the
 severity of vandalism.


Umm...you would count the number of instances of vandalism?

Is the question how to objectively *define* vandalism?

If not, I still don't understand the question.


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Gregory Maxwell
On Thu, Aug 27, 2009 at 8:24 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
 2009/8/28 Anthony wikim...@inbox.org:
 On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com wrote:
 On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:
  It seems to me to be begging the question.  You don't answer the question
  "how bad is vandalism" by assuming that vandalism is generally reverted.
 Can you suggest a better metric then?
 I must admit I don't understand the question.

 He means what would you measure in order to draw conclusions about the
 severity of vandalism.

The obvious methodology would be to take a large random sample and
hand classify it. It's not rocket science.

By having multiple people perform the classification you could measure
the confidence of the classification.

This is somewhat labor intensive, but only somewhat as it doesn't take
an inordinate number of samples to produce representative results.
This should be the gold standard for this kind of measurement as it
would be much closer to what people actually want to know than most
machine metrics.

If the results of this kind of study have good agreement with
mechanical proxy metrics (such as machine detected vandalism) our
confidence in those proxies will increase, if they disagree it will
provide an opportunity to improve the proxies.

These are techniques widely used in other fields.
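
For the multiple-classifier idea, inter-rater agreement is usually summarised with something like Cohen's kappa; a minimal Python sketch (the two raters and their labels are hypothetical):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement between two raters, corrected
    for the agreement expected by chance."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Two hypothetical raters classifying the same 10 sampled revisions.
rater1 = ["vand", "ok", "ok", "ok", "vand", "ok", "ok", "ok", "ok", "ok"]
rater2 = ["vand", "ok", "ok", "vand", "vand", "ok", "ok", "ok", "ok", "ok"]
print(round(cohens_kappa(rater1, rater2), 2))  # 0.74
```

Values near 1 mean the hand classification is reliable; low values would suggest the definition of vandalism itself needs tightening before scaling up.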



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Thomas Dalton
2009/8/28 Anthony wikim...@inbox.org:
 He means what would you measure in order to draw conclusions about the
 severity of vandalism.


 Umm...you would count the number of instances of vandalism?

That's not practical. That would require a person to go through
article histories revision by revision, probably multiple people per
article to check they agreed on what was vandalism. It won't work for
the kind of sample sizes required unless you get an army of
volunteers. We need something that we expect to strongly correlate
with the number of instances of vandalism but is easier to measure -
that is what counting revisions with revert/rvv/etc. in the edit
summary was intended to be.



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Thomas Dalton
2009/8/28 Gregory Maxwell gmaxw...@gmail.com:
 This is somewhat labor intensive, but only somewhat as it doesn't take
 an inordinate number of samples to produce representative results.
 This should be the gold standard for this kind of measurement as it
 would be much closer to what people actually want to know than most
 machine metrics.

To get a fair sample we would need to include some highly active
pages. They have ridiculous numbers of revisions (even if you restrict
it to the last few months).

 If the results of this kind of study have good agreement with
 mechanical proxy metrics (such as machine detected vandalism) our
 confidence in those proxies will increase, if they disagree it will
 provide an opportunity to improve the proxies.

This kind of intensive study on a few small samples, with a more
automated method run on the same samples for comparison, would be more
achievable. If the automated method gets similar results, we can use
that method for larger samples.



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
On Thu, Aug 27, 2009 at 8:36 PM, Thomas Dalton thomas.dal...@gmail.com wrote:

 2009/8/28 Anthony wikim...@inbox.org:
  He means what would you measure in order to draw conclusions about the
  severity of vandalism.
 
 
  Umm...you would count the number of instances of vandalism?

 That's not practical.


I never said it was practical; I just said that counting revisions and
calling that counting vandalism is incorrect.


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Thomas Dalton
2009/8/28 Anthony wikim...@inbox.org:
 On Thu, Aug 27, 2009 at 8:36 PM, Thomas Dalton thomas.dal...@gmail.com wrote:

 2009/8/28 Anthony wikim...@inbox.org:
  He means what would you measure in order to draw conclusions about the
  severity of vandalism.
 
 
  Umm...you would count the number of instances of vandalism?

 That's not practical.


 I never said it was practical, I just said that counting revisions and
 calling that counting vandalism is incorrect.

And you were asked to suggest a better approach. Nobody claimed it was perfect.



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
On Thu, Aug 27, 2009 at 8:41 PM, Thomas Dalton thomas.dal...@gmail.com wrote:

 2009/8/28 Anthony wikim...@inbox.org:
  On Thu, Aug 27, 2009 at 8:36 PM, Thomas Dalton thomas.dal...@gmail.com
 wrote:
 
  2009/8/28 Anthony wikim...@inbox.org:
   He means what would you measure in order to draw conclusions about
 the
   severity of vandalism.
  
  
   Umm...you would count the number of instances of vandalism?
 
  That's not practical.
 
 
  I never said it was practical, I just said that counting revisions and
  calling that counting vandalism is incorrect.

 And you were asked to suggest a better approach. Nobody claimed it was
 perfect.


I suggested a better approach last time we had this thread: statistical
sampling.

And I'm saying much more than that this method is imperfect.  I'm saying
it's fundamentally flawed when it comes to measuring vandalism.  It measures
something much different than vandalism.

When it comes to answering the question of how likely is one to encounter
vandalism, I am no more informed after reading this thread than before.  It
could be 0.5% and I wouldn't be surprised.  It could be 3% and I wouldn't be
surprised.  The methods used in this study both undercount and overcount
vandalism, possibly quite significantly.  Not all reverts are reverts of
vandalism.  I wouldn't be surprised if only 50% of them are.  And not all
vandalism is reverted.  As revert is defined by this method, I wouldn't be
surprised if 75% of vandalism is not detected.  This study doesn't measure
vandalism.
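
To put numbers on that, a back-of-envelope correction using a precision/recall framing (the framing and the raw count of 1000 are mine; the 50%/25% figures are the illustrative ones above):

```python
def corrected_vandalism_estimate(flagged_reverts, precision, recall):
    """Adjust a raw revert count.  precision = fraction of flagged
    reverts that really revert vandalism; recall = fraction of all
    vandalism the heuristic catches.  Both are assumed, not measured."""
    return flagged_reverts * precision / recall

# 50% precision and 25% recall: 1000 flagged reverts would imply
# roughly 2000 actual instances of vandalism.
print(corrected_vandalism_estimate(1000, precision=0.5, recall=0.25))  # 2000.0
```

The point is only that the correction factor can swing the estimate by a factor of several in either direction, which is exactly why the proxy on its own says so little.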


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Thomas Dalton
2009/8/28 Anthony wikim...@inbox.org:
 I suggested a better approach last time we had this thread: statistical
 sampling.

This research was based on a sample. What are you talking about?



Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
Just took a quick sample of 10 instances of vandalism to [[Ted Stevens]].
 Of those 10 instances, either 2 or 4 would not have been found by the
automated tool described: 2 if every edit summary containing the word
"vandalism" is counted as a vandalism revert, and 4 if not.  The former
would probably significantly overcount vandalism.

http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=173527553&oldid=173381871
(Removed vandalism)
http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=180054904&oldid=179982198
(rmv vandalism)
http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=168486242&oldid=168438600
no edit summary
http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=162332870&oldid=162038733
(yes it is funny, but this doesn't belong here)

On Thu, Aug 27, 2009 at 9:31 PM, Thomas Dalton thomas.dal...@gmail.com
 wrote:

 2009/8/28 Anthony wikim...@inbox.org:
  I suggested a better approach last time we had this thread: statistical
  sampling.

 This research was based on a sample. What are you talking about?


I'm talking about taking a sample and examining it manually.  First, spend a
few weeks coming up with an objective definition of vandalism.  Then pick
5,000 random article views from the http log, and publish the URL/date/time.
 Then advertise the list all over the place (especially on sites like
Wikipedia Review) asking people to find instances of vandalism in it.
 People can use automated means which they then go through by hand to remove
false positives, manual error checking, spot checking, whatever.  The number
of confirmed instances of vandalism will grow for a while, and eventually
will start to level off.

May not be perfect, but it'll provide a lower bound on the amount of
vandalism, at least.  Have a statistician tell us what our exact error
bounds are.  And then prepare for a second study, improving on everything
(the definition of vandalism, the number of random article views, the
amount of time to wait) based on what we learned.
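
On the error bounds: a yes/no classification over a sample of views is a binomial proportion, so a standard interval gives the precision directly; a Python sketch using the Wilson score interval (the count of 75 vandalised views among 5,000 is hypothetical):

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

# Suppose 75 of 5,000 sampled article views showed vandalism (1.5%).
lo, hi = wilson_interval(75, 5000)
print(f"95% interval: {lo:.3%} to {hi:.3%}")
```

At that sample size the interval is roughly plus or minus a third of a percentage point, which is tight enough to distinguish the 0.5% and 3% scenarios mentioned earlier in the thread.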


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Nathan
On Thu, Aug 27, 2009 at 9:47 PM, Anthony wikim...@inbox.org wrote:

 Just took a quick sample of 10 instances of vandalism to [[Ted Stevens]].
  Of those 10 instances of vandalism, either 2 or 4 would not have been
 found
 by the automated tool described.  2 if every edit summary containing the
 word vandalism is counted as vandalism, and 4 if not.  The former would
 probably significantly overcount vandalism.


 http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=173527553&oldid=173381871
 (Removed vandalism)

 http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=180054904&oldid=179982198
 (rmv vandalism)

 http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=168486242&oldid=168438600
 no edit summary

 http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=162332870&oldid=162038733
 (yes it is funny, but this doesn't belong here)

 On Thu, Aug 27, 2009 at 9:31 PM, Thomas Dalton thomas.dal...@gmail.com
  wrote:

  2009/8/28 Anthony wikim...@inbox.org:
   I suggested a better approach last time we had this thread: statistical
   sampling.
 
  This research was based on a sample. What are you talking about?


 I'm talking about taking a sample and examining it manually.  First, spend
 a
 few weeks coming up with an objective definition of vandalism.  Then pick
 5,000 random article views from the http log, and publish the
 URL/date/time.
  Then advertise the list all over the place (especially on sites like
 Wikipedia Review) asking people to find instances of vandalism in it.
  People can use automated means which they then go through by hand to
 remove
 false positives, manual error checking, spot checking, whatever.  The
 number
 of confirmed instances of vandalism will grow for a while, and eventually
 will start to level off.

 May not be perfect, but it'll provide a lower bound on the amount of
 vandalism, at least.  Have a statistician tell us what our exact error
 bounds are.  And then prepare for a second study, improving on everything
 (the definition of vandalism, the number of random article views, the
 amount of time to wait) based on what we learned.



Out of curiosity, Anthony, do you still refrain from editing Wikimedia
projects over licensing
issues? How long has it been, a year?

Nathan


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Anthony
On Thu, Aug 27, 2009 at 10:07 PM, Nathan nawr...@gmail.com wrote:

 Out of curiosity, Anthony, do you still refrain from editing Wikimedia
 projects over licensing
 issues? How long has it been, a year?


I guess now is as good a time as any to admit it.  I started editing again,
without logging in, about a month ago.  I've made a couple dozen or so edits
since.  Prior to that my last edit was October 20, 2008.


Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data

2009-08-27 Thread Brion Vibber
On 8/27/09 9:39 PM, Thomas Dalton wrote:
 2009/8/28 Gregory Maxwell gmaxw...@gmail.com:
 If the results of this kind of study have good agreement with
 mechanical proxy metrics (such as machine detected vandalism) our
 confidence in those proxies will increase, if they disagree it will
 provide an opportunity to improve the proxies.

 This kind of intensive study on a few small samples, with a more
 automated method run on the same samples for comparison, would be
 more achievable. If the automated method gets similar results, we
 can use that method for larger samples.

I would certainly be interested in seeing such a result.

Generally speaking we can expect a strong correlation between vandalism 
and machine-identifiable reverts -- it's a totally reasonable assumption 
for a first-order approximation -- and it would be valuable to confirm 
this and see how much divergence there might be between this count and 
other markers.

Most interesting following this would be to take into account the effects 
of flagged revisions and how this could affect initially-displayed vs 
edited revisions. Has there been similar work targeting German-language 
Wikipedia already?

Robert, is it possible to share the source for generating the 
revert-based stats with other folks who may be interested in pursuing 
further work on the subject? Thanks!

-- brion vibber (brion @ wikimedia.org)
CTO & Senior Software Architect, Wikimedia Foundation
