Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Fri, Aug 28, 2009 at 12:43 AM, Brion Vibber br...@wikimedia.org wrote: On 8/27/09 9:39 PM, Thomas Dalton wrote: 2009/8/28 Gregory Maxwell gmaxw...@gmail.com: If the results of this kind of study have good agreement with mechanical proxy metrics (such as machine detected vandalism) our confidence in those proxies will increase; if they disagree it will provide an opportunity to improve the proxies. This kind of intensive study on a few small samples, with a more automated method used on the same samples to compare, would be more achievable. If the automated method gets similar results, we can use that method for larger samples. I would certainly be interested in seeing such a result. Can you get us 5000 random article views from the http log made during the first half of 2009? All we need is URL/date/time. Everything else can be blanked for anonymizing. It can be from a 1/10th log or whatever. The list should consist solely of *views*, not edits, and only of articles. All the rest of the data is out there, unless we happen to hit on a deleted/oversighted revision. But using http://dammit.lt/wikistats/ to estimate the hits is less accurate. Many popular pages get popular suddenly, and then quickly fade away. There is most likely a strong difference between the amount of vandalism that takes place while they are popular and the amount that takes place while they are not, so I'd much prefer a sample from the actual http log. If we can't get the real thing, I'll start downloading from http://dammit.lt/wikistats/ and generate an estimated one, though. Once we have the list, anyone is free to examine it any way they want, and show their results. But we're talking about probably fewer than 200 instances of vandalism here, so it'll be quite easy (and fun) to lambaste anyone whose methods produce false positives. If you're going to do it, maybe we should work on a rough-consensus objective definition of vandalism before you release the file, though...
___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Fri, Aug 28, 2009 at 10:08 AM, Thomas Dalton thomas.dal...@gmail.com wrote: 2009/8/28 Anthony wikim...@inbox.org: If you're going to do it, maybe we should work on a rough-consensus objective definition of vandalism before you release the file, though... Don't we have a consensus definition already? Vandalism is bad faith editing. You may also want to include test edits since they are treated in the same way (just with different warning messages). That isn't objective, but it should be close enough. We can argue over a few borderline cases. Well, it relies on information (intent) which we can't determine simply from the content of the edit (sometimes it is implied if you look at the entire behavior of the user, but that's too messy). Is a POV edit vandalism? I think it has to be treated as such, at least some of the time (Windows is the worst operating system ever), but there are certainly edits which are clearly POV but whose intent is unclear (many people don't know the rules). We need to remove intent from the definition, and I suppose call it degraded articles. But simply saying that anything POV is vandalism would potentially include just about any large article. I suppose we can just list everything that's arguably vandalism and then categorize it later, though. I expect we'll come up with several different final numbers, which I guess is okay (the only part that really needs to be pristinely unbiased is the selection of pageviews), though I do expect some people will adapt their definition of vandalism to fit the data. I support the request for 5000 random pageviews (uniform distribution by pageview over the last 6 months) from the logs. Seems like it could be reused for a lot of different types of studies, so long as the researcher isn't exposed to the details of the urls before coming up with his/her methodology. And I think the analysis of those 5000 pageviews in all sorts of ways would crowdsource well.
I'd love to see a Nature Study equivalent, analyzing the more subjective aspects of the articles in addition to just plain old vandalized/not-vandalized. If we can't get the 5000 random pageviews (do the logs even still exist?), I suppose wikistats will do. They have pageviews broken down by hour, so the non-uniformity of a single hour is probably fairly small for the popular pages most likely to be selected. Worst part is that it's a whole lot of data to download, and I'm not sure any shortcuts can be taken without screwing up the non-uniformity. I considered just downloading the projectcounts and then selecting the date-hours weighted accordingly then downloading only the date-hour files needed, but that does potentially introduce error if the non-article traffic isn't well correlated to the article traffic, so I dunno. Probably a safe assumption that they are well correlated, but I'd rather not guess. Maybe talk-page traffic is highly correlated to increased vandalism, or decreased vandalism. It's possible, so I'd rather be safe.
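The two-stage selection Anthony describes (pick date-hours weighted by their projectcounts totals, then pick a view within the chosen hour) can be sketched as follows. This is a minimal sketch: the file naming and the idea that per-hour totals and per-page counts are available as dictionaries are assumptions, not the actual wikistats formats.

```python
import random

random.seed(0)  # deterministic for illustration

def pick_date_hours(hour_totals, k):
    """hour_totals maps a date-hour label to its total view count.
    Sampling date-hours with probability proportional to traffic
    preserves a uniform draw over individual pageviews."""
    labels = list(hour_totals)
    weights = [hour_totals[h] for h in labels]
    return random.choices(labels, weights=weights, k=k)

def pick_article_view(page_counts):
    """page_counts maps an article title to its views within one hour.
    Returns one title, weighted by its view count."""
    titles = list(page_counts)
    return random.choices(titles, weights=[page_counts[t] for t in titles])[0]

# Invented example numbers, standing in for real projectcounts data:
hours = {"20090101-00": 4_000_000, "20090101-01": 3_500_000}
chosen = pick_date_hours(hours, k=5)
view = pick_article_view({"Barack_Obama": 900, "Smallpox": 40})
```

The error Anthony worries about enters in the first stage: the hour weights come from all traffic, not just article traffic.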
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 9:43 PM, Brion Vibber br...@wikimedia.org wrote: snip Robert, is it possible to share the source for generating the revert-based stats with other folks who may be interested in pursuing further work on the subject? Thanks! Not as a complete stand-alone entity. The analysis framework I threw together for this has closed-source dependencies. I may help with partial code or pseudocode though. -Robert
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Fri, Aug 28, 2009 at 3:55 AM, Anthony wikim...@inbox.org wrote: snip Once we have the list, anyone is free to examine it any way they want, and show their results. But we're talking about probably less than 200 instances of vandalism here, so it'll be quite easy (and fun) to lambaste anyone whose methods produce false positives. Comments like this discourage people like me from putting in the time and effort to do this sort of work. Offering constructive criticism is one thing, but looking forward to the fun of lambast[ing] the good faith efforts of others is offensive and not in keeping with the collaborative spirit necessary to run WMF projects. -Robert Rohde
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
Anthony wrote: Umm...you would count the number of instances of vandalism? Is the question how to objectively *define* vandalism? On one hand, we have a perception, as expressed by media (and by CEO Sue Gardner, I believe), that vandalism (especially of biographies of living people, BLP) is an increasing problem. On the other hand, we have the habit of always asking for proofs and measurements: Citation needed! We can try to find out which edits are reverts, assuming that the previous edit was an act of vandalism. That way we can conclude which articles were vandalized and how long it took to revert them. Add to that: How many readers viewed the vandalized version? Vandalism is harmless if nobody watches it. It is mostly harmless if it is obvious and childish (e.g. Barack Obama was born on Mars, he's a space alien). It does harm (and becomes a problem, allegedly an increasing problem) when it is viewed and taken for the truth (e.g. a statement that Barack Obama was not born in the U.S. and thus would not be a legitimate president). Especially, it becomes a very real problem if the living person who is the subject of the biography takes offense and takes legal action against the WMF. Now, that's very easy to measure: How much money did the WMF need to spend, month by month, to resolve such conflicts, including time to explain the process to media? That is money that could be used to buy servers instead. A more efficient BLP policy might leave the WMF more money for servers. Very real. Now, we only need to insert real numbers into this equation. -- Lars Aronsson (l...@aronsson.se) Aronsson Datateknik - http://aronsson.se
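The "which edits are reverts" detection Lars mentions is often done mechanically as an identity revert: an edit that restores the page to a byte-identical earlier revision. This is a common proxy, not necessarily the detector Robert used; the function and the toy history below are illustrative assumptions.

```python
import hashlib

def find_identity_reverts(revisions):
    """revisions: revision texts in chronological order.
    Returns indices of revisions whose text exactly matches some
    earlier revision; the edits in between are candidate vandalism."""
    first_seen = {}
    reverts = []
    for i, text in enumerate(revisions):
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if digest in first_seen:
            reverts.append(i)  # page restored to an earlier state
        else:
            first_seen[digest] = i
    return reverts

history = ["Obama was born in Hawaii.",
           "Obama was born on Mars.",    # vandalized revision
           "Obama was born in Hawaii."]  # identity revert
```

Note the weakness Anthony raises downthread: a bad edit that is later fixed by a fresh rewrite, rather than an exact restoration, is invisible to this detector.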
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Fri, Aug 28, 2009 at 3:44 PM, Lars Aronsson l...@aronsson.se wrote: We can try to find out which edits are reverts, assuming that the previous edit was an act of vandalism. But that's a bad assumption. It gives both false positives and false negatives, and it gives a significant number of each. I gave examples of each above. My samples were tiny, but 38% of reverts were not reverts of vandalism, and 40% of vandalism was not reverted by a means detected by this strategy. And there is no reason to believe that the error is consistent over time, so these numbers are useless when it comes to determining whether or not the problem is increasing. That way we can conclude which articles were vandalized and how long it took to revert them. Your simplistic version of assuming that the previous edit was an act of vandalism makes the conclusion of how long it took to revert pretty obviously flawed, doesn't it? Under your simplistic assumption (which is even worse than the one used by Robert), you're simply measuring the average time between edits. Any acts of vandalism which take more than one edit to find and fix are excluded. Now Robert's methodology wasn't quite that bad. It allowed for reverts separated by one or more other edits. But it had no way to detect an act of vandalism which lasted for hundreds of edits, was discovered by someone reading the text, and was removed without reference to the original edit with an edit summary such as Barack Obama was born in Hawaii. And these acts of vandalism are the worst. They last the longest, they do the most harm when they are read, they get the most views, etc. Any methodology which excludes them is systemically biased.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
very interesting research - many thanks for sharing that. - Robert Rohde raro...@gmail.com wrote: From: Robert Rohde raro...@gmail.com To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Sent: Thursday, 27 August, 2009 17:41:29 GMT +00:00 GMT Britain, Ireland, Portugal Subject: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data Recently, I reported on a simple study of how likely one was to encounter recent vandalism in Wikipedia, based on selecting articles at random and using revert behavior as a proxy for recent vandalism. http://lists.wikimedia.org/pipermail/foundation-l/2009-August/054171.html One of the key limitations of that work was that it was looking at articles selected at random from the pool of all existing page titles. That approach was of the most immediate interest to me, but it didn't directly address the likelihood of encountering vandalism based on the way that Wikipedia is actually used, because the selection of articles that people choose to visit is highly non-random. I've now redone that analysis with a crude traffic-based weighting. For traffic information I used the same data stream used by http://stats.grok.se. That data is recorded hourly. For simplicity I chose 20 hours at random from the last eight months and averaged those together to get a rough picture of the relative prominence of pages. I then chose a selection of 3 articles at random with their probability of selection proportional to the traffic they received, and repeated the analysis previously described. (Note that this has the effect of treating the prominence of each page as a constant over time. In practice we know some pages rise to prominence while others fall, but I am assuming the average pattern is still a good enough approximation to be useful.) From this sample I found 5,955,236 revert events in 38,096,653 edits.
This is an increase of 29 times in edit frequency and 58 times the number of revert events that were found from a uniform sampling of pages. I suspect it surprises no one that highly trafficked pages are edited more often and subject to more vandalism than the average page, though it might not have been obvious that the ratio of reverts to normal edits is also increased over more obscure pages. As before, the revert time distribution has a very long tail, though as predicted the times are generally reduced when traffic weighting is applied. In the traffic weighted sample, the median time to revert is 3.4 minutes and the mean time is 2.2 hours (compared to 6.7 minutes and 18.2 hours with uniform weighting). Again, I think it is worth acknowledging that having a majority of reverts occur within only a few minutes is a strong testament to the efficiency and dedication with which new edits are usually reviewed by the community. We could be much worse off if most things weren't caught so quickly. Unfortunately, in comparing the current analysis to the previous one, the faster response time is essentially being overwhelmed by the much larger number of vandalism occurrences. The net result is that, averaged over the whole history of Wikipedia, a visitor would be expected to receive a recently degraded article version during about 1.1% of requests (compared to ~0.37% in the uniform weighting estimate). The last six months averaged a slightly higher 1.3% (1 in 80 requests). As before, most of the degraded content that people are likely to actually encounter is coming from the subset of things that get by the initial monitors and survive for a long time. Among edits that are eventually reverted, the longest lasting 5% of bad content (those edits taking more than 7.2 hours to revert) is responsible for 78% of the expected encounters with recently degraded material.
One might speculate that such long-lived material is more likely to reflect subtle damage to a page rather than more obvious problems like page blanking. I did not try to investigate this. In my sample, the number of reverts being made to articles has declined ~40% since a peak in late 2006. However, the mean and median time to revert is little changed over the last two years. What little trend exists points in the direction of slightly slower responses. So to summarize, the results here are qualitatively similar to those found in the previous work. However with traffic weighting we find quantitative differences such that reverts occur much more often but take less time to be executed. The net effect of these competing factors is such that the bad content is more likely to be seen than suggested by the uniform weighting. -Robert Rohde
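The way an encounter rate like the 1.1% figure combines revert durations with traffic can be sketched as below. Every reverted edit exposes readers for as long as the bad version was live, so its contribution is duration times traffic rate. All numbers here are invented for illustration, not Robert's data.

```python
def encounter_rate(bad_minutes, views_per_minute, total_views):
    """bad_minutes: durations (in minutes) that later-reverted
    versions were live on a page; views_per_minute: the page's
    traffic rate; total_views: all views of the page in the period."""
    bad_views = sum(d * views_per_minute for d in bad_minutes)
    return bad_views / total_views

# A few quick reverts plus one long-lived one: the slow revert
# dominates exposure, matching the observation that the slowest
# 5% of reverts account for most encounters with bad content.
rate = encounter_rate([3.4, 5.0, 430.0], views_per_minute=2.0,
                      total_views=100_000)
```

With these invented inputs the long-lived edit contributes 860 of the 876.8 bad views, i.e. about 98% of the exposure.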
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
I've just read two different news stories on Flagged Revisions that described vandalism as a growing problem for Wikipedia. With that in mind, I would like to highlight one specific point in the analysis I just did. The frequency of reverts to articles -- as a fraction of total edits -- has remained virtually constant for almost three years now. There is no evidence that the community is making reverts more often today (relative to total edits) than we were in 2007. Hence, I would suggest that describing vandalism as a growing problem is probably erroneous with respect to actual editing behaviors. Maybe our concern for ensuring accuracy and addressing vandalism has grown, but the scale of the underlying problem of incoming vandalism appears to be more or less constant. -Robert Rohde
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
1:00 edit
1:02 revert
1:06 revert
1:14 revert
1:30 revert
2:02 revert

How many instances of vandalism does your program count there, and what is the mean and median time to revert?
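The ambiguity this timeline probes can be made concrete. The two readings below (each revert undoes the original edit, versus each revert undoes the immediately preceding edit) give different revert-time statistics from the same history; which one a tool reports depends on its pairing rule.

```python
from statistics import mean, median

# The timeline above, as minutes after the 1:00 edit.

# Reading 1: every revert undoes the original 1:00 edit (a revert
# war over one change), so each time is measured from that edit:
against_original = [2, 6, 14, 30, 62]

# Reading 2: every revert undoes the immediately preceding edit,
# so the times are just the gaps between consecutive edits:
against_previous = [2, 4, 8, 16, 32]

stats = {
    "original": (len(against_original), mean(against_original),
                 median(against_original)),
    "previous": (len(against_previous), mean(against_previous),
                 median(against_previous)),
}
```

Under reading 1 the mean time to revert is 22.8 minutes; under reading 2 it is 12.4 minutes, and the count of "instances of vandalism" (one versus five) differs as well.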
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 2:40 PM, Robert Rohde raro...@gmail.com wrote: snip Maybe our concern for ensuring accuracy and addressing vandalism has grown, but the scale of the underlying problem of incoming vandalism appears to be more or less constant. Why do you assume that number of reverts has any correlation with amount of vandalism? Has this been studied?
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
2009/8/27 Anthony wikim...@inbox.org: Why do you assume that number of reverts has any correlation with amount of vandalism? Has this been studied? It seems to be a sensible assumption, although checking it would be wise. I would put money on a significant majority of reverts being reverts of vandalism rather than BRD reverts, though it may not be an overwhelming majority.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 2:50 PM, Thomas Dalton thomas.dal...@gmail.com wrote: 2009/8/27 Anthony wikim...@inbox.org: Why do you assume that number of reverts has any correlation with amount of vandalism? Has this been studied? It seems to be a sensible assumption, although checking it would be wise. It seems to me to be begging the question. You don't answer the question how bad is vandalism by assuming that vandalism is generally reverted. I would put money on a significant majority of reverts being reverts of vandalism rather than BRD reverts, it may not be an overwhelming majority, though. I don't know about that, though I won't take the other end of the bet. Have you done much editing while not logged in? If so, I think you have to admit that it's quite common to find yourself reverted for things which are not properly classified as vandalism. However, that's only one half of the equation. The other half is how many instances of vandalism are not reverted, and how many are not reverted in a way that is detected by this program.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 2:58 PM, Anthony wikim...@inbox.org wrote: On Thu, Aug 27, 2009 at 2:50 PM, Thomas Dalton thomas.dal...@gmail.com wrote: I would put money on a significant majority of reverts being reverts of vandalism rather than BRD reverts, it may not be an overwhelming majority, though. I don't know about that, though I won't take the other end of the bet. Have you done much editing while not logged in? If so, I think you have to admit that it's quite common to find yourself reverted for things which are not properly classified as vandalism. Just going through recent changes looking for rv (which is not the only thing detected by Robert's software, and is probably the most likely to be actual vandalism)...
http://en.wikipedia.org/w/index.php?title=Smallpox&curid=16829895&diff=310413006&oldid=310405829 (content dispute)
http://en.wikipedia.org/w/index.php?title=View_Askewniverse&curid=2163851&diff=310412615&oldid=310412247 (blanking vandalism)
http://en.wikipedia.org/w/index.php?title=Barbecue&curid=37135&diff=310412401&oldid=310410035 (spelling dispute)
http://en.wikipedia.org/w/index.php?title=Sino-American_relations&curid=277880&diff=310412381&oldid=310329859 (revert of POV edits, I guess that counts as vandalism by Robert's definition)
http://en.wikipedia.org/w/index.php?title=Secession&curid=144732&diff=310412005&oldid=310406662 (I have no idea, I guess this one qualifies)
http://en.wikipedia.org/w/index.php?title=The_Underdog_Project&curid=1436277&diff=310412002&oldid=308833810 (test edit, qualifies)
http://en.wikipedia.org/w/index.php?title=Visual_communication&curid=669120&diff=310411952&oldid=310411398 (I'm going to call this a content dispute though you may disagree)
http://en.wikipedia.org/w/index.php?title=Technical_communication&curid=1219401&diff=310411937&oldid=310410621 (ditto)
http://en.wikipedia.org/w/index.php?title=Caroline_Aherne&curid=514223&diff=310411860&oldid=310328710 (removal of POV, qualifies)
http://en.wikipedia.org/w/index.php?title=Mario_Kart_Wii&curid=12205924&diff=310411680&oldid=310401913 (vandalism, I think)
http://en.wikipedia.org/w/index.php?title=Hephaestus&curid=14388&diff=310411384&oldid=310396007 (vandalism)
http://en.wikipedia.org/w/index.php?title=List_of_pop_punk_bands&curid=4770362&diff=310410857&oldid=310410740 (looks like a content dispute)
http://en.wikipedia.org/w/index.php?title=Korn's_ninth_studio_album&curid=21855821&diff=310410677&oldid=310381982 (content dispute)
http://en.wikipedia.org/w/index.php?title=Kinetic_energy&curid=17327&diff=310410573&oldid=310391734 (vandalism)
http://en.wikipedia.org/w/index.php?title=List_of_best-selling_Wii_video_games&curid=21469202&diff=310410431&oldid=310395902 (seems to be reversion of a legitimate edit)
http://en.wikipedia.org/w/index.php?title=Teleological_argument&curid=30731&diff=310410174&oldid=310399980 (content dispute)
http://en.wikipedia.org/w/index.php?title=Nick_Swardson&curid=3630190&diff=310410089&oldid=310410013 (vandalism)
http://en.wikipedia.org/w/index.php?title=Jose_Canseco&curid=175552&diff=310409931&oldid=310408069 (vandalism, I guess)
http://en.wikipedia.org/w/index.php?title=Ola_Moum&curid=8083232&diff=310409846&oldid=310396138 (content dispute)
http://en.wikipedia.org/w/index.php?title=Kareli,_Georgia&curid=18661674&diff=310409393&oldid=310348062 (vandalism, I think)
http://en.wikipedia.org/w/index.php?title=Victoria_Justice&curid=2662543&diff=310412751&oldid=310411603 (I guess it's technically a BLP violation, so qualifies)
13/21=62% actual vandalism, though I'm sure 80 people will now proceed to dispute my categorizations. Robert, let's get a random sample of the actual reverts your program found...
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 3:33 PM, Anthony wikim...@inbox.org wrote: snip 13/21=62% actual vandalism, though I'm sure 80 people will now proceed to dispute my categorizations. Robert, let's get a random sample of the actual reverts your program found... /rvv?|revert(ing)?[ ]*(vandal(ism)?)?/ Might give you a slightly wider sample. -Chad
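Chad's pattern can be tried directly against sample edit summaries. Note that as written it is unanchored and all of its suffix groups are optional, so it matches any summary containing "rv" or "revert" at all, which is exactly the BRD-revert false-positive problem Anthony raised. The tightened variant below is my own assumption, not from the thread.

```python
import re

# Chad's pattern as given:
loose = re.compile(r"rvv?|revert(ing)?[ ]*(vandal(ism)?)?", re.IGNORECASE)

# A hypothetical stricter variant that insists on a vandalism mention
# (or the rvv shorthand), trading recall for precision:
tighter = re.compile(r"\brvv\b|\brv\b.*vandal|revert\w*\s+vandal",
                     re.IGNORECASE)

summaries = [
    "rvv",
    "rv vandalism by 127.0.0.1",
    "Reverted edits by X to last version by Y",  # rollback, cause unknown
    "copyedit intro",
]
loose_hits = [s for s in summaries if loose.search(s)]
tight_hits = [s for s in summaries if tighter.search(s)]
```

The loose pattern flags the plain rollback summary too; the tighter one drops it, along with any genuine vandalism reverts whose summaries never say so.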
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
Anthony wrote: snip Just going through recent changes looking for rv (which is not the only thing detected by Robert's software, and is probably the most likely to be actual vandalism)... Most vandalism reversion on enwiki (I believe) is done with automated tools and/or rollback rather than manual reversion. They typically leave more detailed summaries: Reverted N edits by X identified as vandalism to last revision by Y Reverted edits by X (talk) to last version by Y -- Alex (wikipedia:en:User:Mr.Z-man)
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 3:45 PM, Chad innocentkil...@gmail.com wrote: /rvv?|revert(ing)?[ ]*(vandal(ism)?)?/ Might give you a slightly wider sample. I'll wait for Robert to release a random sample of edits he actually identified as reverts and/or the actual scripts and data dump he used.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote: It seems to me to be begging the question. You don't answer the question how bad is vandalism by assuming that vandalism is generally reverted. Can you suggest a better metric then? -- Stephen Bain stephen.b...@gmail.com
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com wrote: On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote: It seems to me to be begging the question. You don't answer the question how bad is vandalism by assuming that vandalism is generally reverted. Can you suggest a better metric then? I must admit I don't understand the question.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
2009/8/28 Anthony wikim...@inbox.org: On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com wrote: On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote: It seems to me to be begging the question. You don't answer the question how bad is vandalism by assuming that vandalism is generally reverted. Can you suggest a better metric then? I must admit I don't understand the question. He means what would you measure in order to draw conclusions about the severity of vandalism.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 8:24 PM, Thomas Dalton thomas.dal...@gmail.com wrote: 2009/8/28 Anthony wikim...@inbox.org: On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com wrote: On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote: It seems to me to be begging the question. You don't answer the question how bad is vandalism by assuming that vandalism is generally reverted. Can you suggest a better metric then? I must admit I don't understand the question. He means what would you measure in order to draw conclusions about the severity of vandalism. Umm...you would count the number of instances of vandalism? Is the question how to objectively *define* vandalism? If not, I still don't understand the question.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 8:24 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
> He means what would you measure in order to draw conclusions about the severity of vandalism.

The obvious methodology would be to take a large random sample and hand-classify it. It's not rocket science. By having multiple people perform the classification you could measure the confidence of the classification.

This is somewhat labor-intensive, but only somewhat, as it doesn't take an inordinate number of samples to produce representative results. This should be the gold standard for this kind of measurement, as it would be much closer to what people actually want to know than most machine metrics.

If the results of this kind of study have good agreement with mechanical proxy metrics (such as machine-detected vandalism), our confidence in those proxies will increase; if they disagree, it will provide an opportunity to improve the proxies. These are techniques widely used in other fields.
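[Editor's note: the multi-rater confidence check Gregory describes is usually summarized with an inter-rater agreement statistic such as Cohen's kappa. A minimal sketch for two raters follows; the labels are invented illustration data, not results from any actual sample.]

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Fraction of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / n**2
    return (observed - expected) / (1 - expected)

# Two volunteers classify the same ten sampled revisions (made-up labels).
rater1 = ["vandalism", "ok", "ok", "ok", "vandalism", "ok", "ok", "ok", "ok", "ok"]
rater2 = ["vandalism", "ok", "ok", "vandalism", "vandalism", "ok", "ok", "ok", "ok", "ok"]
print(round(cohens_kappa(rater1, rater2), 3))
```

A kappa near 1 means the classification is reliable; a low kappa would suggest the "objective definition of vandalism" needs tightening before scaling the study up.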
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
2009/8/28 Anthony wikim...@inbox.org:
>> He means what would you measure in order to draw conclusions about the severity of vandalism.
> Umm... you would count the number of instances of vandalism?

That's not practical. That would require a person to go through article histories revision by revision, probably multiple people per article to check they agreed on what was vandalism. It won't work for the kind of sample sizes required unless you get an army of volunteers. We need something that we expect to strongly correlate with the number of instances of vandalism but is easier to measure - that is what counting revisions with "revert"/"rvv"/etc. in the edit summary was intended to be.
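[Editor's note: the edit-summary proxy Thomas describes can be sketched as a simple pattern match over revision comments. The patterns below are illustrative guesses; the matching rules the original study actually used are not specified in this thread.]

```python
import re

# Illustrative revert markers only -- not the study's actual rule set.
REVERT_RE = re.compile(r"\b(revert(ed|ing)?|rvv?|undid|undo|rollback)\b",
                       re.IGNORECASE)

def looks_like_revert(edit_summary):
    """Flag an edit summary that suggests a revert (a proxy for, not proof of, vandalism)."""
    return bool(REVERT_RE.search(edit_summary))

summaries = [
    "Reverted edits by 192.0.2.1 to last version by Example",
    "rvv",
    "copyedit lead section",
    "Undid revision 123456 by Example",
]
print([looks_like_revert(s) for s in summaries])
```

Note the two failure modes debated later in the thread: summaries like "yes it is funny, but this doesn't belong here" are missed entirely, while non-vandalism reverts (edit wars, self-reverts) are counted.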
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
2009/8/28 Gregory Maxwell gmaxw...@gmail.com:
> This is somewhat labor intensive, but only somewhat as it doesn't take an inordinate number of samples to produce representative results. This should be the gold standard for this kind of measurement as it would be much closer to what people actually want to know than most machine metrics.

To get a fair sample we would need to include some highly active pages. They have ridiculous numbers of revisions (even if you restrict it to the last few months).

> If the results of this kind of study have good agreement with mechanical proxy metrics (such as machine detected vandalism) our confidence in those proxies will increase, if they disagree it will provide an opportunity to improve the proxies.

This kind of intensive study on a few small samples, with a more automated method used on the same samples to compare, would be more achievable. If the automated method gets similar results, we can use that method for larger samples.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 8:36 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
> 2009/8/28 Anthony wikim...@inbox.org:
>> Umm... you would count the number of instances of vandalism?
> That's not practical.

I never said it was practical; I just said that counting revisions and calling that counting vandalism is incorrect.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
2009/8/28 Anthony wikim...@inbox.org:
> I never said it was practical, I just said that counting revisions and calling that counting vandalism is incorrect.

And you were asked to suggest a better approach. Nobody claimed it was perfect.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 8:41 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
> And you were asked to suggest a better approach. Nobody claimed it was perfect.

I suggested a better approach the last time we had this thread: statistical sampling.

And I'm saying much more than that this method is imperfect. I'm saying it's fundamentally flawed when it comes to measuring vandalism. It measures something much different from vandalism. When it comes to answering the question of how likely one is to encounter vandalism, I am no more informed after reading this thread than before. It could be 0.5% and I wouldn't be surprised. It could be 3% and I wouldn't be surprised.

The methods used in this study both undercount and overcount vandalism, possibly quite significantly. Not all reverts are reverts of vandalism; I wouldn't be surprised if only 50% of them are. And not all vandalism is reverted - as "revert" is defined by this method, I wouldn't be surprised if 75% of vandalism is not detected. This study doesn't measure vandalism.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
2009/8/28 Anthony wikim...@inbox.org:
> I suggested a better approach last time we had this thread: statistical sampling.

This research was based on a sample. What are you talking about?
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
Just took a quick sample of 10 instances of vandalism to [[Ted Stevens]]. Of those 10 instances of vandalism, either 2 or 4 would not have been found by the automated tool described: 2 if every edit summary containing the word "vandalism" is counted as vandalism, and 4 if not. The former would probably significantly overcount vandalism.

http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=173527553&oldid=173381871 (Removed vandalism)
http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=180054904&oldid=179982198 (rmv vandalism)
http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=168486242&oldid=168438600 (no edit summary)
http://en.wikipedia.org/w/index.php?title=Ted_Stevens&diff=162332870&oldid=162038733 (yes it is funny, but this doesn't belong here)

On Thu, Aug 27, 2009 at 9:31 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
> 2009/8/28 Anthony wikim...@inbox.org:
>> I suggested a better approach last time we had this thread: statistical sampling.
> This research was based on a sample. What are you talking about?

I'm talking about taking a sample and examining it manually. First, spend a few weeks coming up with an objective definition of vandalism. Then pick 5,000 random article views from the http log, and publish the URL/date/time. Then advertise the list all over the place (especially on sites like Wikipedia Review), asking people to find instances of vandalism in it. People can use automated means which they then go through by hand to remove false positives, manual error checking, spot checking, whatever. The number of confirmed instances of vandalism will grow for a while, and eventually will start to level off. It may not be perfect, but it'll provide a lower bound on the amount of vandalism, at least. Have a statistician tell us what our exact error bounds are. And then prepare for a second study, improving on everything (the definition of vandalism, the number of random article views, the amount of time to wait) based on what we learned.
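[Editor's note: the error bounds Anthony asks a statistician for are straightforward for a proportion. With k confirmed vandalized views out of n sampled, the Wilson score interval behaves better than the normal approximation when k is small. A sketch, with the counts invented purely for illustration:]

```python
import math

def wilson_interval(k, n, z=1.96):
    """95% Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

# E.g. 40 confirmed vandalized views in a 5,000-view sample (invented numbers).
lo, hi = wilson_interval(40, 5000)
print(f"{lo:.4f} to {hi:.4f}")
```

Since the proposed process only finds confirmed vandalism, the lower end of the interval is the defensible figure: the true rate is at least that, as Anthony's "lower bound" framing suggests.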
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 9:47 PM, Anthony wikim...@inbox.org wrote:
> Just took a quick sample of 10 instances of vandalism to [[Ted Stevens]]. Of those 10 instances of vandalism, either 2 or 4 would not have been found by the automated tool described. [snip]
> And then prepare for a second study, improving on everything (the definition of vandalism, the number of random article views, the amount of time to wait) based on what we learned.

Out of curiosity, Anthony, do you still refrain from editing Wikimedia projects over licensing issues? How long has it been, a year?

Nathan
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On Thu, Aug 27, 2009 at 10:07 PM, Nathan nawr...@gmail.com wrote:
> Out of curiosity, Anthony, do you still refrain from editing Wikimedia projects over licensing issues? How long has it been, a year?

I guess now is as good a time as any to admit it. I started editing again, without logging in, about a month ago. I've made a couple dozen or so edits since. Prior to that, my last edit was October 20, 2008.
Re: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
On 8/27/09 9:39 PM, Thomas Dalton wrote:
> 2009/8/28 Gregory Maxwell gmaxw...@gmail.com:
>> If the results of this kind of study have good agreement with mechanical proxy metrics (such as machine detected vandalism) our confidence in those proxies will increase, if they disagree it will provide an opportunity to improve the proxies.
> This kind of intensive study on a few small samples, with a more automated method used on the same samples to compare, would be more achievable. If the automated method gets similar results, we can use that method for larger samples.

I would certainly be interested in seeing such a result. Generally speaking, we can expect a strong correlation between vandalism and machine-identifiable reverts -- it's a totally reasonable assumption for a first-order approximation -- and it would be valuable to confirm this and see how much divergence there might be between this count and other markers.

Most interesting following this would be to take into account the effects of flagged revisions and how this could affect initially-displayed vs. edited revisions. Has there been similar work targeting the German-language Wikipedia already?

Robert, is it possible to share the source for generating the revert-based stats with other folks who may be interested in pursuing further work on the subject? Thanks!

-- brion vibber (brion @ wikimedia.org) CTO / Senior Software Architect, Wikimedia Foundation