On Fri, Aug 28, 2009 at 12:43 AM, Brion Vibber br...@wikimedia.org wrote:
On 8/27/09 9:39 PM, Thomas Dalton wrote:
2009/8/28 Gregory Maxwell gmaxw...@gmail.com:
If the results of this kind of study have good agreement with
mechanical proxy metrics (such as machine detected vandalism) our
On Fri, Aug 28, 2009 at 10:08 AM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/28 Anthony wikim...@inbox.org:
If you're going to do it, maybe we should work on a rough-consensus
objective definition of vandalism before you release the file,
though...
Don't we have a consensus
On Thu, Aug 27, 2009 at 9:43 PM, Brion Vibber br...@wikimedia.org wrote:
snip
Robert, is it possible to share the source for generating the
revert-based stats with other folks who may be interested in pursuing
further work on the subject? Thanks!
Not as a complete stand-alone entity. The
On Fri, Aug 28, 2009 at 3:55 AM, Anthony wikim...@inbox.org wrote:
snip
Once we have the list, anyone is free to examine it any way they want, and
show their results. But we're talking about probably less than 200
instances of vandalism here, so it'll be quite easy (and fun) to lambaste
Anthony wrote:
Umm...you would count the number of instances of vandalism?
Is the question how to objectively *define* vandalism?
On one hand, we have a perception, as expressed by media (and by
CEO Sue Gardner, I believe), that vandalism (especially of
biographies of living people, BLP)
On Fri, Aug 28, 2009 at 3:44 PM, Lars Aronsson l...@aronsson.se wrote:
We can try to find out which edits are reverts, assuming that the
previous edit was an act of vandalism.
But that's a bad assumption. It gives both false positives and false
negatives, and it gives a significant number of
Recently, I reported on a simple study of how likely one was to
encounter recent vandalism in Wikipedia based on selecting articles at
random and using revert behavior as a proxy for recent vandalism.
http://lists.wikimedia.org/pipermail/foundation-l/2009-August/054171.html
One of the key
Subject: [Foundation-l] Frequency of Seeing Bad Versions - now with traffic data
Recently, I reported on a simple study of how likely one was to
encounter recent vandalism in Wikipedia based on selecting articles at
random and using revert behavior as a proxy for recent vandalism
I've just read two different news stories on Flagged Revisions that
described vandalism as a growing problem for Wikipedia.
With that in mind, I would like to highlight one specific point in the
analysis I just did.
The frequency of reverts to articles -- as a fraction of total edits
-- has
1:00 edit
1:02 revert
1:06 revert
1:14 revert
1:30 revert
2:02 revert
How many instances of vandalism does your program count there, and what is
the mean and median time to revert?
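To make the ambiguity in that question concrete, here is a minimal sketch of two possible ways to count the timeline above. It is not Robert's program, whose source is not shown in this thread; the function names and both counting rules are assumptions made for illustration. The first rule treats every revert as undoing one bad event. The second assumes the 1:00 edit is bad and each revert undoes the event just before it, so the article alternates between bad and good states.

```python
from statistics import mean, median

# Timeline from the message above, as (minutes since 0:00, event kind).
events = [
    (60, "edit"),
    (62, "revert"),
    (66, "revert"),
    (74, "revert"),
    (90, "revert"),
    (122, "revert"),
]

def gaps_before_reverts(events):
    """Minutes between each revert and the event immediately before it.

    Assumed rule 1: every revert marks one instance of vandalism.
    """
    return [
        t - prev_t
        for (prev_t, _), (t, kind) in zip(events, events[1:])
        if kind == "revert"
    ]

def alternating_bad_periods(events):
    """Assumed rule 2: the first edit is bad and each revert undoes the
    previous event, so only every other gap is time spent in a bad state."""
    return gaps_before_reverts(events)[::2]

for name, rule in [("every revert", gaps_before_reverts),
                   ("alternating", alternating_bad_periods)]:
    periods = rule(events)
    print(name, len(periods), "instances,",
          "mean", mean(periods), "median", median(periods))
```

Under these assumptions the first rule counts 5 instances (mean 12.4 minutes, median 8) and the second counts 3 (mean 14 minutes, median 8). The median is the same but the count and the mean are not, which is the point of the question.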
___
foundation-l mailing list
foundation-l@lists.wikimedia.org
On Thu, Aug 27, 2009 at 2:40 PM, Robert Rohde raro...@gmail.com wrote:
I've just read two different news stories on Flagged Revisions that
described vandalism as a growing problem for Wikipedia.
With that in mind, I would like to highlight one specific point in the
analysis I just did.
The
2009/8/27 Anthony wikim...@inbox.org:
Why do you assume that number of reverts has any correlation with amount of
vandalism? Has this been studied?
It seems to be a sensible assumption, although checking it would be
wise. I would put money on a significant majority of reverts being
reverts of
On Thu, Aug 27, 2009 at 2:50 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/27 Anthony wikim...@inbox.org:
Why do you assume that number of reverts has any correlation with amount of
vandalism? Has this been studied?
It seems to be a sensible assumption, although checking it
On Thu, Aug 27, 2009 at 2:58 PM, Anthony wikim...@inbox.org wrote:
On Thu, Aug 27, 2009 at 2:50 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
I would put money on a significant majority of reverts being
reverts of vandalism rather than BRD reverts, it may not be an
overwhelming majority,
On Thu, Aug 27, 2009 at 3:33 PM, Anthony wikim...@inbox.org wrote:
On Thu, Aug 27, 2009 at 2:58 PM, Anthony wikim...@inbox.org wrote:
On Thu, Aug 27, 2009 at 2:50 PM, Thomas Dalton
thomas.dal...@gmail.com wrote:
I would put money on a significant majority of reverts being
reverts of
Anthony wrote:
On Thu, Aug 27, 2009 at 2:58 PM, Anthony wikim...@inbox.org wrote:
On Thu, Aug 27, 2009 at 2:50 PM, Thomas Dalton
thomas.dal...@gmail.com wrote:
I would put money on a significant majority of reverts being
reverts of vandalism rather than BRD reverts, it may not be an
On Thu, Aug 27, 2009 at 3:45 PM, Chad innocentkil...@gmail.com wrote:
/rvv?|revert(ing)?[ ]*(vandal(ism)?)?/
Might give you a slightly wider sample.
I'll wait for Robert to release a random sample of edits he actually
identified as reverts and/or the actual scripts and data dump he used.
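As a rough illustration of what the pattern quoted above would and would not catch, here is a minimal sketch that applies it to edit summaries. The sample summaries are invented, the function name is hypothetical, and the case-insensitive flag is an added assumption; nothing here reflects the scripts actually used for the study.

```python
import re

# The pattern quoted in the thread, compiled as written. IGNORECASE is an
# added assumption so that "Revert" and "RVV" also match.
REVERT_SUMMARY = re.compile(
    r"rvv?|revert(ing)?[ ]*(vandal(ism)?)?", re.IGNORECASE
)

def looks_like_revert(summary: str) -> bool:
    """Return True if the pattern matches anywhere in the edit summary."""
    return REVERT_SUMMARY.search(summary) is not None

# Invented edit summaries, for illustration only.
samples = [
    "rvv",
    "Reverting vandalism by 192.0.2.1",
    "rv unexplained removal",
    "added references",
    "expanded the curve fitting section",
]

for summary in samples:
    print(f"{looks_like_revert(summary)!s:5}  {summary}")
```

Because the pattern is unanchored, the bare "rv" alternative also matches inside ordinary words such as "curve", so the wider sample comes with extra false positives. It also says nothing about vandalism that was fixed without a revert-style summary, which is the objection raised elsewhere in the thread.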
On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:
It seems to me to be begging the question. You don't answer the question
how bad is vandalism by assuming that vandalism is generally reverted.
Can you suggest a better metric then?
--
Stephen Bain
stephen.b...@gmail.com
On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com wrote:
On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:
It seems to me to be begging the question. You don't answer the question
how bad is vandalism by assuming that vandalism is generally reverted.
2009/8/28 Anthony wikim...@inbox.org:
On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com wrote:
On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:
It seems to me to be begging the question. You don't answer the question
how bad is vandalism by assuming
On Thu, Aug 27, 2009 at 8:24 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/28 Anthony wikim...@inbox.org:
On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com
wrote:
On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:
It seems to me to be
On Thu, Aug 27, 2009 at 8:24 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/28 Anthony wikim...@inbox.org:
On Thu, Aug 27, 2009 at 7:58 PM, Stephen Bain stephen.b...@gmail.com wrote:
On Fri, Aug 28, 2009 at 4:58 AM, Anthony wikim...@inbox.org wrote:
It seems to me to be begging the
2009/8/28 Anthony wikim...@inbox.org:
He means what would you measure in order to draw conclusions about the
severity of vandalism.
Umm...you would count the number of instances of vandalism?
That's not practical. That would require a person to go through
article histories revision by
2009/8/28 Gregory Maxwell gmaxw...@gmail.com:
This is somewhat labor intensive, but only somewhat as it doesn't take
an inordinate number of samples to produce representative results.
This should be the gold standard for this kind of measurement as it
would be much closer to what people
On Thu, Aug 27, 2009 at 8:36 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/28 Anthony wikim...@inbox.org:
He means what would you measure in order to draw conclusions about the
severity of vandalism.
Umm...you would count the number of instances of vandalism?
That's not
2009/8/28 Anthony wikim...@inbox.org:
On Thu, Aug 27, 2009 at 8:36 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/28 Anthony wikim...@inbox.org:
He means what would you measure in order to draw conclusions about the
severity of vandalism.
Umm...you would count the number of
On Thu, Aug 27, 2009 at 8:41 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/28 Anthony wikim...@inbox.org:
On Thu, Aug 27, 2009 at 8:36 PM, Thomas Dalton thomas.dal...@gmail.com
wrote:
2009/8/28 Anthony wikim...@inbox.org:
He means what would you measure in order to draw
2009/8/28 Anthony wikim...@inbox.org:
I suggested a better approach last time we had this thread: statistical
sampling.
This research was based on a sample. What are you talking about?
Just took a quick sample of 10 instances of vandalism to [[Ted Stevens]].
Of those 10 instances of vandalism, either 2 or 4 would not have been found
by the automated tool described. 2 if every edit summary containing the
word vandalism is counted as vandalism, and 4 if not. The former would
On Thu, Aug 27, 2009 at 9:47 PM, Anthony wikim...@inbox.org wrote:
Just took a quick sample of 10 instances of vandalism to [[Ted Stevens]].
Of those 10 instances of vandalism, either 2 or 4 would not have been found
by the automated tool described. 2 if every edit summary containing the
On Thu, Aug 27, 2009 at 10:07 PM, Nathan nawr...@gmail.com wrote:
Out of curiosity, Anthony, do you still refrain from editing Wikimedia
projects over licensing issues? How long has it been, a year?
I guess now is as good a time as any to admit it. I started editing again,
without logging
On 8/27/09 9:39 PM, Thomas Dalton wrote:
2009/8/28 Gregory Maxwell gmaxw...@gmail.com:
If the results of this kind of study have good agreement with
mechanical proxy metrics (such as machine detected vandalism) our
confidence in those proxies will increase, if they disagree it will
provide an