Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Andrew Gray
On 11 February 2010 17:17, Carcharoth  wrote:
> On Thu, Feb 11, 2010 at 5:05 PM, Andrew Gray  
> wrote:
>
>> b) Use reversions. Sample a thousand uses of rollback from the recent
>> changes list, find time between that edit and the one it was
>> reverting.
>
> That one sounds easier. If only people wouldn't use rollback 
> inappropriately...

Mmm. You'd want a second study to get an estimation of how much rollback is:

a) inappropriate - edit-warring;
b) irrelevant ("rollback self" is not unknown...);
c) legitimate but mundane, such as mass-reverting edits to clean up
after a discussion;

and finally d) actually reverting vandalism.

(The same applies to (undo), but the proportion of d) would of course
be vastly lower)

-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk

___
WikiEN-l mailing list
WikiEN-l@lists.wikimedia.org
To unsubscribe from this mailing list, visit:
https://lists.wikimedia.org/mailman/listinfo/wikien-l


Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Carcharoth
On Thu, Feb 11, 2010 at 5:46 PM, Gwern Branwen  wrote:



> it was only when I gave it a last try all the way back to December,
> that I figured it out: 2 entire substantial sections had gotten
> deleted.

Goodness. That reminds me of the problem there used to be with
unclosed ref tags leading to articles truncating on the screen at the
point the closing ref tag was missing. The text was all there, just
not displaying. I think that got fixed when someone tweaked Mediawiki
to jump up and down and produce flashing red warning lights when this
happens.

Something like the removal of an entire section can be picked up by edit
filters, but you still need people to check the filters and decide
which edits are good and which are bad. I had an edit filter set up to
detect the removal of "Category:Living people" from articles, but
stopped following it after a few weeks when I realised that most of
the edits were being reverted for other reasons before I had a chance
to check (some way was needed to *flag* which edits had been dealt
with or not).

Funny that, you know, flagging of edits. I first encountered a form of
that on wikisource. Some form of flagged revisions might happen on
en-Wikipedia some day as well, but it is quite a culture change to get
used to. Hopefully when it happens, people will adapt quickly (ditto
for LiquidThreads).

In fact, that has been my major worry about both Flagged Revisions and
LiquidThreads. Will people get turned off by the new user interfaces
if they don't like them? How do you implement major changes like this
without breaking parts of what currently exists?

Carcharoth



Re: [WikiEN-l] Images that are PD in their country of origin

2010-02-11 Thread Ken Arromdee
On Thu, 11 Feb 2010, SlimVirgin wrote:
> Imagine that one of those victims were here now, part of this discussion.
> Please explain to him why we can't develop image policies that avoid that
> outcome.

If one of the victims was here now, and he took the picture, he could grant
a free license and we could use it.

Anyway, I don't think there's as much of a separation between this and our
fair use image policy as you seem to think (not in this message, in previous
ones).  It's the *very same* attitude that creates a ridiculous fair use
policy which also creates a ridiculous no-proof-it's-PD policy.



Re: [WikiEN-l] Images that are PD in their country of origin

2010-02-11 Thread SlimVirgin
On Thu, Feb 11, 2010 at 04:34, Carcharoth wrote:

>
> I agree that the "educational content" and "free license or in the
> public domain" aspects do often conflict, but both aspects need to be
> borne in mind when debating such cases.

Right, but both aspects aren't borne in mind, that's the problem. The free
eclipses the educational. The educational often has to sneak in the back
door under the guise of fair use, reduced in size and quality, argued over
endlessly and depressingly year after year, just because it doesn't fit one
of our standard U.S.-based free-licence tags.

Please ponder on the grotesque absurdity of a project designed to empower,
collect, develop and disseminate etc having policies in place that threaten
Holocaust images with deletion, images that were taken by victims at
enormous personal risk precisely to tell the world what was happening.

Imagine that one of those victims were here now, part of this discussion.
Please explain to him why we can't develop image policies that avoid that
outcome.

Sarah


Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Gwern Branwen
On Thu, Feb 11, 2010 at 12:21 PM, Thomas Dalton  wrote:
> On 11 February 2010 17:17, Carcharoth  wrote:
>> On Thu, Feb 11, 2010 at 5:05 PM, Andrew Gray  
>> wrote:
>>
>>> b) Use reversions. Sample a thousand uses of rollback from the recent
>>> changes list, find time between that edit and the one it was
>>> reverting.
>>
>> That one sounds easier. If only people wouldn't use rollback 
>> inappropriately...
>
> Looking for rollback edits is a good way to find vandalism that was
> reverted quickly, but as Andrew says it won't find old vandalism on
> articles with subsequent edits, which is essential if the intention is
> to find out how much vandalism takes a long time to be reverted.

And such are very common. In high-vandalism pages, it's easy for
entire sections to just drop out in the back and forth. Bot edits
badly exacerbate the issue because they edit whenever the heck they
feel like it, and increase the noise in diffs.

An example: while looking at a reversion of a few anon edits on
[[Legalism (Chinese philosophy)]], I grew suspicious of the ordering
of sections - it seemed a little off, a little too choppy. I looked at
consolidated diffs back to January, finding nothing in particular, but
it was only when I gave it a last try all the way back to December,
that I figured it out: 2 entire substantial sections had gotten
deleted.

I had to manually copy them back in because of all the bot activity in
the interim: 
https://secure.wikimedia.org/wikipedia/en/w/index.php?title=Legalism_%28Chinese_philosophy%29&action=historysubmit&diff=342300729&oldid=341823153

-- 
gwern



Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Carcharoth
On Thu, Feb 11, 2010 at 5:05 PM, Andrew Gray  wrote:

> Any other ideas?

One more: number of page views while in a vandalised state when that
state is over a longer period (minimum for view stats would be to be
in a vandalised state for a whole calendar day - better to look at
articles that were in a vandalised state for weeks or months).

The article I pointed to was in that state for nearly 3 weeks, and the
vandalism was visible in the first word you try and read (unless
people saw the template notice on top and stopped reading):

http://en.wikipedia.org/w/index.php?title=Military_science&oldid=339309229

From 23 January to 10 February, the page views were about 200 a day,
though the page view counter seems to have been broken for three of
those days.

http://stats.grok.se/en/201002/Military_science

Still, that is 19 days and about 3800 page views where the vandalism
was not detected. Strange.
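
The estimate above can be checked with a short sketch. It only uses the dates and the "about 200 a day" figure quoted in this message; the variable names are illustrative, and it makes no correction for the three days on which the counter seems to have been broken.

```python
from datetime import date

# Dates and daily view figure as quoted in the message.
first_day = date(2010, 1, 23)
last_day = date(2010, 2, 10)
views_per_day = 200  # "about 200 a day", a rough figure

# Count both the first and the last day of the vandalised period.
days_vandalised = (last_day - first_day).days + 1
estimated_views = days_vandalised * views_per_day

print(f"{days_vandalised} days, roughly {estimated_views} page views")
```

Counting both end dates is an assumption here; counting only whole calendar days in between would give a slightly lower total.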

Carcharoth



Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Thomas Dalton
On 11 February 2010 17:17, Carcharoth  wrote:
> On Thu, Feb 11, 2010 at 5:05 PM, Andrew Gray  
> wrote:
>
>> b) Use reversions. Sample a thousand uses of rollback from the recent
>> changes list, find time between that edit and the one it was
>> reverting.
>
> That one sounds easier. If only people wouldn't use rollback 
> inappropriately...

Looking for rollback edits is a good way to find vandalism that was
reverted quickly, but as Andrew says it won't find old vandalism on
articles with subsequent edits, which is essential if the intention is
to find out how much vandalism takes a long time to be reverted.



Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Carcharoth
On Thu, Feb 11, 2010 at 5:05 PM, Andrew Gray  wrote:

> b) Use reversions. Sample a thousand uses of rollback from the recent
> changes list, find time between that edit and the one it was
> reverting.

That one sounds easier. If only people wouldn't use rollback inappropriately...

Carcharoth



Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Andrew Gray
On 11 February 2010 15:48, Carcharoth  wrote:

> So is it as big a problem as it seems? What percentage of vandalism
> doesn't get caught for days or weeks?

Well, here's a ballpark guess...

I've made 35,000 ish edits. I reckon maybe 10 or 20% of those are
routine reversions of other people's contributions: vandalism, etc. Call
it 5,000.

I can think of perhaps a dozen cases where I've found very long-term
or similarly complicated vandalism; assume I've forgotten half of
them, or rolled-back without quite noticing the dates, and that
implies about 0.5% of vandalism edits are something Particularly
Remarkable.

Entirely unscientific, of course, but there you go.
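
For anyone wanting to follow the arithmetic, the ballpark above can be written out as a minimal sketch. All figures are the ones given in this message; the variable names are illustrative only, and the result is no more rigorous than the guess it reproduces.

```python
# Figures as stated in the message; names are illustrative.
total_edits = 35_000

# "10 or 20%" of edits are reverts: work in whole percentages to stay exact.
reverts_low = total_edits * 10 // 100
reverts_high = total_edits * 20 // 100
reverts_assumed = 5_000  # the round figure the message settles on

remembered_cases = 12                    # "perhaps a dozen"
estimated_cases = remembered_cases * 2   # assume half were forgotten

long_term_share = estimated_cases / reverts_assumed

print(f"reverts somewhere between {reverts_low} and {reverts_high}")
print(f"long-term share of reverts: {long_term_share:.2%}")
```

The share comes out just under half a percent, which is where the "about 0.5%" figure comes from.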

The previous study relied on randomly sampling articles. Here's a
couple of other possibilities:

a) Use vandalism *reports*. People tend to write in to the complaints
address when they find vandalism irrespective of how long it's been
there, or how complicated it is; if they were going to revert it, the
odds are they wouldn't write. So, we could take a large sample of
vandalism report emails (all archived in OTRS), identify their
timestamp and the article being written about, find the relevant
vandalism edit, find its reversion.

b) Use reversions. Sample a thousand uses of rollback from the recent
changes list, find time between that edit and the one it was
reverting.

The first of these overestimates vandalism to high-traffic pages, and
so would *probably* lead to overcounting "young" vandalism - if we
make the reasonably safe assumption that high traffic = high editor
interest = many watchlists, etc.

The second of these falls down on undercounting old vandalism. Older
vandalism tends to require conventional editing, rather than the use
of rollback or undo, because there's higher odds the article's been
edited since.
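
As a rough illustration of option b), the measurement could be sketched as below. This assumes each sampled rollback has already been paired with the timestamp of the edit it reverted; the sample data is made up, the function names are hypothetical, and fetching the pairs from the recent changes list is not shown.

```python
from datetime import datetime, timedelta
from statistics import median


def survival_times(pairs):
    """Return how long each reverted edit stayed live before the rollback."""
    return [reverted_at - made_at for made_at, reverted_at in pairs]


def share_longer_than(durations, threshold):
    """Return the fraction of durations that exceed the threshold."""
    if not durations:
        return 0.0
    return sum(d > threshold for d in durations) / len(durations)


# Made-up sample: (time of the reverted edit, time of the rollback).
sample = [
    (datetime(2010, 2, 11, 12, 0), datetime(2010, 2, 11, 12, 2)),
    (datetime(2010, 2, 10, 9, 30), datetime(2010, 2, 10, 10, 15)),
    (datetime(2010, 1, 23, 8, 0), datetime(2010, 2, 10, 8, 0)),
    (datetime(2010, 2, 9, 20, 0), datetime(2010, 2, 11, 20, 0)),
]

durations = survival_times(sample)
median_survival = median(durations)
share_over_a_day = share_longer_than(durations, timedelta(days=1))

print("median survival:", median_survival)
print("share surviving more than a day:", share_over_a_day)
```

Because the sample only contains edits that were rolled back, the undercounting of old vandalism described above still applies to whatever numbers come out.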

Any other ideas?

-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk



Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Ian Woollard
The situation should improve if they *ever* enable flagged versions on
the English wikipedia. At the moment detecting vandalism is a bit
hit-and-miss; flagged versions should enable 100% checking.

That wouldn't completely stop vandalism, but it would greatly reduce
it. This should be true even if we just use the flags as a technique
to mark whether articles have been checked, rather than
determining whether they should be seen.

On 11/02/2010, Carcharoth  wrote:
> On Thu, Feb 11, 2010 at 3:54 PM, Thomas Dalton 
> wrote:
>> On 11 February 2010 15:48, Carcharoth  wrote:
>>> The latest example is here:
>>>
>>> http://en.wikipedia.org/w/index.php?title=Military_science&diff=339309229&oldid=337736730
>>>
>>> [I'm not at the right computer at the moment, so hopefully someone
>>> will fix that]
>>
>> Fixed.
>
> Thanks.
>
>>> So is it as big a problem as it seems? What percentage of vandalism
>>> doesn't get caught for days or weeks?
>>
>> http://en.wikipedia.org/wiki/User:Aetheling/Vandalism_survival
>>
>> That's a pretty good study, albeit with a very small sample size (100
>> articles).
>
> "an estimated 10% of all vandalism endures for months and even years
> indicates that some new tools and strategies are needed for rooting
> out the most subtle and persistent forms of vandalism"
>
> Quite a strong claim there.
>
> The talk page discussion is interesting.
>
> Carcharoth
>


-- 
-Ian Woollard



Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Carcharoth
On Thu, Feb 11, 2010 at 3:54 PM, Thomas Dalton  wrote:
> On 11 February 2010 15:48, Carcharoth  wrote:
>> The latest example is here:
>>
>> http://en.wikipedia.org/w/index.php?title=Military_science&diff=339309229&oldid=337736730
>>
>> [I'm not at the right computer at the moment, so hopefully someone
>> will fix that]
>
> Fixed.

Thanks.

>> So is it as big a problem as it seems? What percentage of vandalism
>> doesn't get caught for days or weeks?
>
> http://en.wikipedia.org/wiki/User:Aetheling/Vandalism_survival
>
> That's a pretty good study, albeit with a very small sample size (100 
> articles).

"an estimated 10% of all vandalism endures for months and even years
indicates that some new tools and strategies are needed for rooting
out the most subtle and persistent forms of vandalism"

Quite a strong claim there.

The talk page discussion is interesting.

Carcharoth



Re: [WikiEN-l] Unreverted vandalism

2010-02-11 Thread Thomas Dalton
On 11 February 2010 15:48, Carcharoth  wrote:
> The latest example is here:
>
> http://en.wikipedia.org/w/index.php?title=Military_science&diff=339309229&oldid=337736730
>
> [I'm not at the right computer at the moment, so hopefully someone
> will fix that]

Fixed.

> So is it as big a problem as it seems? What percentage of vandalism
> doesn't get caught for days or weeks?

http://en.wikipedia.org/wiki/User:Aetheling/Vandalism_survival

That's a pretty good study, albeit with a very small sample size (100 articles).



[WikiEN-l] Unreverted vandalism

2010-02-11 Thread Carcharoth
There was a discussion about unreverted vandalism on AN.

http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard#the_problem_of_vandalism

I often see unreverted vandalism that appears not to have been caught.
The latest example is here:

http://en.wikipedia.org/w/index.php?title=Military_science&diff=339309229&oldid=337736730

[I'm not at the right computer at the moment, so hopefully someone
will fix that]

So is it as big a problem as it seems? What percentage of vandalism
doesn't get caught for days or weeks?

Carcharoth



Re: [WikiEN-l] Images that are PD in their country of origin

2010-02-11 Thread Carcharoth
On Mon, Feb 8, 2010 at 5:33 AM, Anthony  wrote:
> On Sun, Feb 7, 2010 at 11:13 PM, Liam Wyatt  wrote:
>
>> I agree that it is annoying to think of commons admins going to all this
>> trouble just for the benefit of unknown people selling t-shirts, but if
>> people *aren't* allowed to sell t-shirts then it's not a free-culture
>> project.
>>
>
> It's not a free culture project.  It's a free "educational content" (1)
> project.
>
> (1) http://wikimediafoundation.org/wiki/Mission_statement

Really, to give the context, you need to quote it in full:

"The mission of the Wikimedia Foundation is to empower and engage
people around the world to collect and develop educational content
under a free license or in the public domain, and to disseminate it
effectively and globally.

In collaboration with a network of chapters, the Foundation provides
the essential infrastructure and an organizational framework for the
support and development of multilingual wiki projects and other
endeavors which serve this mission. The Foundation will make and keep
useful information from its projects available on the Internet free of
charge, in perpetuity."

I agree that the "educational content" and "free license or in the
public domain" aspects do often conflict, but both aspects need to be
borne in mind when debating such cases.

Carcharoth



Re: [WikiEN-l] Images that are PD in their country of origin

2010-02-11 Thread Anthony
On Sun, Feb 7, 2010 at 11:13 PM, Liam Wyatt  wrote:

> I agree that it is annoying to think of commons admins going to all this
> trouble just for the benefit of unknown people selling t-shirts, but if
> people *aren't* allowed to sell t-shirts then it's not a free-culture
> project.
>

It's not a free culture project.  It's a free "educational content" (1)
project.

(1) http://wikimediafoundation.org/wiki/Mission_statement


Re: [WikiEN-l] Images that are PD in their country of origin

2010-02-11 Thread Anthony
On Sun, Feb 7, 2010 at 9:29 AM, SlimVirgin  wrote:

> Can anyone help with an authoritative opinion about this? The doubts about
> it are causing problems on a number of articles, including during featured
> article reviews.
>
> Where an image is in the public domain in its country of origin, and that
> country is not the U.S., I believe we still have to show that it is PD in
> the U.S. before we can use it, because the Foundation's servers are in the
> U.S..


Who is the "we"?  It's not clear to me how the location of the Foundation's
servers would be relevant.