On Mar 18, 2009, at 20:00, Andrew Garrett and...@epstone.net wrote:
To help a bit more with performance, I've also added a profiler within
the interface itself. Hopefully this will encourage self-policing with
regard to filter performance.
Awesome!
Maybe we could use that for templates too
On 3/19/09 5:15 AM, Tei wrote:
Since there's already a database, this sounds like it could be done by
flagging edits as vandalism and then reading the existing database
information to extract these details, like the IP, a diff of the change,
etc. That way, humans define what is vandalism, and the
On Wed, Mar 18, 2009 at 8:00 PM, Andrew Garrett and...@epstone.net wrote:
snip
To help a bit more with performance, I've also added a profiler within
the interface itself. Hopefully this will encourage self-policing with
regard to filter performance.
Based on personal observations, the
Cobi (owner of ClueBot) and his roommate Crispy have already been
working hard to make this specific dataset, but they've been held back
by too few contributors. The page is here:
http://en.wikipedia.org/wiki/User:Crispy1989#New_Dataset_Contribution_Interface
X!
On Mar 19, 2009, at 8:15 AM
I presented a talk at Wikimania 2007 that espoused the virtues of
combining human measures of content with automatically determined
measures in order to generalize to unseen instances. Unfortunately all
those Wikimania talks seem to have been lost. It was related to this
article on predicting the
Brian wrote:
Delerium, you do make it sound as if merely having the tagged dataset
solves the entire problem. But there are really multiple problems. One
is learning to classify what you have been told is in the dataset
(e.g., that all instances of this rule in the edit history *really
are*
Brian wrote:
I just wanted to be really clear about what I mean as a specific
counter-example to this just being an example of reconstructing that
rule set. Suppose you use the AbuseFilter rules on the entire history
of the wiki in order to generate a dataset of positive and negative
examples
Ultimately we need a system that integrates information from multiple
sources, such as WikiTrust, AbuseFilter and the Wikipedia Editorial
Team.
A general point - there is a *lot* of information contained in edits
that AbuseFilter cannot practically characterize due to the complexity
of language
On Thu, Mar 19, 2009 at 5:26 PM, Brian brian.min...@colorado.edu wrote:
A general point - there is a *lot* of information contained in edits
that AbuseFilter cannot practically characterize due to the complexity
of language and the subtlety of certain types of abuse. A system with
access to
2009/3/19 Aryeh Gregor simetrical+wikil...@gmail.com:
On Thu, Mar 19, 2009 at 5:26 PM, Brian brian.min...@colorado.edu wrote:
A general point - there is a *lot* of information contained in edits
that AbuseFilter cannot practically characterize due to the complexity
of language and the
Andrew Garrett wrote:
On Thu, Mar 19, 2009 at 11:54 AM, Platonides platoni...@gmail.com wrote:
PS: Why isn't there a link to Special:AbuseFilter/history/$id on the
filter view?
There is.
Oops. I was looking for it on the top bar, not at the bottom. I stand
corrected.
I am pleased to announce that the Abuse Filter [1] has been activated
on English Wikipedia!
The Abuse Filter is an extension to the MediaWiki [2] software that
powers Wikipedia, allowing automatic filters or rules to be run
against every edit, and to take actions if any of those rules are
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated
on English Wikipedia!
I've temporarily disabled it as we're seeing some performance problems
saving edits at peak time today. Need to make sure there's functional
per-filter
On Wed, Mar 18, 2009 at 12:43 PM, Brion Vibber br...@wikimedia.org wrote:
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated
on English Wikipedia!
I've temporarily disabled it as we're seeing some performance problems
saving edits
Brion Vibber wrote:
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated
on English Wikipedia!
I've temporarily disabled it as we're seeing some performance problems
saving edits at peak time today. Need to make sure there's
On Wed, Mar 18, 2009 at 12:59 PM, Tim Starling tstarl...@wikimedia.org wrote:
Brion Vibber wrote:
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated
on English Wikipedia!
I've temporarily disabled it as we're seeing some
On 3/18/09 12:59 PM, Tim Starling wrote:
Brion Vibber wrote:
On 3/18/09 5:34 AM, Andrew Garrett wrote:
I am pleased to announce that the Abuse Filter [1] has been activated
on English Wikipedia!
I've temporarily disabled it as we're seeing some performance problems
saving edits at peak time
Robert Rohde wrote:
For Andrew or anyone else that knows, can we assume that the filter is
smart enough that if the first part of an AND clause fails then the
other parts don't run (or similarly if the first part of an OR
succeeds)? If so, we can probably optimize rules by doing easy checks
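To illustrate the optimization being asked about, here is a minimal Python sketch of short-circuit evaluation. It relies on Python's own `and` operator, not on AbuseFilter's rule language; whether AbuseFilter behaves the same way is exactly the open question above. The function names, the `user_editcount` and `added_text` keys, and the regex are hypothetical and chosen only for the example.

```python
import re

# Records which checks actually ran, so the short-circuit is observable.
calls = []

def cheap_check(edit):
    """Inexpensive test: is the editor a newcomer? (hypothetical criterion)"""
    calls.append("cheap")
    return edit["user_editcount"] < 10

def expensive_check(edit):
    """Costly test: run a regex over the added text. (hypothetical pattern)"""
    calls.append("expensive")
    return re.search(r"(?i)\bbuy\s+cheap\b", edit["added_text"]) is not None

def filter_matches(edit):
    # Python's `and` stops at the first false operand, so the regex
    # only runs when the cheap check has already passed.
    return cheap_check(edit) and expensive_check(edit)

veteran = {"user_editcount": 5000, "added_text": "Buy cheap watches"}
newcomer = {"user_editcount": 2, "added_text": "Buy cheap watches"}
```

Calling `filter_matches(veteran)` appends only `"cheap"` to `calls`, because the first operand is false and the regex is never evaluated. If a rule engine short-circuits in the same way, putting the cheapest and most selective condition first saves the cost of the later ones on most edits.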
AG frown on page-blanking
For now I just stop them on my wikis with
$wgSpamRegex=array('/^\B$/');
I haven't tried fancier solutions yet.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
This extension is very important for training machine learning
vandalism detection bots. Recently published systems use only hundreds
of examples of vandalism in training - not nearly enough to
distinguish between the variety found in Wikipedia or generalize to
new, unseen forms of vandalism. A
However, that simply disallows them all. On enwiki, the blanking
filter warns the user, and lets them go through with it after
confirmation.
X!
On Mar 18, 2009, at 4:51 PM, jida...@jidanni.org wrote:
AG frown on page-blanking
For now I just stop them on my wikis with
Tim Starling wrote:
Robert Rohde wrote:
For Andrew or anyone else that knows, can we assume that the filter is
smart enough that if the first part of an AND clause fails then the
other parts don't run (or similarly if the first part of an OR
succeeds)? If so, we can probably optimize rules
On Wed, Mar 18, 2009 at 8:00 PM, Andrew Garrett and...@epstone.net wrote:
snip
I've disabled a filter or two which were taking well in excess of
150ms to run, and seemed to be targeted at specific vandals, without
any hits. The culprit seemed to be running about 20 regexes to
determine if an
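As a rough illustration of the per-filter timing discussed here, the following Python sketch measures how long a set of regexes takes per run and flags anything above a threshold. It is not the AbuseFilter profiler; `profile_filters`, `slow_filters`, and the filter names are hypothetical, and the 150 ms default simply reuses the figure quoted above.

```python
import re
import time

def profile_filters(filters, edit_text, runs=50):
    """Return {filter name: average milliseconds per run}.

    `filters` maps a hypothetical filter name to a list of regex patterns;
    a filter is treated as matching if any of its patterns matches.
    """
    timings = {}
    for name, patterns in filters.items():
        compiled = [re.compile(p) for p in patterns]
        start = time.perf_counter()
        for _ in range(runs):
            # any() stops at the first pattern that matches.
            any(rx.search(edit_text) for rx in compiled)
        elapsed = time.perf_counter() - start
        timings[name] = elapsed * 1000 / runs
    return timings

def slow_filters(timings, threshold_ms=150.0):
    """Names of filters whose average run time exceeds the threshold."""
    return sorted(name for name, ms in timings.items() if ms > threshold_ms)
```

For example, `slow_filters({"a": 200.0, "b": 10.0})` returns `["a"]`. Real timings depend on the machine and on the text being checked, so the threshold is best treated as a tunable assumption.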