Andrew Garrett wrote:
> On Thu, Mar 19, 2009 at 11:54 AM, Platonides wrote:
>> PS: Why isn't there a link to Special:AbuseFilter/history/$id on the
>> filter view?
>
> There is.
Oops. I was looking for it on the top bar, not at the bottom. I stand
corrected.
2009/3/19 Aryeh Gregor :
> On Thu, Mar 19, 2009 at 5:26 PM, Brian wrote:
>> A general point - there is a *lot* of information contained in edits
>> that AbuseFilter cannot practically characterize due to the complexity
>> of language and the subtlety of certain types of abuse. A system with
>> ac
On Thu, Mar 19, 2009 at 5:26 PM, Brian wrote:
> A general point - there is a *lot* of information contained in edits
> that AbuseFilter cannot practically characterize due to the complexity
> of language and the subtlety of certain types of abuse. A system with
> access to natural language feature
Ultimately we need a system that integrates information from multiple
sources, such as WikiTrust, AbuseFilter and the Wikipedia Editorial
Team.
A general point - there is a *lot* of information contained in edits
that AbuseFilter cannot practically characterize due to the complexity
of language an
Brian wrote:
> I just wanted to be really clear about what I mean as a specific
> counter-example to this just being an example of "reconstructing that
> rule set." Suppose you use the AbuseFilter rules on the entire history
> of the wiki in order to generate a dataset of positive and negative
> ex
Brian wrote:
> Delerium, you do make it sound as if merely having the tagged dataset
> solves the entire problem. But there are really multiple problems. One
> is learning to classify what you have been told is in the dataset
> (e.g., that all instances of this rule in the edit history *really
> ar
I just wanted to be really clear about what I mean as a specific
counter-example to this just being an example of "reconstructing that
rule set." Suppose you use the AbuseFilter rules on the entire history
of the wiki in order to generate a dataset of positive and negative
examples of vandalism edi
I presented a talk at Wikimania 2007 that espoused the virtues of
combining human measures of content with automatically determined
measures in order to generalize to unseen instances. Unfortunately all
those Wikimania talks seem to have been lost. It was related to this
article on predicting the q
On 3/19/09 12:21 PM, Alex wrote:
> Yes, in one filter (filter 32) I've been watching, it was taking
> 90-120ms for what seemed like simple checks (action, editcount,
> difference in bytes), so I moved the editcount check last, in case it
> had to pull that from the DB. The time dropped to ~3ms, but
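The effect Alex describes is consistent with short-circuit evaluation: when cheap conditions come first, the costly one only runs for edits that pass them. A minimal sketch of that idea follows. It is not AbuseFilter code; the edit fields, the thresholds, and every function name are hypothetical, and the "database lookup" is simulated by a counter.

```python
# Hypothetical illustration of condition ordering; not AbuseFilter's evaluator.

def cheap_action_check(edit):
    # Cheap: compares a value already in memory.
    return edit["action"] == "edit"

def cheap_size_check(edit):
    # Cheap: byte difference between old and new revision (threshold is made up).
    return edit["new_size"] - edit["old_size"] < -500

def make_editcount_check(lookups):
    # Stand-in for a check that may need a database query.
    def check(edit):
        lookups.append(edit["user"])  # record each simulated lookup
        return edit["editcount"] < 10
    return check

def filter_expensive_first(edit, editcount_check):
    return (editcount_check(edit)
            and cheap_action_check(edit)
            and cheap_size_check(edit))

def filter_expensive_last(edit, editcount_check):
    # `and` stops at the first False, so the lookup is skipped
    # whenever a cheap check already rejects the edit.
    return (cheap_action_check(edit)
            and cheap_size_check(edit)
            and editcount_check(edit))

def run(filter_fn, edits):
    lookups = []
    check = make_editcount_check(lookups)
    results = [filter_fn(edit, check) for edit in edits]
    return results, len(lookups)

EDITS = [
    {"user": "A", "action": "edit", "old_size": 1000, "new_size": 100, "editcount": 3},
    {"user": "B", "action": "move", "old_size": 1000, "new_size": 1000, "editcount": 5000},
    {"user": "C", "action": "edit", "old_size": 1000, "new_size": 1200, "editcount": 2},
]

if __name__ == "__main__":
    print(run(filter_expensive_first, EDITS))
    print(run(filter_expensive_last, EDITS))
```

Both orderings match the same edits; only the number of simulated lookups differs.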
Robert Rohde wrote:
> On Wed, Mar 18, 2009 at 8:00 PM, Andrew Garrett wrote:
>
>> To help a bit more with performance, I've also added a profiler within
>> the interface itself. Hopefully this will encourage self-policing with
>> regard to filter performance.
>
> Based on personal observations,
This has been done before, for instance in the ASCIIMath4Wiki extension [2].
I don't want to change the Content-type unconditionally, though, only some
of the time, so that we can serve texvc-style images to browsers or users
that don't like the modified content type.
Note that this will interf
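One way to change the Content-type "only some of the time" is to combine the extension's request with what the client says it accepts. The sketch below is an assumption-laden illustration in Python, not the patch under discussion and not MediaWiki code: the function names are hypothetical, `serve_as_xhtml` merely stands in for a flag like $wgServeAsXHTML, and the Accept parsing is deliberately simplified.

```python
# Hypothetical sketch of conditional content-type selection.

XHTML = "application/xhtml+xml"
HTML = "text/html"

def accepts_xhtml(accept_header):
    """Return True if the Accept header lists application/xhtml+xml with q > 0."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != XHTML:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q > 0
    return False

def choose_content_type(serve_as_xhtml, accept_header):
    # Only switch when the page asked for XHTML *and* the client accepts it;
    # otherwise fall back to text/html (e.g. with texvc-style images).
    if serve_as_xhtml and accepts_xhtml(accept_header):
        return XHTML
    return HTML

if __name__ == "__main__":
    print(choose_content_type(True, "text/html,application/xhtml+xml,application/xml;q=0.9"))
    print(choose_content_type(True, "text/html, */*"))
```

A per-user preference could be folded into the same decision by passing it in as part of `serve_as_xhtml`.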
Cobi (owner of ClueBot) and his roommate Crispy have already been
working hard to make this specific dataset, but they've been held back
by a lack of contributors. The page is here:
http://en.wikipedia.org/wiki/User:Crispy1989#New_Dataset_Contribution_Interface
X!
On Mar 19, 2009, at 8:15 AM [Ma
On Wed, Mar 18, 2009 at 8:00 PM, Andrew Garrett wrote:
>
> To help a bit more with performance, I've also added a profiler within
> the interface itself. Hopefully this will encourage self-policing with
> regard to filter performance.
Based on personal observations, the self-profiling is quite n
On 3/19/09 5:15 AM, Tei wrote:
> since there's already a database, it sounds like this could be done by flagging
> edits as "vandalism", and then reading the existing database information to
> extract these details, like IP, a diff of the change, etc. That way,
> humans define what is a "vandalism", a
What's to patch? This is already a configuration variable, just set
it...
-- brion vibber (brion @ wikimedia.org)
On Mar 18, 2009, at 21:24, Andrew Garrett wrote:
> 2009/3/19 lee worden :
>> I'm at work on a MW extension that, among other things, uses
>> LaTeXML [1] to
>> make XHTML from
On Mar 18, 2009, at 20:00, Andrew Garrett wrote:
>
>>
> To help a bit more with performance, I've also added a profiler within
> the interface itself. Hopefully this will encourage self-policing with
> regard to filter performance.
Awesome!
Maybe we could use that for templates too ... ;)
-- Br
2009/3/18 lee worden :
> Attached is a patch for the skins directory that allows changing the
> Content-type dynamically. After applying this patch, if any code sets the
> global $wgServeAsXHTML to true, the page will be output with the xhtml+xml
> content type. This seems to work fine with the e
On Thu, Mar 19, 2009 at 1:03 PM, Delirium wrote:
> Brian wrote:
> > This extension is very important for training machine learning
> > vandalism detection bots. Recently published systems use only hundreds
> > of examples of vandalism in training - not nearly enough to
> > distinguish between th
Brian wrote:
> This extension is very important for training machine learning
> vandalism detection bots. Recently published systems use only hundreds
> of examples of vandalism in training - not nearly enough to
> distinguish between the variety found in Wikipedia or generalize to
> new, unseen f
Daniel Kinzler brightbyte.de> writes:
> Sorry, I can't really see it happening.
Basically, you are marking this bug as WONTFIX?
/Micke
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech
On 19.3.2009 4:46, "lee worden" wrote:
> Attached is a patch for the skins directory that allows changing the
> Content-type dynamically. After applying this patch, if any code sets the
> global $wgServeAsXHTML to true, the page will be output with the xhtml+xml
> content type. This seems
Platonides wrote:
>> (it's helpfully provided in the API result . . . actually, what
>> does it mean that "Portal" and "Portal talk" are canonical? shouldn't
>> there be no canonical attribute if the namespace is custom?).
>
> Agree. Portal and Portal talk could still be acceptable, since the
>