Re: [Wikitech-l] Getting the list of Page Titles and Redirects of Wikipedia

2009-03-19 Thread Roan Kattouw
Platonides wrote: (it's helpfully provided in the API result . . . actually, what does it mean that Portal and Portal talk are canonical? shouldn't there be no canonical attribute if the namespace is custom?). Agree. Portal and Portal talk could still be acceptable, since the namespace

Re: [Wikitech-l] Serving as xhtml+xml

2009-03-19 Thread Alexandre Emsenhuber
On 19.3.2009 4:46, "lee worden" won...@riseup.net wrote: Attached is a patch for the skins directory that allows changing the Content-type dynamically. After applying this patch, if any code sets the global $wgServeAsXHTML to true, the page will be output with the xhtml+xml content

Re: [Wikitech-l] Serving as xhtml+xml

2009-03-19 Thread Aryeh Gregor
2009/3/18 lee worden won...@riseup.net: Attached is a patch for the skins directory that allows changing the Content-type dynamically.  After applying this patch, if any code sets the global $wgServeAsXHTML to true, the page will be output with the xhtml+xml content type.  This seems to work

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Brion Vibber
On Mar 18, 2009, at 20:00, Andrew Garrett and...@epstone.net wrote: To help a bit more with performance, I've also added a profiler within the interface itself. Hopefully this will encourage self-policing with regard to filter performance. Awesome! Maybe we could use that for templates too

Re: [Wikitech-l] Serving as xhtml+xml

2009-03-19 Thread Brion Vibber
What's to patch? This is already a configuration variable, just set it... -- brion vibber (brion @ wikimedia.org) On Mar 18, 2009, at 21:24, Andrew Garrett agarr...@wikimedia.org wrote: 2009/3/19 lee worden won...@riseup.net: I'm at work on a MW extension that, among other things, uses

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Brion Vibber
On 3/19/09 5:15 AM, Tei wrote: since there's already a database, this sounds like it could be done by flagging edits as vandalism, and then reading the existing database information to extract these details, like IP, a diff of the change, etc. That way, humans define what is vandalism, and the

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Robert Rohde
On Wed, Mar 18, 2009 at 8:00 PM, Andrew Garrett and...@epstone.net wrote: snip To help a bit more with performance, I've also added a profiler within the interface itself. Hopefully this will encourage self-policing with regard to filter performance. Based on personal observations, the

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Soxred93
Cobi (owner of ClueBot) and his roommate Crispy have already been working hard to make this specific dataset, but they've been hurt by not enough contributors. The page is here: http://en.wikipedia.org/wiki/User:Crispy1989#New_Dataset_Contribution_Interface X! On Mar 19, 2009, at 8:15 AM

Re: [Wikitech-l] Serving as xhtml+xml

2009-03-19 Thread lee worden
This has been done before, for instance in the ASCIIMath4Wiki extension [2].  I don't want to change the Content-type unconditionally, though, only some of the time, so that we can serve texvc-style images to browsers or users that don't like the modified content type. Note that this will

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Brian
I presented a talk at Wikimania 2007 that espoused the virtues of combining human measures of content with automatically determined measures in order to generalize to unseen instances. Unfortunately all those Wikimania talks seem to have been lost. It was related to this article on predicting the

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Delirium
Brian wrote: Delirium, you do make it sound as if merely having the tagged dataset solves the entire problem. But there are really multiple problems. One is learning to classify what you have been told is in the dataset (e.g., that all instances of this rule in the edit history *really are*

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Delirium
Brian wrote: I just wanted to be really clear about what I mean as a specific counter-example to this just being an example of reconstructing that rule set. Suppose you use the AbuseFilter rules on the entire history of the wiki in order to generate a dataset of positive and negative examples

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Brian
Ultimately we need a system that integrates information from multiple sources, such as WikiTrust, AbuseFilter and the Wikipedia Editorial Team. A general point - there is a *lot* of information contained in edits that AbuseFilter cannot practically characterize due to the complexity of language

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Aryeh Gregor
On Thu, Mar 19, 2009 at 5:26 PM, Brian brian.min...@colorado.edu wrote: A general point - there is a *lot* of information contained in edits that AbuseFilter cannot practically characterize due to the complexity of language and the subtlety of certain types of abuse. A system with access to

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread David Gerard
2009/3/19 Aryeh Gregor simetrical+wikil...@gmail.com: On Thu, Mar 19, 2009 at 5:26 PM, Brian brian.min...@colorado.edu wrote: A general point - there is a *lot* of information contained in edits that AbuseFilter cannot practically characterize due to the complexity of language and the

Re: [Wikitech-l] Abuse Filter extension activated on English Wikipedia

2009-03-19 Thread Platonides
Andrew Garrett wrote: On Thu, Mar 19, 2009 at 11:54 AM, Platonides platoni...@gmail.com wrote: PS: Why isn't there a link to Special:AbuseFilter/history/$id on the filter view? There is. Oops. I was looking for it on the top bar, not at the bottom. I stand corrected.