Re: [Wikitech-l] User-Agent:

2010-02-15 Thread William Pietri
On 02/15/2010 07:55 PM, Domas Mituzas wrote: >> Yes, a simple restriction like this tends to create smarter villains >> rather than less villainy. Filtering on an obvious, easy-to-change >> characteristic also destroys a useful source of information on who the >> bad people are, making future abuse

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Domas Mituzas
William, > Yes, a simple restriction like this tends to create smarter villains > rather than less villainy. Filtering on an obvious, easy-to-change > characteristic also destroys a useful source of information on who the > bad people are, making future abuse prevention efforts harder. Thanks

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread William Pietri
On 02/15/2010 06:50 PM, Steve Summit wrote: > You're trying to detect / guard against malicious behavior using > *User-Agent*?? Good grief. Have fun with the whack-a-mole game, then. > Yes, a simple restriction like this tends to create smarter villains rather than less villainy. Filtering

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread William Pietri
On 02/15/2010 07:25 PM, Domas Mituzas wrote: >> Was there some urgent production impact that required doing this with no >> notice? >> > Actually we had User-Agent header requirement for ages, it just failed to do > what it had to do for a while. Consider this to be a bugfix. > Ok. I'm

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread DaB.
Hello, On Tuesday, 16 February 2010 04:15:57, William Pietri wrote: > some third-world traffic Why should browsers in the third world not send user-agents like our browsers do (I doubt that they use different ones than we do)? The change by Domas blocks just two kinds of requests: 1.) By broken bots and crawlers an

[Wikitech-l] Apple RSS

2010-02-15 Thread Domas Mituzas
Hi! I blocked Apple's MacOSX RSS syndicator - it probably wastes petabytes of disk space on unsuspecting user machines, as any Wikipedia RSS feed opened in Safari is automatically added to the syndication list and resynced constantly. That isn't so painful with other feeds, but Wik

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Domas Mituzas
Hi! > Was there some urgent production impact that required doing this with no > notice? Actually we had User-Agent header requirement for ages, it just failed to do what it had to do for a while. Consider this to be a bugfix. > Was any impact analysis done on this? Yup! > Given Wikipedia'

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread William Pietri
On 02/15/2010 05:54 PM, Domas Mituzas wrote: > Hi! > > from now on specific per-bot/per-software/per-client User-Agent header is > mandatory for contacting Wikimedia sites. > Two questions: Was there some urgent production impact that required doing this with no notice? Was any impact anal

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Domas Mituzas
Hi! > You're trying to detect / guard against malicious behavior using > *User-Agent*?? Good grief. Have fun with the whack-a-mole game, then. Thanks! I'm relatively new to this whole operations game, so I'm obsessed with graphs and whack-a-mole :) Cheers, Domas

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Steve Summit
>> Relying on User-Agent represents the very antithesis of >> [[Postel's Law]], a rock-solid principle on which the Internet >> (used to be) based. > > RFC2616: > 14.43 User-Agent > The User-Agent request-header field... is for... automated > recognition of user agents for the sake of tailoring > re

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Steve Summit
Domas wrote: > Hi Steve, > > But why? > > Because we need to identify malicious behavior. You're trying to detect / guard against malicious behavior using *User-Agent*?? Good grief. Have fun with the whack-a-mole game, then.

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Domas Mituzas
Steve, > If this has been discussed to death elsewhere and represents > some bizarrely-informed consensus, I'll try to spare this list > my belated rantings, but this is a terrible, terrible idea. > Relying on User-Agent represents the very antithesis of > [[Postel's Law]], a rock-solid principle

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Domas Mituzas
Hi Steve, > But why? Because we need to identify malicious behavior. > (This just broke one of my bots.) > Are the details of this policy discussed anywhere? I don't know. Probably. We always told people to specify User-Agent, just the check was broken. > Is it permissible to send > >

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread DaB.
Hello, On Tuesday, 16 February 2010 03:06:49, Steve Summit wrote: > Is it permissible to send > > User-Agent: x Why is it so hard to set User-Agent: mytoolname/version mym...@mail.invalid ? (You can forgo the mail if you are paranoid.) It's clean, fast and good. Sincerely, DaB. -- wp-blog.
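
For bot authors following along, the "toolname/version contact" form suggested above might be set roughly as in the sketch below. It uses only Python's standard urllib.request; the tool name, version, contact address and the helper names build_user_agent and build_request are placeholders invented for illustration, not anything prescribed in this thread. The request is only constructed here, never sent.

```python
import urllib.request

# Hypothetical values for illustration only; substitute your own tool's details.
TOOL_NAME = "mytoolname"
TOOL_VERSION = "0.1"
CONTACT = "operator@example.invalid"


def build_user_agent(name, version, contact=None):
    """Return a 'name/version contact' string; the contact part is optional."""
    user_agent = f"{name}/{version}"
    if contact:
        user_agent += f" {contact}"
    return user_agent


def build_request(url, user_agent):
    """Create a request object carrying an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request(
    "https://en.wikipedia.org/wiki/Main_Page",
    build_user_agent(TOOL_NAME, TOOL_VERSION, CONTACT),
)

# urllib stores header names in capitalized form, hence "User-agent" here.
print(request.get_header("User-agent"))
```

Passing the resulting object to urllib.request.urlopen would send the header with the request. Whether a given string is accepted is up to the server-side check, which this sketch does not model.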

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Steve Summit
Domas wrote: > from now on specific per-bot/per-software/per-client User-Agent > header is mandatory for contacting Wikimedia sites. Oh, my. And not just to be a bot, or to edit the site manually, but even to view it. You can't even fetch a single, simple page now without supplying that header.

[Wikitech-l] More dump problems?

2010-02-15 Thread Mike.lifeguard
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi guys, Just wanted to be sure this got checked out. From #mediawiki: hi. out of curiosity, are there any known issues with the recent batch of dumps? I noticed that some supposedly completed dumps seem to have ended with "Please provide a User-Age

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Steve Summit
Domas wrote: > from now on specific per-bot/per-software/per-client User-Agent > header is mandatory for contacting Wikimedia sites. But why? (This just broke one of my bots.) Are the details of this policy discussed anywhere? Is it permissible to send User-Agent: x thus providing pre

Re: [Wikitech-l] User-Agent:

2010-02-15 Thread Chad
On Mon, Feb 15, 2010 at 8:54 PM, Domas Mituzas wrote: > Hi! > > from now on specific per-bot/per-software/per-client User-Agent header is > mandatory for contacting Wikimedia sites. > > Domas > ___ > Wikitech-l mailing list > Wikitech-l@lists.wikimedia.

[Wikitech-l] User-Agent:

2010-02-15 Thread Domas Mituzas
Hi! from now on specific per-bot/per-software/per-client User-Agent header is mandatory for contacting Wikimedia sites. Domas ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] enwiki complete page edit history

2010-02-15 Thread Jamie Morken
Hi, I was looking at the enwiki dump progress and noticed the file size for the enwiki pages-meta-history.xml.bz2 has decreased from 255GB on 20100125 down to 105GB on 20100203.  Is it possible that old page revision edit data is being lost due to the smaller archive file size? 2009-12-03 12:53:

Re: [Wikitech-l] importing enwiki into local database

2010-02-15 Thread Aryeh Gregor
On Mon, Feb 15, 2010 at 2:19 PM, Carl (CBM) wrote: > I hope that, before the doctype is changed to html5, a substantial > grace period is given for people to change to an HTML5 parser in their > javascript code. We will continue with well-formed XML output for the foreseeable future for exactly t

Re: [Wikitech-l] importing enwiki into local database

2010-02-15 Thread Carl (CBM)
On Sun, Feb 14, 2010 at 7:34 PM, Marco Schuster wrote: > What about turning wgUseTidy off for some time? The doctype that we serve is XHTML, and various AJAX tools rely on being able to parse the DOM tree as an XML document. But there are certain valid wikitext constructions that are ''guarantee

Re: [Wikitech-l] Extensions in SVN looking for a maintainer

2010-02-15 Thread Siebrand Mazeland
Thanks, Avar. Cite: no action taken in Bugzilla. Newuserlog: has been removed. Can unfortunately not close components. CrossNamespaceLinks: added Avar as maintainer Desysop: has been removed. See above. Espionage: had one closed issue; reassigned to CheckUser and component deleted. Eval: added Ava

Re: [Wikitech-l] Extensions in SVN looking for a maintainer

2010-02-15 Thread Roan Kattouw
2010/2/15 Ævar Arnfjörð Bjarmason : > Domas has also complained that it eats up resources. Is this something > that can conceivably be fixed in it or is it just inherent in anything > that calls the parser from an extension tag and will thus need parser > fixups to get anywhere? > IIRC Domas was co