Re: [Labs-l] Google bot

2014-10-27 Thread Marc A. Pelletier
On 10/25/2014 07:37 PM, Nuria wrote: Much agree with these recommendations. Personally, I have no beef with it either, but filtering at the proxy level means this necessarily happens to every tool, with no opportunity to do it differently per tool, so we probably don't want to be overly sensitive

Re: [Labs-l] Google bot

2014-10-25 Thread Federico Leva (Nemo)
As Nuria, Billinghurst and others said, the tools are expected to be discoverable. It's easy enough not to throw the baby out with the bathwater*. * Dynamic pages generally have some URL parameters, usually indicated by ?. In the general robots.txt, disallow Googlebot and friends** to crawl
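
A minimal sketch of the distinction drawn above between static and dynamic tool pages, assuming "dynamic" simply means the URL carries a non-empty query string after the ?. The function names and the example.org URLs are hypothetical and are not taken from the thread or from any Labs configuration.

```python
from urllib.parse import urlsplit


def is_dynamic(url: str) -> bool:
    """Return True when the URL has a non-empty query string (text after '?')."""
    return urlsplit(url).query != ""


def crawl_allowed(url: str) -> bool:
    """Hypothetical policy: crawlers may fetch static pages but not dynamic ones."""
    return not is_dynamic(url)


if __name__ == "__main__":
    # Hypothetical URLs, used only for illustration.
    for candidate in (
        "https://tools.example.org/sometool/",
        "https://tools.example.org/sometool/?category=Foo&depth=3",
    ):
        print(candidate, "crawlable" if crawl_allowed(candidate) else "blocked")
```

Under this policy a tool's landing page stays discoverable, while each parameterised query, which is where the load comes from, is kept away from crawlers.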

Re: [Labs-l] Google bot

2014-10-20 Thread Magnus Manske
: [Labs-l] Google bot On 10/19/2014 03:50 PM, Magnus Manske wrote: I vaguely remember that indexing bots (like the Google one) were filtered out by Labs already? They were, for some time, but then I got some fairly vehement protestations that tools being unindexed by Google was a problem

Re: [Labs-l] Google bot

2014-10-20 Thread Wiki Billinghurst
To: labs-l@lists.wikimedia.org Subject: Re: [Labs-l] Google bot On 10/19/2014 03:50 PM, Magnus Manske wrote: I vaguely remember that indexing bots (like the Google one) were filtered out by Labs already? They were, for some time, but then I got some fairly vehement protestations

Re: [Labs-l] Google bot

2014-10-20 Thread Maciej Jaros
Magnus Manske (2014-10-20 13:33): [...] AFAIK, all tools use a default .lighttp configuration. Is that replaced or extended by a local config file? If it's replaced, the default config could exclude Googlebot, and even a blank local config file would re-enable Googlebot, for

[Labs-l] Google bot

2014-10-19 Thread Magnus Manske
Hi, I saw a high load (dozens of queries [1]) hitting one of my tools (catscan2). The queries looked like they came from a template on French Wikipedia (category name different, other parameters the same). Access log shows (among other things) Google bot. When I added that to my bot exclusion

Re: [Labs-l] Google bot

2014-10-19 Thread Maximilian Doerr
You are correct. I believe I’m the one who initially brought that to Coren’s attention. Cyberpower678 English Wikipedia Account Creation Team Mailing List Moderator On Oct 19, 2014, at 15:50, Magnus Manske magnusman...@googlemail.com wrote: Hi, I saw a high load (dozens of

Re: [Labs-l] Google bot

2014-10-19 Thread Marc A. Pelletier
On 10/19/2014 03:50 PM, Magnus Manske wrote: I vaguely remember that indexing bots (like the Google one) were filtered out by Labs already? They were, for some time, but then I got some fairly vehement protestations that tools being unindexed by Google was a problem. -- Marc

Re: [Labs-l] Google bot

2014-10-19 Thread Maximilian Doerr
, October 19, 2014 7:29 PM To: labs-l@lists.wikimedia.org Subject: Re: [Labs-l] Google bot On 10/19/2014 03:50 PM, Magnus Manske wrote: I vaguely remember that indexing bots (like the Google one) were filtered out by Labs already? They were, for some time, but then I got some fairly vehement

Re: [Labs-l] Google bot

2014-10-19 Thread Nuria Ruiz
, October 19, 2014 7:29 PM To: labs-l@lists.wikimedia.org Subject: Re: [Labs-l] Google bot On 10/19/2014 03:50 PM, Magnus Manske wrote: I vaguely remember that indexing bots (like the Google one) were filtered out by Labs already? They were, for some time, but then I got some fairly vehement

Re: [Labs-l] Google bot

2014-10-19 Thread Maximilian Doerr
labs-l-boun...@lists.wikimedia.org [mailto:labs-l-boun...@lists.wikimedia.org] On Behalf Of Marc A. Pelletier Sent: Sunday, October 19, 2014 7:29 PM To: labs-l@lists.wikimedia.org Subject: Re: [Labs-l

Re: [Labs-l] Google bot

2014-10-19 Thread Gerard Meijssen
Message- From: labs-l-boun...@lists.wikimedia.org [mailto: labs-l-boun...@lists.wikimedia.org] On Behalf Of Marc A. Pelletier Sent: Sunday, October 19, 2014 7:29 PM To: labs-l@lists.wikimedia.org Subject: Re: [Labs-l] Google bot On 10/19/2014 03:50 PM, Magnus Manske wrote: I vaguely