On 10/25/2014 07:37 PM, Nuria wrote:
Much agree with these recommendations.
Personally, I have no beef with it either - but filtering at the proxy
level means this necessarily happens to every tool with no opportunity
to do it differently per-tool so we probably don't want to be overly
sensitive
As Nuria, Billinghurst and others said, the tools are expected to be
discoverable. It's easy enough not to throw the baby out with the
bathwater*.
* Dynamic pages generally have some URL parameters, usually indicated by
?. In the general robots.txt, disallow Googlebot and friends** from crawling
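
To make the suggestion above concrete, here is a minimal sketch of the "only keep crawlers off the dynamic pages" idea, written as a plain Python check rather than as actual robots.txt or proxy configuration. The token list and the function names (CRAWLER_TOKENS, is_crawler, is_dynamic, should_block) are hypothetical and not taken from the thread; it assumes that "dynamic" simply means "the URL carries a query string after the ?".

```python
from urllib.parse import urlsplit

# Hypothetical list of crawler User-Agent substrings (an assumption for
# illustration; the thread does not say which bots count as "friends").
CRAWLER_TOKENS = ("googlebot", "bingbot")


def is_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent contains one of the crawler tokens."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def is_dynamic(url: str) -> bool:
    """Treat any URL with a non-empty query string as a dynamic page."""
    return bool(urlsplit(url).query)


def should_block(user_agent: str, url: str) -> bool:
    """Block crawlers on dynamic pages only; static pages stay indexable."""
    return is_crawler(user_agent) and is_dynamic(url)
```

With this rule a crawler can still fetch a tool's landing page, so the tool stays discoverable, while a request such as /catscan2/catscan2.php?categories=Foo from the same crawler would be refused. Ordinary browsers are not affected either way.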
On 10/19/2014 03:50 PM, Magnus Manske wrote:
I vaguely remember that indexing bots (like the Google one) were
filtered out by Labs already?
They were, for some time, but then I got some fairly vehement
protestations that tools being unindexed by Google was a problem
To: labs-l@lists.wikimedia.org
Subject: Re: [Labs-l] Google bot
Magnus Manske (2014-10-20 13:33):
[...]
AFAIK, all tools use a default .lighttp configuration. Is
that replaced or extended by a local config file?
If it's replaced, the default config could exclude Googlebot, and even
a blank local config file would re-enable Googlebot again, for
Hi,
I saw a high load (dozens of queries [1]) hitting one of my tools
(catscan2). The queries looked like they came from a template on French
Wikipedia (category name different, other parameters the same). Access log
shows (among other things) Google bot. When I added that to my bot
exclusion
You are correct. I believe I’m the one who initially brought that to Coren’s
attention.
Cyberpower678
English Wikipedia Account Creation Team
Mailing List Moderator
On Oct 19, 2014, at 15:50, Magnus Manske magnusman...@googlemail.com wrote:
Hi,
I saw a high load (dozens of
On 10/19/2014 03:50 PM, Magnus Manske wrote:
I vaguely remember that indexing bots (like the Google one) were
filtered out by Labs already?
They were, for some time, but then I got some fairly vehement
protestations that tools being unindexed by Google was a problem.
-- Marc
Message-
From: labs-l-boun...@lists.wikimedia.org [mailto:
labs-l-boun...@lists.wikimedia.org] On Behalf Of Marc A. Pelletier
Sent: Sunday, October 19, 2014 7:29 PM
To: labs-l@lists.wikimedia.org
Subject: Re: [Labs-l] Google bot