On Mon, May 16, 2011 at 3:34 PM, Andrew Haley <a...@redhat.com> wrote:
> On 05/16/2011 02:32 PM, Michael Matz wrote:
>>
>> On Mon, 16 May 2011, Andrew Haley wrote:
>>
>>>> It routinely is.  bugzilla performance is terrible most of the time
>>>> for me (up to the point of five timeouts in sequence), svn speed is
>>>> mediocre at best, and people with access to gcc.gnu.org often observe
>>>> loads > 25, mostly due to I/O.
>>>
>>> And how have you concluded that is due to web crawlers?
>>
>> httpd being in the top 10 always, fiddling with bugzilla URLs?
>> (Note, I don't have access to gcc.gnu.org; I'm relaying info from multiple
>> instances of discussion on #gcc and from richi poking at it.  That said, it
>> still might not be web crawlers, that's right, but I'll happily accept
>> _any_ load improvement on gcc.gnu.org, however unfounded the suspicions
>> might seem.)
>
> Well, we have to be sensible.  If blocking crawlers only results in a
> small load reduction, that isn't, IMHO, a good deal for our users.
I, for example, also see

66.249.71.59 - - [16/May/2011:13:37:58 +0000] "GET /viewcvs?view=revision&revision=169814 HTTP/1.1" 200 1334 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" (35%) 2060117us

and viewvc is certainly even worse (from an I/O perspective).  I thought
we blocked all bot traffic from the viewvc stuff ...

Richard.

> Andrew.
>
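For what it's worth, the usual way to ask well-behaved crawlers (Googlebot identifies itself in the log line above) to stay away from expensive dynamic pages is a robots.txt at the site root.  The sketch below is an illustration only, not gcc.gnu.org's actual configuration; the /viewcvs and /viewvc path prefixes are taken from the request shown above and may not match the server's real URL layout, and the bugzilla path is an assumption.

```
# Hypothetical robots.txt sketch for keeping crawlers off expensive
# CGI-backed pages (viewvc/viewcvs, bugzilla query URLs).
# Paths are assumptions based on the log excerpt above.
User-agent: *
Disallow: /viewcvs
Disallow: /viewvc
Disallow: /bugzilla/buglist.cgi
```

Note that robots.txt is purely advisory: compliant bots like Googlebot honor it, but it does nothing against misbehaving crawlers, which would need server-side blocking (e.g. by User-Agent or IP) instead.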