Hello,

On Mon, Oct 14, 2013 at 09:25:24AM -0400, Sylvia wrote:
> Doesn't the robots.txt "Crawl-Delay" directive satisfy your needs?

I already have it in there, but I don't know how long it takes for such a
directive, or any change to robots.txt for that matter, to take effect.
Judging from the logs, the delay between changing robots.txt and a change
in robot behaviour seems to be several days, as I cannot see any effect
so far.
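
For reference, what I have in robots.txt is essentially along these lines
(the delay value here is only an example, in seconds, not my actual
setting):

    User-agent: *
    Crawl-delay: 10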

> Normal spiders should obey robots.txt; if they don't, they can be banned.

Banning Google is not a good idea, no matter how abusive they might be,
and incidentally they operate one of those robots that keep hammering
the site. I'd much prefer a technical solution that enforces such limits
over relying on convention.

I'd also like to limit the request frequency over an entire pool, so
that I can say "clients from this pool can make requests only at this
frequency, combined, not per client IP". It doesn't buy me anything if
I can limit an individual search robot to a decent frequency but then
get hammered by 1000 search robots in parallel, each one observing the
request limit on its own. Right?
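
As a rough sketch of what I have in mind (untested, and the CIDR ranges
below are placeholders, not the real crawler networks): a geo block that
maps the crawler addresses to one pool name, and a limit_req_zone keyed
on that pool name, so that every address in the pool shares a single
counter:

    # Map crawler networks to a single pool name (placeholder ranges).
    geo $crawler_pool {
        default          "";
        192.0.2.0/24     crawlers;
        198.51.100.0/24  crawlers;
    }

    # One shared counter per pool value; requests with an empty key
    # (ordinary visitors) are not rate-limited at all.
    limit_req_zone $crawler_pool zone=crawlers:1m rate=30r/m;

    server {
        location / {
            # Allow a short burst, then throttle the whole pool to the
            # configured rate.
            limit_req zone=crawlers burst=10;
        }
    }

If that behaves the way I imagine, it would give exactly the "per pool,
not per IP" limit I'm after.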


Kind regards,
--Toni++
