On Wed, Jul 2, 2008 at 2:10 PM, Nathan Yergler
<[EMAIL PROTECTED]> wrote:
> We're having a few issues with aggressive crawlers slowing the machine
> down.

gee, I wonder what that's like ;)

I finally fixed all that with a little snippet of code at the bottom
of every page on ccM: a hidden link marked as 'nofollow' and
explicitly disallowed in robots.txt (I use javascript to replace the
href with nothing, just in case). The link is unique, with a rand()
parameter, so the bots don't 'learn' to avoid it.
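
For illustration, the trap link and its robots.txt entry could be
generated roughly like this. It is a sketch in Python, not the PHP
that ccM actually runs; the path /trap.php, the parameter name r, and
both function names are made up for the example, and the javascript
href-clearing step is left out.

```python
import secrets

# Hypothetical trap location; the real path on ccM is not given in this post.
TRAP_PATH = "/trap.php"


def trap_link_html(trap_path: str = TRAP_PATH) -> str:
    """Return a hidden, nofollow link with a fresh random parameter."""
    # A new token per page view keeps the URL unique, so a bot cannot
    # simply memorize one address to skip.
    token = secrets.token_hex(8)
    return (
        f'<a href="{trap_path}?r={token}" rel="nofollow" '
        'style="display:none"></a>'
    )


def robots_txt_rule(trap_path: str = TRAP_PATH) -> str:
    """Return the robots.txt lines that tell polite crawlers to stay out."""
    # Disallow matches by prefix, so the random query string is covered too.
    return f"User-agent: *\nDisallow: {trap_path}\n"
```

Well-behaved crawlers read the Disallow rule and never request the
link, so only bots that ignore robots.txt end up on the trap.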

If you follow the link, it leads to a php script that enters your IP
into the deny list in .htaccess. This stops bad crawls right as they
start but allows well-behaved crawls from google, yahoo, etc. to
continue. We used to go down once a week; now, pretty much never. We
collect about 50-100 IPs per week. I clean them out regularly because
the evil bots burn the IPs anyway.
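
The deny-list step might look something like the following. Again,
this is a Python sketch under my own assumptions, not the php script
itself: the function name deny_ip is invented, the "Deny from" line
is the Apache 2.2-style syntax, and it assumes the .htaccess already
has a suitable Order/Allow setup and is writable by the web server.

```python
import ipaddress
from pathlib import Path


def deny_ip(htaccess: Path, raw_ip: str) -> bool:
    """Append a 'Deny from <ip>' line; return False if already listed."""
    # Validate first so a crafted value can never inject extra
    # directives into .htaccess; ip_address() raises ValueError on junk.
    ip = str(ipaddress.ip_address(raw_ip))
    line = f"Deny from {ip}"

    text = htaccess.read_text() if htaccess.exists() else ""
    if line in text.splitlines():
        return False

    # Make sure the new rule starts on its own line.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with htaccess.open("a") as fh:
        fh.write(prefix + line + "\n")
    return True
```

In a real handler the address would come from the server's view of
the connection (REMOTE_ADDR), not from anything the client sends, and
concurrent hits would want a file lock around the append.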

VS
_______________________________________________
cc-devel mailing list
[email protected]
http://lists.ibiblio.org/mailman/listinfo/cc-devel
