Why don't you just update your robots.txt to explicitly specify which
files you do or don't allow spiders to access? If it's a rule-abiding
spider, that will be the end of it.
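
For example, a robots.txt along these lines would tell well-behaved
spiders to keep out (the /files/ path is just a placeholder; substitute
whatever directories actually hold your tarballs):

  # Keep all spiders out of the tarball directory
  User-agent: *
  Disallow: /files/

  # Or shut out the offending agent entirely
  User-agent: LinkWalker
  Disallow: /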

On Sun, Dec 23, 2001 at 05:41:47PM +0100, Russell Coker wrote:
> I have a nasty web spider with an agent name of "LinkWalker" downloading 
> everything on my site (including .tgz files).  Does anyone know anything 
> about it?
> 
> I've added the following to my firewall setup to stop further attacks...
> 
> # crappy LinkWalker - evil spider that downloads every file including .tgz on
> # the site
> iptables -A INPUT -j logitrej -p tcp -s 209.167.50.25 -d 0.0.0.0/0 --dport www
> 
> -- 
> http://www.coker.com.au/bonnie++/     Bonnie++ hard drive benchmark
> http://www.coker.com.au/postal/       Postal SMTP/POP benchmark
> http://www.coker.com.au/projects.html Projects I am working on
> http://www.coker.com.au/~russell/     My home page
> 

-- 
  Nick Jennings

