Not all of them are going to be well-behaved web spiders. Exploit scanners tend to hit specific URL suffixes to feed in their exploit code (looking for vulnerable phpBB, PHP-Nuke, etc.), and they don't respect robots.txt ;)
It clogs up traditional Apache error logs as well. I would suggest simply filtering the error emails.

jake123 wrote:
Hi, we have a similar problem... we are hosting approximately 300 websites that use our Tapestry application, in which all the content is read from the database and built up on the fly. We also get a lot of 'ghost' exceptions when search engine spiders and robots try to access our application. Our application sends us an error email every time an exception occurs, which means at least around 100 emails a day. I also noticed that we get a lot of PageNotFoundExceptions for page names that do not exist in our application's namespace... is this normal? How do you prevent the search engines from doing this? Thanks in advance for any help, Jacob
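The email-filtering suggestion above could be sketched roughly as follows. This is a hypothetical illustration, not Tapestry's actual reporting hook: the class, the method name `shouldEmail`, and the exception/bot name lists are all made up for the example, and you would wire the check into wherever your application currently sends its error mail.

```java
import java.util.Set;

// Hypothetical filter: decide whether a failure deserves an error email,
// or is just a spider/scanner probing URLs that don't exist.
public class ErrorMailFilter {

    // Exception class names treated as noise rather than real bugs
    // (names are illustrative; adjust to your application's exceptions).
    private static final Set<String> IGNORED_EXCEPTIONS = Set.of(
        "PageNotFoundException",
        "StaleSessionException"
    );

    // Substrings that identify common crawlers in the User-Agent header.
    private static final Set<String> BOT_AGENTS = Set.of(
        "googlebot", "bingbot", "slurp", "crawler", "spider"
    );

    /** Returns true if an error email should be sent for this failure. */
    public static boolean shouldEmail(String exceptionClassName, String userAgent) {
        if (IGNORED_EXCEPTIONS.contains(exceptionClassName)) {
            return false; // guessed/broken URL, not a real application bug
        }
        if (userAgent != null) {
            String ua = userAgent.toLowerCase();
            for (String bot : BOT_AGENTS) {
                if (ua.contains(bot)) {
                    return false; // known crawler: log it, don't mail it
                }
            }
        }
        return true; // a real user hit a real error
    }

    public static void main(String[] args) {
        System.out.println(shouldEmail("NullPointerException", "Mozilla/5.0"));   // true
        System.out.println(shouldEmail("PageNotFoundException", "Mozilla/5.0"));  // false
        System.out.println(shouldEmail("NullPointerException", "Googlebot/2.1")); // false
    }
}
```

The filtered exceptions should still go to the regular log, so you can spot a genuinely broken link if the same missing page shows up repeatedly.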