At 02:09 PM 10/18/2004, Michel Rijnders wrote:
The following article ('No Bots Allowed!'):
http://www.eweek.com/article2/0,1759,1248105,00.asp
might be of interest as well; it also states the point I was trying to
get across:

  However, the standard relies entirely on the courtesy of the visiting
  robot. It's completely optional. Nothing prevents robots from simply
  ignoring the directives in a robots.txt file, and many robots do just
  that. In that sense, a robots.txt file is less like a locked door than a
  "no entry" sign hanging in an open doorway.

There are, of course, better ways:

- You can block certain user agents, or everything except certain user agents (bots can get around this by sending a different User-Agent string; see the sketch after this list).
- You can block certain IP blocks (this requires knowing which IP blocks the bot comes from).
- You can put up an image-based challenge/response (a CAPTCHA): you present a garbled image, the human user types the word it shows, and bots fail. Even these can be fooled, though, by bots that relay the challenge to unwitting humans on naughty web sites.
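
If it helps, here is a minimal sketch of the first two approaches in
Python, using only the standard library. In practice you would normally
do this in the web server configuration (Apache, nginx, etc.) instead,
and the bot names and IP prefix below are made up for illustration:

  # Sketch: refuse requests from blocked user agents or blocked IP ranges.
  from http.server import BaseHTTPRequestHandler, HTTPServer

  BLOCKED_AGENTS = ("BadBot", "EvilCrawler")   # made-up bot names
  BLOCKED_PREFIXES = ("203.0.113.",)           # made-up IP block (TEST-NET-3)

  class BotFilter(BaseHTTPRequestHandler):
      def do_GET(self):
          agent = self.headers.get("User-Agent", "")
          ip = self.client_address[0]
          # Turn away known bad agents and anything from a blocked range.
          if any(bad in agent for bad in BLOCKED_AGENTS) or \
             ip.startswith(BLOCKED_PREFIXES):
              self.send_error(403, "Forbidden")
              return
          self.send_response(200)
          self.send_header("Content-Type", "text/plain")
          self.end_headers()
          self.wfile.write(b"Hello, human.\n")

  if __name__ == "__main__":
      HTTPServer(("", 8000), BotFilter).serve_forever()

Run it and try something like "curl -A BadBot http://localhost:8000/" to
see the 403. Of course, as the first point above says, a bot can simply
lie about its User-Agent, which is why none of these measures is airtight.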

--
unsigned short int to_yer_mama;
matt kane's brain
http://www.hydrogenproject.com
[EMAIL PROTECTED] || [EMAIL PROTECTED]
