--On March 26, 2006 7:25:42 AM -0800 [EMAIL PROTECTED] wrote:
> 
> Googlebot and msnbot are supposed to obey robots.txt, but they are ignoring
> my robots.txt ( http://simpy.com/robots.txt ), that contains:
> 
> User-agent: *
> Disallow: /simpy/
>
> User-agent: Googlebot
> Disallow: /rss/

You need to fix your robots.txt. Googlebot is doing the right thing.

Your robots.txt file tells Googlebot to stay away from /rss/, but it
does not say anything about /simpy/ (for Googlebot). Here is the spec
text about the meaning of "User-agent: *".

   If the value is '*', the record describes the default access policy
   for any robot that has not matched any of the other records.

In other words, the Disallow lines following a "User-agent:" line are
the entire policy for that robot; robots do not merge every matching
record. Under "User-agent: Googlebot" you must list all of the
disallows for that bot.

If you want all robots to stay out of /simpy/, you must add that as a
Disallow line to every block.
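For example, a fixed file might look like this (a sketch; the Googlebot
record repeats the /simpy/ rule because it replaces the default, it
does not add to it):

   User-agent: Googlebot
   Disallow: /simpy/
   Disallow: /rss/

   User-agent: *
   Disallow: /simpy/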

Think of it like a switch statement. "User-agent: *" is the default label.
It wouldn't hurt to put that last in the file, just in case some lazy bot
takes the first match.
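Here is that selection logic as a rough Python sketch (my illustration,
not any particular crawler's code), run against your current file:

   # Each record is (user-agent tokens, disallow paths), in file order.
   records = [
       (['*'], ['/simpy/']),
       (['Googlebot'], ['/rss/']),
   ]

   def disallows_for(records, ua):
       # A record naming this robot wins outright...
       for agents, disallows in records:
           if any(a != '*' and a.lower() in ua.lower() for a in agents):
               return disallows
       # ...otherwise fall through to the '*' default, like a switch.
       for agents, disallows in records:
           if '*' in agents:
               return disallows
       return []  # no matching record at all: nothing is disallowed

   disallows_for(records, 'Googlebot/2.1')  # ['/rss/'] -- /simpy/ is lost
   disallows_for(records, 'msnbot/1.0')     # ['/simpy/']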

wunder
--
Walter Underwood
Principal Software Architect, Autonomy
_______________________________________________
Robots mailing list
Robots@mccmedia.com
http://www.mccmedia.com/mailman/listinfo/robots
