> That's not what I'm finding.  If you have a robots.txt file
> that says:
>
> disallow /search.cfm
>
> It will not index the search.cfm file from the root of the
> server. But I cannot find anywhere where you can put in
> something like this:
>
> disallow http://www.someothersite.com
>
>
> You see what I mean? The robots.txt file allows you to exclude
> pages on THIS site that you don't want indexed.

No, you can't use robots.txt to disallow spidering of other sites; it only
applies to the site that serves it. However, if your site links to
www.someothersite.com, a well-written spider should request
http://www.someothersite.com/robots.txt before following that link.
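
As a rough illustration of that per-host behavior, here is a minimal Python
sketch using the standard library's urllib.robotparser. The host
www.example.com, the robots.txt contents, the "ExampleSpider" agent name, and
the helper names are all made up for the example. A real spider would download
each host's /robots.txt instead of reading it from a dictionary.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt bodies, keyed by the host that serves them.
# A real spider would download these from each host instead.
ROBOTS = {
    "www.example.com": "User-agent: *\nDisallow: /search.cfm\n",
    "www.someothersite.com": "User-agent: *\nDisallow: /private/\n",
}


def robots_url(url):
    """Return the robots.txt location that governs the given URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def allowed(url, agent="ExampleSpider"):
    """Check a URL against the rules published by its own host only."""
    host = urlsplit(url).netloc
    parser = RobotFileParser()
    parser.parse(ROBOTS.get(host, "").splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for link in (
        "http://www.example.com/search.cfm",
        "http://www.someothersite.com/search.cfm",
        "http://www.someothersite.com/private/page.html",
    ):
        print(link, "->", robots_url(link), allowed(link))
```

The rule for /search.cfm on the first host has no effect on the second host,
whose own file is the only one consulted for its URLs.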

Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/
phone: 202-797-5496
fax: 202-797-5444