Dave,
That's not what I'm finding. If you have a robots.txt file that says:
Disallow: /search.cfm
a compliant spider will not index the search.cfm file in the root of the server. But I cannot find anywhere that you can put in
something like this:
Disallow: http://www.someothersite.com
You see what I mean? The robots.txt file only lets you exclude pages on THIS site that you don't want indexed.
-Mark
-----Original Message-----
From: Dave Watts
Sent: Sunday, April 04, 2004 5:23 PM
To: CF-Talk
Subject: RE: user agent checking and spidering...
> Sequelink (the access service for Jrun I think) locks up
> quickly trying to service hundreds of requests at once to
> the same access file.
As a short-term fix, have you considered a more aggressive caching strategy?
That might be pretty easy to implement.
> Each site has a pretty well thought out robots.txt file, but
> it doesn't help because the links in question are to external
> sites - not pages on THIS site (even though these external
> sites are virtuals on the same server).
I don't think I understand this. It shouldn't matter whether the links are
internal or external - before a well-written spider requests the link, it
should check that server's robots.txt file first.
Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/
phone: 202-797-5496
fax: 202-797-5444