[EMAIL PROTECTED] wrote:
On Thu, 23 Mar 2006 15:15:00 GMT, Dave Korn said:
difference? robots.txt is enforced (or ignored) by the client. If a server
returns a 403 or doesn't, depending on what UserAgent you specified, then
how could making the client ignore robots.txt somehow magically make the
server not return a 403 when you try to fetch a page?
It *can*, however, make the client *issue* a request it would otherwise
not have.
If the client had snarfed a robots.txt, and it said don't snarf anything
under /dontlook/here/, and a link pointed there, it wouldn't follow the
link.
If you tell it 'robots=off', then it *would* follow the link.
Yes, these are all extremely obvious truisms, but I think now you need to
go back and read the thread, because you haven't noticed that they're
utterly irrelevant to the matter at hand.
Remember - robots.txt *isn't* for the pages that would 403.
See, thing is, pages that would 403 is /exactly/ what we were talking
about. So saying switch off robots.txt is a completely irrelevant
response. And the fact that doing so _would_ have /an/ effect in /other/
circumstances doesn't make it any less irrelevant, at least not according to
any definition of the word relevant that I've ever seen!
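
To make the distinction concrete, here is a minimal Python sketch of the two
separate decisions. It is an illustration under stated assumptions, not
anybody's actual client or server: the robots.txt contents, the agent names,
and both function names are made up for the example, and the "server" is just
a stand-in that 403s on one particular User-Agent string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, using the example path from the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dontlook/here/
"""


def client_would_request(path, honour_robots=True, agent="examplebot"):
    """Client-side decision: is the request issued at all?"""
    if not honour_robots:
        # Analogous to 'robots=off': the client follows every link.
        return True
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


def server_status(path, agent):
    """Hypothetical server-side decision, made purely on the User-Agent.

    The path is ignored and robots.txt is never consulted here.
    """
    return 403 if agent == "examplebot" else 200
```

Switching honour_robots off changes what client_would_request() returns for
/dontlook/here/, but it is not an input to server_status() at all, so a page
that 403s for a given User-Agent keeps 403ing either way.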
cheers,
DaveK
--
Can't think of a witty .sigline today
___
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/