Dear GNU Developers,

We recently ran into a situation where we had to "spider" a site of our own, hosted by an outsourced service, because that company was going out of business. Because wget respects the robots.txt file, however, we could not make an archive until we had the outsourced company delete the robots.txt file from their server during the last two days of service. The story might not have had such a happy ending.
Would you consider adding an "--ignore-robotfile" option, or would that be too open to abuse? I know I can always edit the source and build my own, but I wondered whether this is something wget might want to offer in a release version, or whether the potential for abuse is too great.

Thank you!

Jon Backstrom
[EMAIL PROTECTED]
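P.S. For illustration only, mirroring the site with the requested behavior might look something like the line below. The "--ignore-robotfile" name is just my suggestion, not an existing flag; "--mirror" and "--no-parent" are existing wget options, and the URL is a placeholder:

    wget --mirror --no-parent --ignore-robotfile http://example.com/our-site/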