I have the following wget command line:
wget -r http://wwwdev.nber.org/
The contents of http://wwwdev.nber.org/robots.txt are:
User-agent: *
Disallow: /
User-Agent: W3C-checklink
Disallow:
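(In case it matters, the contents above can be confirmed straight from the server with something like

  wget -O - -S http://wwwdev.nber.org/robots.txt

where -O - writes the retrieved file to standard output and -S shows the server's response headers, so a redirect or a differently served file would be visible.)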
However, wget fetches thousands of pages from wwwdev.nber.org. I would have
expected it to fetch nothing, since the robots.txt above disallows everything
for every user agent except W3C-checklink. (This is just a demonstration; in
real life I'd have a more detailed robots.txt to control the process.)
Clearly I'm misunderstanding something about wget or robots.txt. Can anyone
help me out?
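If it would help to diagnose this, I can re-run with debug output and filter for mentions of the robots handling, along the lines of (assuming this build of wget was compiled with debug support):

  wget -r -d http://wwwdev.nber.org/ 2>&1 | grep -i robots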
This is GNU Wget 1.12 built on linux-gnu.
Thank you
Daniel Feenberg