Augustin, Stefan wrote:
> Hello,
> 
> I want to crawl a web site which uses
>  <meta name="robots" content="nofollow" />
> in the HTML HEAD,
> which should be XHTML instead of plain HTML.
> But wget seems to ignore this control information.
> 
> Unfortunately, I can't change the code in the HTML pages of this web server.

If I understand you correctly, I think you meant that "wget seems to
obey this control information"; otherwise, what would be preventing you
from crawling the web site?

Have a look at
http://wget.addictivecode.org/FrequentlyAskedQuestions#robots for the
solution.
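
In case the FAQ is out of reach, here is a rough sketch of the usual
approach, based on wget's documented "robots" setting rather than on the
FAQ text itself. The URL is only a placeholder (an assumption on my
part), and you should only do this on a site where you are sure ignoring
the robots hints is acceptable.

```shell
# Placeholder target (assumption); substitute the site you want to crawl.
url="http://www.example.com/"

# -r             follow links recursively
# -e robots=off  execute the wgetrc command "robots = off", which makes wget
#                disregard robots.txt and <meta name="robots"> nofollow hints
set -- wget -r -e robots=off "$url"

# Print the command for review; run "$@" instead of echoing it to crawl.
echo "$@"
```

Putting "robots = off" in your ~/.wgetrc should have the same effect
without needing the -e switch each time.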

-- 
Micah J. Cowan
http://micah.cowan.name/

