I think you're misunderstanding what was supposed to happen.

The robots.txt file is only honored for links that wget follows
automatically. This means (a) wget has to be in recursive-descent mode
(-r or -m), and (b) it only applies to links that weren't explicitly
requested by the user. In other words, it governs only the links that
wget discovers and follows on its own.
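For example (a minimal sketch; the port and paths just mirror your test
setup, so adjust as needed):

    wget http://127.0.0.1:56/index.html
        (an explicitly requested page: robots.txt is not consulted)

    wget -r http://127.0.0.1:56/
        (recursive retrieval: wget fetches /robots.txt first and skips
        anything it disallows)

And if you ever want recursive wget to ignore robots.txt, the
-e robots=off option disables the check.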

Hope that helps.

-mjc

On 03/16/2012 01:04 PM, phil curb wrote:
> I just tried creating a web server locally, putting robots.txt in
> there and using wget, and it didn't work.
> 
> http://pastebin.com/raw.php?i=kt1mV2af
> 
> C:\r>wget 127.0.0.1:56
> ....
> 2012-03-16 19:45:32 (20.0 KB/s) - `index.html' saved [3/3]
> 
> C:\r>wget 127.0.0.1:56/robots.txt
> ....
> 2012-03-16 19:45:43 (175 KB/s) - `robots.txt' saved [26/26]
> 
> C:\r>type robots.txt
> User-agent: *
> Disallow: /
> 
> C:\r>

