* [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> wget -e robots=off -r -N -k -E -p -H http://www.gnu.org/software/wget/
> 
> soon leads to non-wget-related links being downloaded, e.g.
> http://www.gnu.org/graphics/agnuhead.html

In that particular case, I think --no-parent would solve the
problem.

Maybe I misunderstood, though.  It seems awfully risky to use -r
and -H without having something to strictly limit the links
followed.  So, I suppose the content filter would be an effective
way to make cross-host downloading safer.

I think I'd prefer a different option for that sort of thing --
filtering with an external program.  If the program returns a
specific exit code, wget would follow the link or recurse into
the links contained in the file.  That would allow far more
complex filtering, including things like interactive pruning.
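
To make the idea concrete, here is a rough sketch of what such a
filter program might look like.  Everything about the interface is
an assumption on my part: wget has no option that runs an external
filter today, and the contract (URL passed as the first argument,
exit status 0 meaning "follow", 1 meaning "skip") and the name
should_follow are made up for illustration.

```shell
#!/bin/sh
# Hypothetical link filter; wget does not currently call anything like this.
# Assumed contract: the candidate URL arrives as $1,
# exit status 0 means "follow the link", 1 means "skip it".

should_follow() {
    case "$1" in
        # Stay inside the wget section of the site.
        http://www.gnu.org/software/wget/*) return 0 ;;
        # Reject everything else, including other paths on the same host.
        *) return 1 ;;
    esac
}

# When run with a URL argument, report the decision via the exit status.
if [ "$#" -gt 0 ]; then
    should_follow "$1"
    exit $?
fi
```

With a filter like this, http://www.gnu.org/graphics/agnuhead.html
would be rejected while anything under /software/wget/ would be
followed.  An interactive version could prompt the user in place of
the case statement.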


-- Scott
