* [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> wget -e robots=off -r -N -k -E -p -H http://www.gnu.org/software/wget/
>
> soon leads to non wget related links being downloaded, eg.
> http://www.gnu.org/graphics/agnuhead.html
In that particular case, I think --no-parent would solve the problem. Maybe I misunderstood, though.

It seems awfully risky to use -r and -H without something in place to strictly limit the links that get followed, so I suppose a content filter would be an effective way to make cross-host downloading safer.

For that sort of thing, though, I think I'd prefer a different option: filtering through an external program. If the program returns a specific exit code, wget would follow the link, or recurse into the links contained in the file. That way you could do far more complex filtering, including things like interactive pruning.

--
Scott
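To make the idea concrete: no such filter-program option exists in wget today, but the contract I have in mind is just "wget hands the candidate URL to an external program as its first argument; exit status 0 means follow, non-zero means prune." A filter for the example above might look like this (the URL pattern and the follow_url name are only for illustration):

```shell
#!/bin/sh
# Hypothetical external filter for wget's recursion -- a sketch, not a
# real wget feature. wget would run this with each candidate URL and
# follow the link only when the exit status is 0.
follow_url() {
    case "$1" in
        # Stay inside the wget pages on www.gnu.org ...
        http://www.gnu.org/software/wget/*) return 0 ;;
        # ... and prune everything else, e.g. /graphics/agnuhead.html.
        *) return 1 ;;
    esac
}
```

Because the decision is delegated to an arbitrary program, the same hook could prompt the user per link, which is how interactive pruning would fall out for free.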