I really like wget, but I'd like to suggest an improvement. Occasionally I would like more nuanced control over the URLs that wget downloads recursively.
Would it be relatively simple to allow wget to take a filter argument naming some other executable? For example:

    wget -r --filter my_bash_script http://www.somesite.com/index.html

wget would fetch the first page and extract the list of links it contains. That list would then be passed to the filter via stdin, and the filter would return a filtered (or altered) list via stdout. The filtered list would then be added to the set of URLs for wget to download.

Tom
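
P.S. For illustration, here is a minimal sketch of the kind of filter I have in mind, assuming wget would hand the extracted links to the filter one URL per line on stdin and queue whatever the filter echoes back on stdout (that framing, and the /docs/ and *.zip patterns, are just examples I made up):

    #!/bin/sh
    # Hypothetical filter: read candidate URLs one per line from stdin
    # and echo back only the ones wget should actually fetch.
    while IFS= read -r url; do
        case "$url" in
            *.zip|*.iso)  ;;                       # drop archives entirely
            */docs/*)     printf '%s\n' "$url" ;;  # keep documentation pages
        esac
    done

Anything the filter does not echo back would simply be skipped, and the filter could also rewrite URLs before echoing them if it wanted to alter them.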