On 06/29/2018 03:20 PM, Zoe Blade wrote:
> For anyone else who needs to do this, I adapted Sergey Svishchev's 1.8-era 
> patch for 19.1 (one of the few versions I managed to get to compile in OS X; 
> I'm on a Mac, and not the best programmer):
> 
> recur.c:578
> -  if (blacklist_contains (blacklist, url))
> +  if (blacklist_contains (blacklist, url) || !acceptable (url))
> 
> It's not ideal, but it seems to solve the problem as a temporary fix.  
> Hopefully it might help someone else who needs this functionality.

Hi Zoë,

We recently had a discussion (20.6.2018, "Why does -A not work") where I
confirmed that --reject-regex acts as a filter on the URLs Wget detects.
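
To illustrate what "filter on detected URLs" means here, below is a minimal
sketch in C. It is not Wget's actual code: the helper names
url_passes_filter() and filter_detected_urls() are hypothetical, and it
assumes POSIX extended regular expressions and that a matching URL is simply
dropped before it is queued for download.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Wget: returns true when `url` should be
   kept, i.e. when it does NOT match the reject pattern. */
static bool url_passes_filter(const char *url, const char *reject_pattern)
{
    regex_t re;
    bool keep;

    assert(url != NULL && reject_pattern != NULL);

    /* POSIX extended regex; REG_NOSUB because only match/no-match matters. */
    if (regcomp(&re, reject_pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return true;   /* assumption: an invalid pattern rejects nothing */

    keep = (regexec(&re, url, 0, NULL, 0) == REG_NOMATCH);
    regfree(&re);
    return keep;
}

/* Copies the URLs that pass the filter into `out` (which must have room for
   `n` entries) and returns how many were kept. */
static size_t filter_detected_urls(const char *const *urls, size_t n,
                                   const char *reject_pattern,
                                   const char **out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
        if (url_passes_filter(urls[i], reject_pattern))
            out[kept++] = urls[i];
    return kept;
}
```

In this sketch a rejected URL is never fetched at all, which is the
behaviour a filter implies.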

BTW, the OP wanted --reject-regex to download and parse HTML (and delete
it afterwards if it matched the reject regex), which is the opposite of
your request.

In Wget2 there is an extra option for this, --filter-urls. Maybe
--filter-mime-type is also worth a look.

It would be best if you could provide a small example or reproducer. A
hand-crafted HTML file would do.

Regards, Tim
