On Friday, 9 January 2015 at 17:18:43 UTC, Adam D. Ruppe wrote:
Huh, looking at the answers on the website, they're mostly using regular expressions. Weaksauce. And wrong - they don't find ALL the links, they find the absolute HTTP urls!

Yeah... Surprising, since languages like Python include an HTML parser in the standard library.

Besides, if you want all resource links you have to do a lot better, since the following attributes can contain resource addresses: href, src, data, cite, xlink:href…
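For illustration, a rough sketch of that using Python's standard html.parser module. The names LINK_ATTRS, LinkCollector and collect_links are made up for this example, and the attribute set is only the handful listed above, so it is not a complete list.

```python
from html.parser import HTMLParser

# Attributes named in the post; the list there is open-ended, so this set
# is an assumption and certainly incomplete.
LINK_ATTRS = {"href", "src", "data", "cite", "xlink:href"}


class LinkCollector(HTMLParser):
    """Collects (tag, attribute, value) for every link-like attribute."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names before passing them in.
        # Self-closing tags also end up here via handle_startendtag.
        for name, value in attrs:
            if name in LINK_ATTRS and value:
                self.links.append((tag, name, value))


def collect_links(markup):
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.links
```

Something like collect_links('<img src="logo.png">') should then report the src value along with the tag and attribute it came from, which a regex for absolute HTTP URLs would miss.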

You also need to do entity expansion, since the links can contain HTML entities like "&amp;".
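A small sketch of what that means, assuming the attribute value was pulled out of the raw markup (e.g. by a regex) and is still escaped. expand_entities is a hypothetical helper name; html.unescape is the standard-library function doing the work.

```python
from html import unescape


def expand_entities(raw_value):
    """Turn an attribute value as written in the markup into the real URL."""
    # unescape handles named entities (&amp;) and numeric ones (&#38;, &#x26;).
    return unescape(raw_value)


# The markup says "&amp;", but the address to request contains a plain "&".
url = expand_entities("/search?q=d&amp;page=2")
```

A proper HTML parser normally does this for attribute values on its own, so the extra step mostly matters when the value was extracted by hand.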

Depressing.
