Thanks I will.

A final update for anyone who may want to use the code above: don't trust the 
results. I've just tested it against a web crawler I've been using for quite a 
while now, and in many cases the parsing still breaks, leaving some results 
out.

Until the parser can repair broken HTML, I recommend using a webdriver and 
querying the DOM there. 
[https://github.com/dom96/webdriver](https://github.com/dom96/webdriver) is a 
good starting point.
