I'm using HTMLParser.py to parse XHTML, and an invalid tag is causing it to throw an exception. How do I handle this?
Below is the faulty markup. Notice the missing >. Both Firefox and IE6 correct this automatically, but HTMLParser is less forgiving. My code has to handle this gracefully because I don't have control over the XHTML source.
###/ <A NAME='anchor'</a> /###
What you want is to wrap check_for_whole_start_tag() in your own parser subclass. Call the parent's version; if it returns -1, apply some fuzzy logic (for example: if I add a '>', does it parse now?) and continue.
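
Here is a rough sketch of the same "add a '>' and try again" idea, with one deliberate swap: it hooks the public feed() method rather than check_for_whole_start_tag(), since the latter is an undocumented internal whose behavior differs between Python versions. It assumes Python 3's html.parser (the successor to HTMLParser.py) and that the whole document is passed in a single feed() call. The names close_unterminated_tags and ForgivingParser are made up for illustration.

```python
from html.parser import HTMLParser


def _starts_tag(text, idx):
    """True if the '<' at idx looks like the start of a tag or end tag."""
    nxt = text[idx + 1: idx + 2]
    return nxt.isalpha() or nxt == "/"


def close_unterminated_tags(text):
    """Insert a '>' wherever a new '<' appears before the current tag closed.

    Hypothetical helper: quoted attribute values are skipped, so a '<'
    inside quotes is left alone. An unbalanced quote will defeat it.
    """
    out = []
    in_tag = False
    quote = None
    for idx, ch in enumerate(text):
        if not in_tag:
            in_tag = ch == "<" and _starts_tag(text, idx)
        elif quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch == ">":
            in_tag = False
        elif ch == "<":
            # The previous tag never closed: add the missing '>'.
            out.append(">")
            in_tag = _starts_tag(text, idx)
        out.append(ch)
    return "".join(out)


class ForgivingParser(HTMLParser):
    """Hypothetical subclass that repairs the markup before parsing it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def feed(self, data):
        # Assumption: data holds complete tags (no tag split across calls).
        super().feed(close_unterminated_tags(data))

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


parser = ForgivingParser()
parser.feed("<A NAME='anchor'</a>")
parser.close()
print(parser.events)
```

Well-formed markup passes through close_unterminated_tags() unchanged, so the repair only kicks in for tags like the one above. If you feed the document in chunks, a tag split across two chunks would need extra buffering that this sketch does not attempt.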