Parsing HTML with xml.etree in Python 2.7?

Skip Montanaro Mon, 05 Oct 2015 07:16:26 -0700

Back before Fredrik Lundh's elementtree module was sucked into the Python
stdlib as xml.etree, I used to use his elementtidy extension module to
clean up HTML source so it could be parsed into an ElementTree object.
Elementtidy hasn't been updated in about ten years, and still assumes there
is a module named "elementtree" which it can import.


I wouldn't be surprised if there were some small API changes other than the
name change caused by the move into the xml package. Before I dive into a
rabbit hole and start to modify elementtidy, is there some other
stdlib-only way to parse HTML code into an xml.etree.ElementTree?
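
To make the question concrete, here is a rough sketch of the kind of
stdlib-only approach I am imagining: subclass the stdlib HTMLParser and
feed its events to xml.etree.ElementTree.TreeBuilder. The names
HTMLTreeBuilder, VOID_TAGS and parse_html are made up for illustration.
It assumes reasonably well-nested input: it closes tags left open and
ignores stray end tags, but does nothing like the cleanup tidy performs.
On 2.7, entity and character references would also need
handle_entityref/handle_charref, which this sketch omits.

```python
# Sketch only: turn HTMLParser events into an ElementTree via TreeBuilder.
try:
    from html.parser import HTMLParser      # Python 3
except ImportError:
    from HTMLParser import HTMLParser       # Python 2.7
from xml.etree import ElementTree as ET

# HTML elements that never take a closing tag (illustrative list).
VOID_TAGS = frozenset(["area", "base", "br", "col", "embed", "hr", "img",
                       "input", "link", "meta", "param", "source", "track",
                       "wbr"])


class HTMLTreeBuilder(HTMLParser):           # hypothetical helper class
    def __init__(self):
        HTMLParser.__init__(self)
        self._builder = ET.TreeBuilder()
        self._open = []                      # names of currently open tags

    def handle_starttag(self, tag, attrs):
        # Bare attributes such as "disabled" arrive with a value of None.
        attrib = dict((k, v if v is not None else "") for k, v in attrs)
        self._builder.start(tag, attrib)
        if tag in VOID_TAGS:
            self._builder.end(tag)           # close void elements at once
        else:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag not in self._open:
            return                           # stray or void end tag: ignore
        # Close everything opened since the matching start tag.
        while True:
            open_tag = self._open.pop()
            self._builder.end(open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        if self._open:                       # drop text outside the root
            self._builder.data(data)

    def close(self):
        HTMLParser.close(self)               # flush any buffered text
        while self._open:                    # close anything left open
            self._builder.end(self._open.pop())
        return self._builder.close()         # the root Element


def parse_html(text):
    parser = HTMLTreeBuilder()
    parser.feed(text)
    return parser.close()


root = parse_html("<html><body><p class='x'>Hello<br>world</p></body></html>")
tree = ET.ElementTree(root)                  # wrap the root if a tree is needed
```

The explicit stack is there because TreeBuilder itself expects matching
start/end calls; anything smarter about broken markup would have to be
layered on top.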

Thx,

Skip

-- 
https://mail.python.org/mailman/listinfo/python-list
