On 13/05/2022 12:51, Xavier Morel wrote:
> You're parsing an HTML document. An HTML document necessarily has a
> root <html> and a body, so that's part of the error recovery of HTML
> parsers.
>
> If you don't want to parse an HTML document, you should probably use
> `fragment_fromstring`.

Thanks for the pointer. lxml doesn't accept my input with that function
either (see the error in the snippet below), but I'll read up on it; I
didn't know about it.
=======
from lxml import html

with open("block1.html") as reader:
    block = reader.read()

# Raises lxml.etree.ParserError: Multiple elements found (div, div)
tree = html.fragment_fromstring(block)
print(html.tostring(tree))
=======
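
For what it's worth, here is a minimal sketch of two ways that might get
around the "Multiple elements found" error. It assumes the file holds
several sibling top-level elements; the `block` string is a made-up
stand-in for block1.html, not the real content. `create_parent` and
`fragments_fromstring` are both part of `lxml.html`, but I haven't checked
how either behaves on the actual file.

```python
from lxml import html

# Hypothetical stand-in for the contents of block1.html: two sibling divs.
block = "<div>first</div><div>second</div>"

# Option 1: ask lxml to wrap the siblings in a single parent element,
# so that there is exactly one root to return.
tree = html.fragment_fromstring(block, create_parent="div")
print(html.tostring(tree))

# Option 2: get the top-level nodes back as a list instead of one element.
for node in html.fragments_fromstring(block):
    print(html.tostring(node))
```

Option 1 is probably the better fit if a single tree is needed afterwards;
option 2 keeps the elements separate and adds no wrapper.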