On 13/05/2022 12:51, Xavier Morel wrote:
> You're parsing an HTML document. An HTML document necessarily has a root <html> and a body, so that's part of the error recovery of HTML parsers.
>
> If you don't want to parse an HTML document, you should probably use `fragment_fromstring`.

Thanks for the pointer. lxml doesn't like that either: it raises a ParserError on my input (see the snippet below). I didn't know about that function, though, so I'll read up on it.

=======
from lxml import html

# Read the HTML block to parse
with open("block1.html") as reader:
    block = reader.read()

# Raises lxml.etree.ParserError: Multiple elements found (div, div)
tree = html.fragment_fromstring(block)

print(html.tostring(tree))
=======
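
For reference, a minimal sketch of two possible ways around the error above, assuming block1.html holds several sibling top-level elements. The sample string is a hypothetical stand-in for the real file, which isn't shown here. `fragment_fromstring` accepts a `create_parent` argument that wraps the siblings in a single element, and `fragments_fromstring` (plural) returns the top-level elements as a list.

```python
from lxml import html

# Hypothetical stand-in for the contents of block1.html:
# two sibling <div> elements with no common parent.
block = "<div>first</div><div>second</div>"

# Option 1: wrap the siblings in a new parent element so that
# a single element can be returned.
wrapped = html.fragment_fromstring(block, create_parent="div")
print(html.tostring(wrapped))

# Option 2: keep the siblings separate and get them back as a list.
parts = html.fragments_fromstring(block)
for part in parts:
    print(html.tostring(part))
```

Which option fits depends on whether the rest of the code wants one tree to walk or a list of independent elements.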