Hi Brian & Charlie,

I'm not the OP, but FYI, I can see the same issue (on an Intel Mac):

aid@orac tmp % ./tail.py
Python : sys.version_info(major=3, minor=9, micro=13, releaselevel='final', serial=0)
lxml.etree : (4, 9, 0, 0)
libxml used : (2, 9, 14)
libxml compiled : (2, 9, 14)
libxslt used : (1, 1, 35)
libxslt compiled : (1, 1, 35)
b'<form action="action1">\n</form>\n</body>\n</html>\n'

You can see my machine is using libxml2 2.9.14 (with lxml 4.9.0), which is a pity, as in the thread you linked to it looked like the issue would have been resolved in that version...

However, I found that if you update the call to etree.tostring() to use method='html', the trailing body and html closing tags are no longer shown, i.e.:

print(etree.tostring(nodeList[0], method='html'))

With that update made, the script outputs the desired result:

aid@orac tmp % python3 -i tail.py
Python : sys.version_info(major=3, minor=9, micro=13, releaselevel='final', serial=0)
lxml.etree : (4, 9, 0, 0)
libxml used : (2, 9, 14)
libxml compiled : (2, 9, 14)
libxslt used : (1, 1, 35)
libxslt compiled : (1, 1, 35)
b'<form action="action1">\n</form>\n'

I've no idea why this behaviour seems to have changed...

Kind regards

aid

> On 7 Jun 2022, at 17:02, Charlie Clark <[email protected]> wrote:
>
> On 7 Jun 2022, at 16:56, [email protected] wrote:
>
> In more recent versions of lxml the tostring() method can return extra text after the closing tag of the node I've passed to it. So instead of returning
>
> b'\n\n'
>
> it returns
>
> b'\n\n\n\n'
>
> This looks a lot like this
> https://mail.python.org/archives/list/[email protected]/thread/LCTOSIIWGGALAMSZAYHRRYUWYDRESCUO/
>
> Can you update your version of libxml2?
>
> Charlie
>
> --
> Charlie Clark
> Managing Director
> Clark Consulting & Research
> German Office
> Sengelsweg 34
> Düsseldorf
> D- 40489
> Tel: +49-203-3925-0390
> Mobile: +49-178-782-6226
_______________________________________________
lxml - The Python XML Toolkit mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/lxml.python.org/
Member address: [email protected]
