Hi,
Schimon Jehudah wrote on 27.08.25 at 09:19:
The function is at:
https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/parser/xslt.py
Is this parsed as HTML? With which options?
Yes. I suppose so.
    <xsl:output
        encoding = 'UTF-8'
        indent = 'yes'
        media-type = 'text/xml'
        method = 'html'
        omit-xml-declaration = 'no'
        version = '4.01' />
So, this is your Python code running the transformation:
    def transform(filepath_xml, filepath_xslt):
        tree = ET.parse(filepath_xml)
        xslt_stylesheet = ET.parse(filepath_xslt)
        xslt_transform = ET.XSLT(xslt_stylesheet)
        newdom = xslt_transform(tree)
        xml_data_bytes = ET.tostring(newdom, pretty_print=True)
        xml_data_str = xml_data_bytes.decode("utf-8")
        return xml_data_str
Since you're apparently using "<xsl:output>" to configure the output,
"tostring()" is the wrong way of serialising the result, because it does
not know about your XSLT output configuration. Instead, use e.g.
    xml_data_bytes = memoryview(newdom)
    xml_data_str = str(xml_data_bytes, 'UTF-8')
or, if you intend to write to a file:
    newdom.write_output("somefile.xml")
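Put together, your function could look roughly like the sketch below. It assumes that "ET" is "lxml.etree" and that the stylesheet declares UTF-8 output, as your "<xsl:output>" does; adapt the decoding if you change the encoding there.

```python
from lxml import etree as ET


def transform(filepath_xml, filepath_xslt):
    """Apply the stylesheet and serialise the result the way <xsl:output> asks."""
    tree = ET.parse(filepath_xml)
    xslt_transform = ET.XSLT(ET.parse(filepath_xslt))
    newdom = xslt_transform(tree)
    # The buffer view of the result tree holds the bytes written by the
    # XSLT serialiser, so the method and encoding from <xsl:output> apply.
    # Decoding as UTF-8 is an assumption that matches encoding='UTF-8'.
    return str(memoryview(newdom), "UTF-8")
```

The return value is still a "str", so callers of the old function should not need to change.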
You were using XML serialisation instead of HTML serialisation. That
certainly makes a difference.
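To see the difference for yourself, something like the following should do. The stylesheet and the input document here are made up for illustration and are not taken from your project.

```python
from lxml import etree as ET

# Hypothetical stylesheet and input, invented for this comparison only.
XSLT_SOURCE = b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>
  <xsl:template match="/">
    <html><body><p><xsl:value-of select="/doc"/></p><br/></body></html>
  </xsl:template>
</xsl:stylesheet>
"""

xslt_transform = ET.XSLT(ET.fromstring(XSLT_SOURCE))
result = xslt_transform(ET.fromstring(b"<doc>hello</doc>"))

# Generic XML serialisation: does not look at <xsl:output>.
as_xml = ET.tostring(result).decode("utf-8")

# Serialisation by the XSLT machinery: follows method="html".
as_html = str(memoryview(result), "UTF-8")

print(as_xml)
print(as_html)
```

The empty "br" element is a good place to look: HTML output writes void elements without a closing slash, whereas XML output normally self-closes them.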
If this doesn't solve your issue, I'd suggest trying to reproduce the
misbehaviour with the "xsltproc" program that comes with libxslt and if you
can make that show the same behaviour, report it to the libxslt project.
It's probably not lxml that's responsible here.
Stefan