Jörn Willhöft created FOP-3217:
----------------------------------
Summary: Invalid XMP stream with Saxon on classpath
Key: FOP-3217
URL: https://issues.apache.org/jira/browse/FOP-3217
Project: FOP
Issue Type: Bug
Components: renderer/pdf
Affects Versions: 2.10
Environment: FOP 2.10
Saxon 12.5 HE
OpenJDK 21
Fedora 41
Reporter: Jörn Willhöft
Attachments: PDFA3Xmp.xconf, PDFXMP.fo
FOP generates an invalid XMP stream when Saxon is on the classpath, which
results in invalid PDF/A-3 files. I originally ran into this problem in the
context of Apache Camel, but eventually traced the issue back to FOP. Here is a
reproducible test procedure.
Please find the xconf and the static FO attached. I use the FOP binary package
from
[https://www.apache.org/dyn/closer.cgi?filename=/xmlgraphics/fop/binaries/fop-2.10-bin.tar.gz&action=download],
the current Saxon 12.5 HE and OpenJDK 21 on Fedora 41.
Create PDF *without* Saxon:
{noformat}
$ fop-2.10/fop/fop -c PDFA3Xmp.xconf -fo PDFXMP.fo -pdf output-ok.pdf
{noformat}
Create PDF *with* Saxon:
{noformat}
$ export CLASSPATH=SaxonHE12-5J/saxon-he-12.5.jar:SaxonHE12-5J/lib/xmlresolver-5.2.2.jar:SaxonHE12-5J/lib/xmlresolver-5.2.2-data.jar:SaxonHE12-5J/lib/jline-2.14.6.jar
$ fop-2.10/fop/fop -c PDFA3Xmp.xconf -fo PDFXMP.fo -pdf output-saxon.pdf
{noformat}
Here is the result *without* Saxon:
{noformat}
$ pdfinfo -meta output-ok.pdf
{noformat}
{code:xml}
...
<rdf:Description
xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" rdf:about="">
<pdfaExtension:schemas>
<rdf:Bag>
<rdf:li rdf:parseType="Resource">
<pdfaSchema:property
xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#">
<rdf:Seq>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name
xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#">split</pdfaProperty:name>
</rdf:li>
</rdf:Seq>
</pdfaSchema:property>
</rdf:li>
</rdf:Bag>
</pdfaExtension:schemas>
</rdf:Description>
...
{code}
Here is the result *with* Saxon:
{noformat}
$ pdfinfo -meta output-saxon.pdf
{noformat}
{code:xml}
...
<rdf:RDF xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/"
rdf:about="">
<pdfaExtension:schemas>
<rdf:Bag>
<rdf:li rdf:parseType="Resource">
<pdfaSchema:property>
<rdf:Seq>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>split</pdfaProperty:name>
</rdf:li>
</rdf:Seq>
</pdfaSchema:property>
</rdf:li>
</rdf:Bag>
</pdfaExtension:schemas>
</rdf:RDF>
...
{code}
Not only are the namespace declarations xmlns:pdfaSchema and xmlns:pdfaProperty
missing, but the rdf:Description element is also now called rdf:RDF (?!)
*Expected behavior:* the XMP stream should be equivalent regardless of whether
Saxon or only the system's default XSLT transformer is available. In
particular, the result must be well-formed XML, including a namespace
declaration for every prefix that is used.
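The well-formedness requirement can be checked mechanically. The following is a minimal sketch, not part of FOP: the class and method names are made up for illustration, and the two sample strings are shortened variants of the fragments shown above. It parses a fragment with the JDK's built-in namespace-aware parser, which rejects an element whose prefix has no declaration in scope.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not a FOP class.
final class XmpNamespaceCheck {

    // Returns true if the XML parses with namespace processing enabled,
    // i.e. every prefix in use has a declaration in scope.
    static boolean isNamespaceWellFormed(String xml) {
        try {
            // newDefaultInstance() (Java 9+) ignores parsers on the classpath.
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newDefaultInstance();
            dbf.setNamespaceAware(true);
            DocumentBuilder builder = dbf.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String rdfNs = "xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"";
        // Shortened form of the output without Saxon: prefix is declared.
        String declared = "<rdf:li " + rdfNs + " rdf:parseType=\"Resource\">"
                + "<pdfaProperty:name xmlns:pdfaProperty="
                + "\"http://www.aiim.org/pdfa/ns/property#\">split</pdfaProperty:name>"
                + "</rdf:li>";
        // Shortened form of the output with Saxon: declaration is missing.
        String undeclared = "<rdf:li " + rdfNs + " rdf:parseType=\"Resource\">"
                + "<pdfaProperty:name>split</pdfaProperty:name>"
                + "</rdf:li>";
        System.out.println("declared:   " + isNamespaceWellFormed(declared));
        System.out.println("undeclared: " + isNamespaceWellFormed(undeclared));
    }
}
```

To apply this to a real file, the XMP packet would first have to be extracted, for example from the pdfinfo -meta output.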
*Possible workaround:* The JVM's default XSLT transformer can be set back to
the system default with a Java command line parameter:
{noformat}
-Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl
{noformat}
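Where the JVM command line cannot be changed, the same property can presumably be set in code before the first transformer factory lookup. This is a sketch under the assumption of an OpenJDK-based runtime, where the class named in the parameter above is the built-in implementation; the wrapper class name is hypothetical.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical wrapper for illustration only; not a FOP class.
final class DefaultTransformerWorkaround {

    // Built-in XSLT implementation on OpenJDK (same value as the -D parameter).
    static final String JDK_FACTORY =
            "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";

    // Sets the JAXP lookup property and reports which factory is then selected.
    static String activeFactoryAfterOverride() {
        System.setProperty("javax.xml.transform.TransformerFactory", JDK_FACTORY);
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(activeFactoryAfterOverride());
    }
}
```

Printing the factory class name this way is also a quick check of which implementation a given classpath actually selects.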
*Caveat:* when Saxon is on the classpath, there is probably a reason for it,
and other parts of the application might expect Saxon to be the default. These
would then have to instantiate the Saxon transformer factory explicitly.
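Explicit instantiation could look roughly like the following sketch. It uses the standard JAXP call TransformerFactory.newInstance(String, ClassLoader) together with Saxon's factory class name net.sf.saxon.TransformerFactoryImpl; the wrapper class and method names are hypothetical.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

// Hypothetical wrapper for illustration only.
final class ExplicitFactory {

    // Saxon's JAXP factory class; requires the Saxon jar on the classpath.
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    // Loads the named factory directly, bypassing the JAXP lookup that the
    // javax.xml.transform.TransformerFactory system property controls.
    static TransformerFactory byClassName(String className) {
        // null selects the current thread's context class loader.
        return TransformerFactory.newInstance(className, null);
    }

    public static void main(String[] args) {
        try {
            TransformerFactory saxon = byClassName(SAXON_FACTORY);
            System.out.println(saxon.getClass().getName());
        } catch (TransformerFactoryConfigurationError e) {
            System.out.println("Saxon is not on the classpath: " + e.getMessage());
        }
    }
}
```

Code that obtains its factory this way is unaffected by the system property, so it keeps using Saxon while FOP falls back to the JVM default.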