Jörn Willhöft created FOP-3217:
----------------------------------

             Summary: Invalid XMP stream with Saxon on classpath
                 Key: FOP-3217
                 URL: https://issues.apache.org/jira/browse/FOP-3217
             Project: FOP
          Issue Type: Bug
          Components: renderer/pdf
    Affects Versions: 2.10
         Environment: FOP 2.10
Saxon 12.5 HE
OpenJDK 21
Fedora 41
            Reporter: Jörn Willhöft
         Attachments: PDFA3Xmp.xconf, PDFXMP.fo

FOP generates an invalid XMP stream when Saxon is on the classpath, 
resulting in invalid PDF/A-3 files. I originally ran into this problem in 
the context of Apache Camel, but finally traced the issue back to FOP. Here is a 
reproducible test procedure.

Please find the xconf and the static FO attached. I use the FOP binary package 
from 
[https://www.apache.org/dyn/closer.cgi?filename=/xmlgraphics/fop/binaries/fop-2.10-bin.tar.gz&action=download],
 the current Saxon 12.5 HE and OpenJDK 21 on Fedora 41.

Create PDF *without* Saxon:
{noformat}
$ fop-2.10/fop/fop -c PDFA3Xmp.xconf -fo PDFXMP.fo -pdf output-ok.pdf
{noformat}
Create PDF *with* Saxon:
{noformat}
$ export CLASSPATH=SaxonHE12-5J/saxon-he-12.5.jar:SaxonHE12-5J/lib/xmlresolver-5.2.2.jar:SaxonHE12-5J/lib/xmlresolver-5.2.2-data.jar:SaxonHE12-5J/lib/jline-2.14.6.jar
$ fop-2.10/fop/fop -c PDFA3Xmp.xconf -fo PDFXMP.fo -pdf output-saxon.pdf
{noformat}
Here the result *without* Saxon:
{noformat}
$ pdfinfo -meta output-ok.pdf
{noformat}
{code:xml}
...
       <rdf:Description xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" rdf:about="">
           <pdfaExtension:schemas>
               <rdf:Bag>
                   <rdf:li rdf:parseType="Resource">
                       <pdfaSchema:property xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#">
                           <rdf:Seq>
                               <rdf:li rdf:parseType="Resource">
                                   <pdfaProperty:name xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#">split</pdfaProperty:name>
                               </rdf:li>
                           </rdf:Seq>
                       </pdfaSchema:property>
                   </rdf:li>
               </rdf:Bag>
           </pdfaExtension:schemas>
       </rdf:Description>
...
{code}
 

Here the result *with* Saxon:
{noformat}
$ pdfinfo -meta output-saxon.pdf
{noformat}
{code:xml}
...
     <rdf:RDF xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" rdf:about="">
        <pdfaExtension:schemas>
           <rdf:Bag>
              <rdf:li rdf:parseType="Resource">
                 <pdfaSchema:property>
                    <rdf:Seq>
                       <rdf:li rdf:parseType="Resource">
                          <pdfaProperty:name>split</pdfaProperty:name>
                       </rdf:li>
                    </rdf:Seq>
                 </pdfaSchema:property>
              </rdf:li>
           </rdf:Bag>
        </pdfaExtension:schemas>
     </rdf:RDF>
...
{code}
Not only are the namespace declarations xmlns:pdfaSchema and xmlns:pdfaProperty 
missing, but the rdf:Description element is also now called rdf:RDF (?!).

 

*Expected behavior:* the XMP stream should be equivalent regardless of whether 
Saxon or only the JVM's default XSLT transformer is available. In 
particular, the result must be well-formed XML, with a namespace declaration 
for every prefix it uses.
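
The well-formedness requirement can be checked mechanically. The following is a minimal sketch, not part of FOP: it parses an XMP fragment with the JDK's namespace-aware DOM parser, which rejects any prefix that has no declaration. The class name XmpNamespaceCheck and the two sample strings are hypothetical; the samples are reduced fragments modeled on the outputs shown above, not actual FOP output.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class XmpNamespaceCheck {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical reduced fragment: every prefix in use is declared.
    static final String GOOD =
        "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\">"
      + "<rdf:Description rdf:about=\"\""
      + " xmlns:pdfaExtension=\"http://www.aiim.org/pdfa/ns/extension/\">"
      + "<pdfaExtension:schemas>"
      + "<pdfaSchema:property xmlns:pdfaSchema=\"http://www.aiim.org/pdfa/ns/schema#\"/>"
      + "</pdfaExtension:schemas>"
      + "</rdf:Description></rdf:RDF>";

    // Same fragment with the xmlns:pdfaSchema declaration dropped.
    static final String BAD = GOOD.replace(
        " xmlns:pdfaSchema=\"http://www.aiim.org/pdfa/ns/schema#\"", "");

    /** Returns true if the XML parses cleanly with a namespace-aware parser. */
    static boolean isNamespaceWellFormed(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for unbound prefixes to be detected
            DocumentBuilder builder = dbf.newDocumentBuilder();
            // Rethrow errors instead of printing them to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, e.g. an unbound namespace prefix
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("declared prefix: " + isNamespaceWellFormed(GOOD));
        System.out.println("missing prefix:  " + isNamespaceWellFormed(BAD));
    }
}
```

To check a real file, the text printed by pdfinfo -meta could be passed to isNamespaceWellFormed in place of the sample strings.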

*Possible workaround:* The JVM's default XSLT transformer can be set back to 
the system default with a Java command line parameter:

{noformat}
-Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl
{noformat}

*Caveat:* when Saxon is on the classpath, there is probably a reason for it, 
and other parts of the application might expect Saxon to be the default. These 
would then have to instantiate the Saxon transformer factory explicitly.
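
As a rough sketch of what explicit instantiation could look like, the standard JAXP call TransformerFactory.newInstance(String, ClassLoader) selects a factory by class name, independent of the system property and of classpath discovery. The class name ExplicitTransformerFactory and the helper byName are hypothetical. The Saxon class name net.sf.saxon.TransformerFactoryImpl is an assumption about the Saxon HE jar and is only referenced here, not loaded, so the example runs on a plain JDK.

```java
import javax.xml.transform.TransformerFactory;

public class ExplicitTransformerFactory {

    // JDK built-in XSLTC factory, the class named in the workaround above.
    static final String JDK_FACTORY =
        "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";

    // Assumption: the factory class shipped in the Saxon HE jar.
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    /** Creates a factory by class name, bypassing JAXP's lookup order. */
    static TransformerFactory byName(String className) {
        // A null ClassLoader makes JAXP use the thread context class loader.
        return TransformerFactory.newInstance(className, null);
    }

    public static void main(String[] args) {
        // Explicitly request the JDK implementation.
        System.out.println(byName(JDK_FACTORY).getClass().getName());

        // Java 9+: the JDK's own default, regardless of what is on the classpath.
        System.out.println(TransformerFactory.newDefaultInstance().getClass().getName());

        // Code that needs Saxon would call byName(SAXON_FACTORY) instead,
        // which requires the Saxon jar to be on the classpath.
    }
}
```

Selecting the factory per call site leaves the JVM-wide default in place, so code that needs Saxon and code that needs the JDK transformer can coexist in one application.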



