Oliver Schmidtmer created PDFBOX-5835:
-----------------------------------------

             Summary: DomXmpParser - IllegalArgumentException: prefix cannot be "null" when creating a QName
                 Key: PDFBOX-5835
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5835
             Project: PDFBox
          Issue Type: Bug
          Components: XmpBox
    Affects Versions: 3.0.2 PDFBox
            Reporter: Oliver Schmidtmer


I've got a PDF where parsing the metadata fails with an IllegalArgumentException:
{code:java}
java.lang.IllegalArgumentException: prefix cannot be "null" when creating a QName
        at java.xml/javax.xml.namespace.QName.<init>(QName.java:192)
        at org.apache.xmpbox.xml.DomHelper.getQName(DomHelper.java:99)
        at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:306)
        at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:250)
        at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:201)
        at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:112)
{code}
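The top frame of the trace is the javax.xml.namespace.QName constructor, which rejects a null prefix but accepts the empty string. The following is a minimal sketch of that behaviour in isolation; the class and method names (QNameNullPrefixDemo, tryCreate) are made up for illustration and are not part of XmpBox.

```java
import javax.xml.namespace.QName;

public class QNameNullPrefixDemo
{
    // Hypothetical helper: returns the exception message if the QName
    // constructor rejects its arguments, or null if construction succeeds.
    static String tryCreate(String namespaceUri, String localPart, String prefix)
    {
        try
        {
            new QName(namespaceUri, localPart, prefix);
            return null;
        }
        catch (IllegalArgumentException e)
        {
            return e.getMessage();
        }
    }

    public static void main(String[] args)
    {
        String ns = "http://www.aiim.org/pdfa/ns/extension/";
        // An explicit prefix and the empty (default) prefix are both accepted.
        System.out.println(tryCreate(ns, "schemas", "pdfaExtension"));
        System.out.println(tryCreate(ns, "schemas", ""));
        // A null prefix is rejected with an IllegalArgumentException.
        System.out.println(tryCreate(ns, "schemas", null));
    }
}
```

So any caller that passes a prefix straight through from another API has to make sure that API never returns null for "no prefix".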
This can be reproduced with a simple test, using the extracted metadata:
{code:java}
    @Test
    void testDomXmpParser() throws XmpParsingException
    {
        // taken from file test-landscape2.pdf
        String xmpmeta = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n" +
                "<?xpacket begin=\"\uFEFF\" id=\"W5M0MpCehiHzreSzNTczkc9d\"?><x:xmpmeta xmlns:x=\"adobe:ns:meta/\" x:xmptk=\"FIS/xee\">\n" +
                " <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n" +
                " <rdf:Description xmlns:pdfaid=\"http://www.aiim.org/pdfa/ns/id/\">\n" +
                "   <pdfaid:part>3</pdfaid:part>\n" +
                "   <pdfaid:conformance>A</pdfaid:conformance>\n" +
                "  </rdf:Description>\n" +
                "  <rdf:Description xmlns:pdfaExtension=\"http://www.aiim.org/pdfa/ns/extension/\" xmlns:pdfaField=\"http://www.aiim.org/pdfa/ns/field#\" xmlns:pdfaProperty=\"http://www.aiim.org/pdfa/ns/property#\" xmlns:pdfaSchema=\"http://www.aiim.org/pdfa/ns/schema#\" xmlns:pdfaType=\"http://www.aiim.org/pdfa/ns/type#\" rdf:about=\"\"/>\n" +
                "  <rdf:Description>\n" +
                "   <schemas xmlns=\"http://www.aiim.org/pdfa/ns/extension/\">\n" +
                "    <rdf:Bag>\n" +
                "     <rdf:li rdf:parseType=\"Resource\">\n" +
                "      <schema xmlns=\"http://www.aiim.org/pdfa/ns/schema#\">ZUGFeRD PDFA Extension Schema</schema>\n" +
                "      <namespaceURI xmlns=\"http://www.aiim.org/pdfa/ns/schema#\">urn:ferd:pdfa:CrossIndustryDocument:invoice:1p0#</namespaceURI>\n" +
                "      <prefix xmlns=\"http://www.aiim.org/pdfa/ns/schema#\">zf</prefix>\n" +
                "      <property xmlns=\"http://www.aiim.org/pdfa/ns/schema#\">\n" +
                "       <rdf:Seq>\n" +
                "        <rdf:li rdf:parseType=\"Resource\">\n" +
                "         <name xmlns=\"http://www.aiim.org/pdfa/ns/property#\">DocumentFileName</name>\n" +
                "         <valueType xmlns=\"http://www.aiim.org/pdfa/ns/property#\">Text</valueType>\n" +
                "         <category xmlns=\"http://www.aiim.org/pdfa/ns/property#\">external</category>\n" +
                "         <description xmlns=\"http://www.aiim.org/pdfa/ns/property#\">name of the embedded XML invoice file</description>\n" +
                "        </rdf:li>\n" +
                "        <rdf:li rdf:parseType=\"Resource\">\n" +
                "         <name xmlns=\"http://www.aiim.org/pdfa/ns/property#\">DocumentType</name>\n" +
                "         <valueType xmlns=\"http://www.aiim.org/pdfa/ns/property#\">Text</valueType>\n" +
                "         <category xmlns=\"http://www.aiim.org/pdfa/ns/property#\">external</category>\n" +
                "         <description xmlns=\"http://www.aiim.org/pdfa/ns/property#\">INVOICE</description>\n" +
                "        </rdf:li>\n" +
                "        <rdf:li rdf:parseType=\"Resource\">\n" +
                "         <name xmlns=\"http://www.aiim.org/pdfa/ns/property#\">Version</name>\n" +
                "         <valueType xmlns=\"http://www.aiim.org/pdfa/ns/property#\">Text</valueType>\n" +
                "         <category xmlns=\"http://www.aiim.org/pdfa/ns/property#\">external</category>\n" +
                "         <description xmlns=\"http://www.aiim.org/pdfa/ns/property#\">The actual version of the ZUGFeRD data</description>\n" +
                "        </rdf:li>\n" +
                "        <rdf:li rdf:parseType=\"Resource\">\n" +
                "         <name xmlns=\"http://www.aiim.org/pdfa/ns/property#\">ConformanceLevel</name>\n" +
                "         <valueType xmlns=\"http://www.aiim.org/pdfa/ns/property#\">Text</valueType>\n" +
                "         <category xmlns=\"http://www.aiim.org/pdfa/ns/property#\">external</category>\n" +
                "         <description xmlns=\"http://www.aiim.org/pdfa/ns/property#\">The conformance level of the ZUGFeRD data</description>\n" +
                "        </rdf:li>\n" +
                "       </rdf:Seq>\n" +
                "      </property>\n" +
                "     </rdf:li>\n" +
                "    </rdf:Bag>\n" +
                "   </schemas>\n" +
                "  </rdf:Description>\n" +
                "  <rdf:Description xmlns:zf=\"urn:ferd:pdfa:CrossIndustryDocument:invoice:1p0#\" rdf:about=\"\" zf:ConformanceLevel=\"EXTENDED\" zf:DocumentFileName=\"ZUGFeRD-invoice.xml\" zf:DocumentType=\"INVOICE\" zf:Version=\"1.0\"/>\n" +
                " </rdf:RDF>\n" +
                "</x:xmpmeta><?xpacket end=\"w\"?>\n";
        DomXmpParser xmpParser = new DomXmpParser();
        xmpParser.setStrictParsing(false);
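        // encode explicitly as UTF-8 to match the encoding declared in the XML prolog
        XMPMetadata xmp = xmpParser.parse(xmpmeta.getBytes(java.nio.charset.StandardCharsets.UTF_8));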
    }
{code}
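A likely trigger, judging from the metadata in the test, is that elements such as <schemas xmlns="..."> declare a default namespace instead of using a prefix, and a namespace-aware DOM reports getPrefix() as null for such elements. The sketch below checks that assumption with the JDK parser alone and shows one hypothetical null-safe mapping (toQName). The class and both method names are illustrative; the code is not taken from XmpBox and is not necessarily how the issue should be resolved there.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class DefaultNamespacePrefixDemo
{
    // Parses the XML with a namespace-aware JDK DOM parser and returns the root element.
    static Element parseRoot(String xml)
    {
        try
        {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        }
        catch (Exception e)
        {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Hypothetical helper: maps a null DOM prefix to the empty default prefix
    // before building the QName, so the constructor does not throw.
    static QName toQName(Element element)
    {
        String prefix = element.getPrefix();
        return new QName(element.getNamespaceURI(), element.getLocalName(),
                prefix == null ? XMLConstants.DEFAULT_NS_PREFIX : prefix);
    }

    public static void main(String[] args)
    {
        // Same shape as the <schemas> element in the failing metadata.
        Element schemas = parseRoot(
                "<schemas xmlns=\"http://www.aiim.org/pdfa/ns/extension/\"/>");
        System.out.println("DOM prefix: " + schemas.getPrefix());
        System.out.println("QName: " + toQName(schemas));
    }
}
```

Whether an empty prefix is acceptable to the rest of the XMP schema handling is a separate question that this sketch does not answer.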



