[
https://issues.apache.org/jira/browse/PDFBOX-2913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937694#comment-17937694
]
Tilman Hausherr edited comment on PDFBOX-2913 at 11/13/25 11:26 AM:
--------------------------------------------------------------------
Wow, this issue is now almost 10 years old. I've tried a few things over the
years without success, so I should at least write down my thoughts and
observations.
Unlike other xmpbox changes I've made over the years, this won't be a few
lines. This rdf thing is partly supported, but not as a schema. "<rdf:value>"
isn't supported at all.
[^xmp673189-ok.xml] is another file with "<rdf:value>" that doesn't fail. But
it doesn't work properly either. When debugging why this worked, I looked at
this part:
{code:xml}
<desc:FileName rdf:parseType="Resource">
<rdf:value>E:\Pam_Ward\INS Forms-EB-2004\WIP
XFT\I-102_v5.xft</rdf:value>
<desc:ref>/template/subform[1]</desc:ref>
</desc:FileName>
{code}
and it returns "E:\Pam_Ward\INS Forms-EB-2004\WIP
XFT\I-102_v5.xft/template/subform[1]", i.e. the rdf:value and desc:ref texts
run together. This happens because this line is called:
{code:java}
manageSimpleType(xmp, property, Types.Text, container);
{code}
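A minimal standalone sketch of the effect, not xmpbox code. It assumes that treating the structured element as simple text amounts to reading its whole DOM text content, which concatenates the text of all child elements. The class name TextContentDemo, the method flatten and the namespace URI for "desc" are hypothetical, and the file name is shortened.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Standalone illustration only; none of this is taken from xmpbox.
class TextContentDemo {
    // The "desc" namespace URI is a placeholder, the real one is not shown here.
    static final String XML =
        "<desc:FileName xmlns:desc=\"http://example.com/desc/\""
        + " xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
        + " rdf:parseType=\"Resource\">"
        + "<rdf:value>I-102_v5.xft</rdf:value>"
        + "<desc:ref>/template/subform[1]</desc:ref>"
        + "</desc:FileName>";

    // Parses the fragment and returns the text content of its root element.
    static String flatten(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            // getTextContent() on an element joins the text of all descendants,
            // so the value and the sibling field become one string.
            return root.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(flatten(XML)); // prints I-102_v5.xft/template/subform[1]
    }
}
```

The structure of the element is lost as soon as it is handled as a single text value, which matches the concatenated result above.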
If I delete the "<rdf:value>A</rdf:value>" from the XMP in the issue
description, it still fails, because xmpidq:Scheme isn't implemented.
It's mentioned here:
https://pdfa.org/wp-content/uploads/2011/08/tn0008_predefined_xmp_properties_in_pdfa-1_2008-03-20.pdf
I tried to add it as a schema, but this doesn't work; it has to be an
AbstractSimpleProperty.
I had a look at all the 250000 files of the digitalcorpora corpus; none of
them has xmpidq. We could try to implement it, but I'm not sure how: this isn't
a full schema, it's a single property. Implementing it as a property (similar
to the GPS property) made it fail elsewhere. Implementing a minimal schema file
also didn't help.
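For reference, a minimal sketch of what a parser would have to separate in a qualified array item, using the XMP from the issue description: the rdf:value child carries the actual value and every other child is a qualifier. This is plain DOM code, not a proposal for how xmpbox should implement it. QualifiedLiDemo, QualifiedValue and readFirstLi are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Standalone illustration only; none of this is taken from xmpbox.
class QualifiedLiDemo {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    static final String XMP =
        "<rdf:Description xmlns:rdf=\"" + RDF_NS + "\" rdf:about=\"\""
        + " xmlns:xmp=\"http://ns.adobe.com/xap/1.0/\""
        + " xmlns:xmpidq=\"http://ns.adobe.com/xmp/Identifier/qual/1.0/\">"
        + "<xmp:Identifier><rdf:Bag>"
        + "<rdf:li rdf:parseType=\"Resource\">"
        + "<rdf:value>A</rdf:value>"
        + "<xmpidq:Scheme>http://archyvai.lt/pdf-ltud/2013/level/</xmpidq:Scheme>"
        + "</rdf:li>"
        + "</rdf:Bag></xmp:Identifier>"
        + "</rdf:Description>";

    // Hypothetical holder: the item value plus qualifiers keyed by "{namespace}localName".
    static final class QualifiedValue {
        String value;
        final Map<String, String> qualifiers = new LinkedHashMap<>();
    }

    // Reads the first rdf:li and splits its children into value and qualifiers.
    static QualifiedValue readFirstLi(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            Element li = (Element) root.getElementsByTagNameNS(RDF_NS, "li").item(0);
            QualifiedValue result = new QualifiedValue();
            NodeList children = li.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace and other non-element nodes
                }
                boolean isRdfValue = RDF_NS.equals(child.getNamespaceURI())
                    && "value".equals(child.getLocalName());
                if (isRdfValue) {
                    result.value = child.getTextContent();
                } else {
                    String key = "{" + child.getNamespaceURI() + "}" + child.getLocalName();
                    result.qualifiers.put(key, child.getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        QualifiedValue item = readFirstLi(XMP);
        System.out.println(item.value + " " + item.qualifiers);
    }
}
```

Keeping the qualifiers in a map next to the value is only one possible shape; the open question above is where such data would live in the xmpbox type model.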
> DomXmpParser fails on property containing qualifier
> ---------------------------------------------------
>
> Key: PDFBOX-2913
> URL: https://issues.apache.org/jira/browse/PDFBOX-2913
> Project: PDFBox
> Issue Type: Bug
> Components: XmpBox
> Affects Versions: 1.8.10
> Reporter: Petras
> Priority: Major
> Attachments: qualified_li.xmp, screenshot-1.png, xmp673189-ok.xml
>
>
> According to the XMP specification, properties may have qualifiers. In our
> scenario we used the {{xmp:Identifier}} element from the XMP Basic Schema,
> which holds an array of text strings. An array item may be qualified with
> {{xmpidq:Scheme}}:
> {code:xml}
> <rdf:Description rdf:about=""
>     xmlns:xmp="http://ns.adobe.com/xap/1.0/"
>     xmlns:xmpidq="http://ns.adobe.com/xmp/Identifier/qual/1.0/">
>   <xmp:Identifier>
>     <rdf:Bag>
>       <rdf:li rdf:parseType="Resource">
>         <rdf:value>A</rdf:value>
>         <xmpidq:Scheme>http://archyvai.lt/pdf-ltud/2013/level/</xmpidq:Scheme>
>       </rdf:li>
>     </rdf:Bag>
>   </xmp:Identifier>
> </rdf:Description>
> {code}
> {{DomXmpParser}} fails when parsing XMP containing such qualifiers:
> {code}
> org.apache.xmpbox.xml.XmpParsingException: Schema is not set in this document : http://www.w3.org/1999/02/22-rdf-syntax-ns#
>     at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:787)
>     at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:508)
>     at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:449)
>     at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:407)
>     at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:309)
>     at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:267)
>     at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:199)
>     at org.apache.xmpbox.TestXMPWithDefinedSchemas.main(TestXMPWithDefinedSchemas.java:66)
> ...
> {code}
> It appears to fail on the {{rdf:value}} element because the
> {{org.apache.xmpbox.type.TypeMapping}} class is not aware of the standard
> {{http://www.w3.org/1999/02/22-rdf-syntax-ns#}} namespace.