[ https://issues.apache.org/jira/browse/PDFBOX-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284363#comment-14284363 ]
Tilman Hausherr edited comment on PDFBOX-2610 at 1/20/15 8:36 PM:
------------------------------------------------------------------

[~msahyoun] Going back to the file good0002.pdf from PDFBOX-2416: it produces this error

{quote}
7.1 : Error on MetaData, Invalid array definition, expecting Bag and found com.sun.org.apache.xerces.internal.dom.DeferredTextImpl [prefix=photoshop; name=SupplementalCategories]
{quote}

The cause is this part of the XMP:

{code}
<rdf:Description rdf:about="" xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/">
  <photoshop:Headline>Photoshop Headline</photoshop:Headline>
  <photoshop:SupplementalCategories>Category 1, Category 2</photoshop:SupplementalCategories>
</rdf:Description>
{code}

In PDFBOX-2416, you wrote:

{quote}
I’d think that this is more related to how namespaces are handled in PDFBox XMP. The original issue could be a hint for that. According to the PDF/A and XMP specification the file should be validated OK.
{quote}

My understanding from reading the XMP specifications (parts 1 and 2) is that SupplementalCategories is an "unordered array of Text", so it should be serialized as an rdf:Bag, like this:

{code}
<photoshop:SupplementalCategories>
  <rdf:Bag>
    <rdf:li>nnn-nnn-nnnn</rdf:li>
    <rdf:li>xx...@yyy.zz</rdf:li>
    <rdf:li>unclassified</rdf:li>
  </rdf:Bag>
</photoshop:SupplementalCategories>
{code}

Can you explain why you think the file is OK? PDF Tools also thinks it is OK.
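To make the difference between the two serializations concrete, here is a minimal sketch that checks whether a property's values are wrapped in an rdf:Bag. It uses only the JDK's DOM parser and is not how PDFBox preflight or xmpbox performs the check; the class name XmpBagCheck, the method isSerializedAsBag and the two sample strings are hypothetical, and the xmlns:rdf declaration was added so that the fragments parse on their own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of PDFBox.
class XmpBagCheck {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String PHOTOSHOP_NS = "http://ns.adobe.com/photoshop/1.0/";

    // The flat form found in good0002.pdf (plain text value).
    static final String FLAT =
        "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\">"
        + "<rdf:Description rdf:about=\"\" xmlns:photoshop=\"" + PHOTOSHOP_NS + "\">"
        + "<photoshop:SupplementalCategories>Category 1, Category 2"
        + "</photoshop:SupplementalCategories>"
        + "</rdf:Description></rdf:RDF>";

    // The same values wrapped in rdf:Bag / rdf:li.
    static final String BAG =
        "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\">"
        + "<rdf:Description rdf:about=\"\" xmlns:photoshop=\"" + PHOTOSHOP_NS + "\">"
        + "<photoshop:SupplementalCategories><rdf:Bag>"
        + "<rdf:li>Category 1</rdf:li><rdf:li>Category 2</rdf:li>"
        + "</rdf:Bag></photoshop:SupplementalCategories>"
        + "</rdf:Description></rdf:RDF>";

    /** True if the property occurs and every occurrence has an rdf:Bag child element. */
    static boolean isSerializedAsBag(String xml, String namespace, String localName)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for the *NS lookups below
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList props = doc.getElementsByTagNameNS(namespace, localName);
        if (props.getLength() == 0) {
            return false;
        }
        for (int i = 0; i < props.getLength(); i++) {
            if (!hasBagChild((Element) props.item(i))) {
                return false;
            }
        }
        return true;
    }

    // Looks for a direct rdf:Bag child; text nodes (the flat form) do not count.
    private static boolean hasBagChild(Element property) {
        for (Node c = property.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE
                    && RDF_NS.equals(c.getNamespaceURI())
                    && "Bag".equals(c.getLocalName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("flat: "
                + isSerializedAsBag(FLAT, PHOTOSHOP_NS, "SupplementalCategories"));
        System.out.println("bag:  "
                + isSerializedAsBag(BAG, PHOTOSHOP_NS, "SupplementalCategories"));
    }
}
```

In the flat form the property's only child is a text node, which matches the DeferredTextImpl named in the error message above.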

> Expand Isartor test for Bavaria test suite and other tests
> ----------------------------------------------------------
>
>                 Key: PDFBOX-2610
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2610
>             Project: PDFBox
>          Issue Type: Task
>          Components: Preflight
>    Affects Versions: 2.0.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>
> 1) Expand the isartor test code so that it can also check conforming
> documents, i.e. documents that should not produce any errors. Support JBIG2.
> 2) Test the files from the Bavaria suite with preflight. I'll create
> sub-issues on that one. I counted 16 where something doesn't work as intended.
> 3) Include the Bavaria tests in the build. Only if we agree on this one. If
> not, I'll just keep it for myself as an additional regression test.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)