From my perspective, it would be great to have a general XMP parser that also allows for some variance from spec (PDFBOX-2855). We've been using jempbox for PDFs as well as images over on Tika, and it has worked well for us.
I'd prefer to continue using your XMP parser, but I understand if you need to limit what you're willing to take on. I'll take a look at xmlgraphics, and I'll discuss the fallback option of moving jempbox into Tika with the Tika devs.

Thank you.

Cheers,

Tim

-----Original Message-----
From: Maruan Sahyoun [mailto:sahy...@fileaffairs.de]
Sent: Thursday, July 09, 2015 4:56 AM
To: users@pdfbox.apache.org
Subject: Re: DomXmpParser: namespace not found

Hi,

> On 08.07.2015 at 22:42, Tilman Hausherr <thaush...@t-online.de> wrote:
>
> On 08.07.2015 at 17:22, Allison, Timothy B. wrote:
>> All,
>> Apologies for the idiocy I'm about to reveal (well, that won't be a
>> revelation to anyone, really), but is there an obvious solution for this
>> kind of error:
>>
>> Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a definition for the namespace http://ns.adobe.com/lightroom/1.0/
>>     at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
>>     at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
>>     at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
>>     at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
>>     at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
>>     at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)
>>
>> On a handful of image files in our test docs on Tika, I'm getting this with:
>>
>> http://ns.adobe.com/lightroom/1.0/
>> http://ns.adobe.com/exif/1.0/aux/
>>
>
> These namespaces are not supported by xmpbox. We've had this problem with
> another namespace (I can't remember which one), and it wasn't possible to
> support it because we couldn't find a schema definition.
>
> But you say these are image files. So this isn't about PDF XMP.

xmpbox is targeted at PDF/A-1.
So I think we should discuss extending it to support the metadata requirements of other PDF standards as well as generic XMP use cases, so that we have a generic XMP library again. OTOH, there is org.apache.xmlgraphics.xmp.

WDYT?

BR
Maruan

>
> Tilman
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
> For additional commands, e-mail: users-h...@pdfbox.apache.org
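
For anyone hitting the same exception, here is a minimal sketch of one way to find out which namespaces a packet uses before handing it to a strict parser. It relies only on the JDK's DOM parser and makes no xmpbox or jempbox calls. The class name XmpNamespaceScan, its method names, and the caller-supplied set of "known" namespaces are all hypothetical and are not part of any library mentioned in this thread. It does not make DomXmpParser accept the Lightroom or aux namespaces; it only reports them so that a caller could decide to skip or fall back.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper: lists the namespace URIs used in an XMP packet. */
public class XmpNamespaceScan {

    /** Returns every namespace URI used by an element or attribute in the packet. */
    public static Set<String> findNamespaces(String xmpPacket) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xmpPacket.getBytes(StandardCharsets.UTF_8)));
            Set<String> found = new TreeSet<>();
            collect(doc.getDocumentElement(), found);
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("Packet is not well-formed XML", e);
        }
    }

    /** Returns the namespaces in the packet that are not in the caller's known set. */
    public static Set<String> unknownNamespaces(String xmpPacket, Set<String> known) {
        Set<String> unknown = findNamespaces(xmpPacket);
        unknown.removeAll(known);
        return unknown;
    }

    // Walks the element tree depth-first, recording element and attribute namespaces.
    private static void collect(Element element, Set<String> found) {
        addIfRelevant(element.getNamespaceURI(), found);
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            addIfRelevant(attributes.item(i).getNamespaceURI(), found);
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, found);
            }
        }
    }

    // Skips null URIs and the reserved namespace of the xmlns declarations themselves.
    private static void addIfRelevant(String uri, Set<String> found) {
        if (uri != null && !XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(uri)) {
            found.add(uri);
        }
    }
}
```

A caller could pass the packet extracted from an image together with whatever set of namespaces its XMP parser is known to handle, and treat a non-empty result from unknownNamespaces as a signal to use a more lenient code path.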