Hi,

Robin Diederen wrote:
> Hello all,
>
> I'm quite new to PDFbox and currently figuring out how to extract metadata
> from PDF files which is in XMP format.
>
> I have a few files containing XMP metadata, but I can not get any of those to
> work. And I can't seem to figure out where I am failing.
>
> A code snippet (all non-relevant code was deleted):
>
> String inputFile = "/some/file.pdf"
>
> PDDocument pdfDocument = null;
> pdfDocument = new PDDocument();
> pdfDocument = PDDocument.load(inputFile);
> PDMetadata pdfMetaData = new PDMetadata(pdfDocument);
>
> int metadataLength = pdfMetaData.getLength();
> System.out.println(pdfMetaData.getLength());
>
>
> pdfMetaData.exportXMPMetadata();
>
>
> The getLength call always returns 0; the exportXMPMetadata call returns an
> error:
>
> [Fatal Error] :-1:-1: Premature end of file.
> Exception in thread "main" java.io.IOException: Premature end of file.
>     at org.apache.jempbox.impl.XMLUtil.parse(XMLUtil.java:78)
>     at org.apache.jempbox.xmp.XMPMetadata.load(XMPMetadata.java:554)
>     at org.apache.pdfbox.pdmodel.common.PDMetadata.exportXMPMetadata(PDMetadata.java:86)
>     at com.robindiederen.pdf.Extractor.extractMetaDataFromXMP(Extractor.java:124)
>     at com.robindiederen.pdf.Extractor.main(Extractor.java:90)
>
>
>
> This happens for every PDF I test. Extracting metadata from the
> DocumentInformation table works as a charm. I'm using PDFbox 0.80 on Java 1.5.

Have a look at the PrintDocumentMetaData example to see how to extract a
document's metadata.
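
A likely explanation, assuming PDFBox behaves here the way I remember it:
new PDMetadata(pdfDocument) creates a fresh, empty metadata stream instead of
reading the one stored in the file, so getLength() is 0 and
exportXMPMetadata() hands an empty stream to the XML parser. The example gets
the existing stream through the document catalog instead
(getDocumentCatalog().getMetadata()), which may be null if the file carries no
XMP packet. The stand-alone sketch below uses only the JDK, not PDFBox, to show
that an XML parser rejects zero bytes of input; the class and method names are
made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class: not part of PDFBox or JempBox.
class EmptyXmpDemo {

    // Returns true if the bytes parse as a well-formed XML document.
    static boolean isParseableXml(byte[] xmlBytes) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xmlBytes));
            return true;
        } catch (SAXException e) {
            // An empty stream ends up here: there is no root element to read.
            // The JDK's default error handler may also log a "[Fatal Error]"
            // line to stderr before the exception is thrown.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Zero bytes, like a freshly created, empty metadata stream.
        System.out.println(isParseableXml(new byte[0]));
        // A minimal well-formed document for comparison.
        byte[] xml = "<x/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(isParseableXml(xml));
    }
}
```

So the parser error says nothing about the PDF itself; it only tells you that
the stream passed to it was empty.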

HTH
Andreas Lehmkühler
