[ https://issues.apache.org/jira/browse/PDFBOX-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385378#comment-14385378 ]
Tilman Hausherr commented on PDFBOX-2733:
-----------------------------------------

The file is broken. Here's an excerpt of the validation from PDF-Tools (this is for PDF/A-1b, I've deleted the parts only relevant to PDF/A):
{quote}
Validating file "PDFBOX-2733.pdf" for conformance level pdfa-1b
The 'xref' keyword was not found or the xref table is malformed.
The file trailer dictionary is missing or invalid.
The comment, classifying the file as containing 8-bit binary data, is missing.
The file trailer dictionary must have an id key.
The file format (header, trailer, objects, xref, streams) is corrupted.
{quote}

One of the causes is this:
{code}
<< /Prev 0 /Root 5 0 R /Size 6 >>
{code}
Definition of Prev:
{quote}
The byte offset from the beginning of the file to the beginning of the previous cross-reference section.
{quote}
So it makes no sense that it is 0.

Adobe Reader offers to save the file when closing. It does this when the file is broken.

I'll test a small fix. But if you can, you should return the scanner to the seller :-)

> Nullpointer exception in PDFXrefStreamParser.parse
> --------------------------------------------------
>
>                 Key: PDFBOX-2733
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2733
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.0
>         Environment: windows 7
>            Reporter: jerome girardini
>         Attachments: scan-canon-windows8.pdf
>
>
> With some PDFs, a NullPointerException is thrown during parsing.
> +{quote}
> Here is the trace:
> Caused by: java.lang.NullPointerException
>       at org.apache.pdfbox.pdfparser.PDFXrefStreamParser.parse(PDFXrefStreamParser.java:91)
>       at org.apache.pdfbox.pdfparser.COSParser.parseXrefStream(COSParser.java:1836)
>       at org.apache.pdfbox.pdfparser.COSParser.parseXrefObjStream(COSParser.java:320)
>       at org.apache.pdfbox.pdfparser.COSParser.parseXref(COSParser.java:280)
>       at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:314)
>       at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:373)
>       at ch.ge.afc.ael.commun.piecejointe.UtiPdf.loadDocument(UtiPdf.java:439)
> {quote}+

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
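
To illustrate why a /Prev value of 0 cannot be right, here is a minimal sketch of a plausibility check for that entry. It is not the fix tested for this issue and it does not use the PDFBox API. The class name PrevOffsetCheck and the methods isPlausiblePrev and startsWith are hypothetical, and the check is only a heuristic: it assumes a cross-reference section begins either with the keyword "xref" or with an object header such as "12 0 obj" (a cross-reference stream).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of PDFBox.
class PrevOffsetCheck {

    /** Returns true if prevOffset could plausibly point at a cross-reference section. */
    static boolean isPlausiblePrev(byte[] pdf, long prevOffset) {
        // Offset 0 is where the "%PDF-" header starts, so it can never be an xref section.
        if (prevOffset <= 0 || prevOffset >= pdf.length) {
            return false;
        }
        int pos = (int) prevOffset; // safe: prevOffset < pdf.length
        // A classic cross-reference table starts with the keyword "xref".
        if (startsWith(pdf, pos, "xref")) {
            return true;
        }
        // A cross-reference stream starts with an object header, e.g. "12 0 obj".
        int end = Math.min(pdf.length, pos + 32);
        String head = new String(pdf, pos, end - pos, StandardCharsets.ISO_8859_1);
        return head.matches("(?s)\\d+\\s+\\d+\\s+obj.*");
    }

    /** Returns true if the ASCII keyword occurs in data exactly at position pos. */
    static boolean startsWith(byte[] data, int pos, String keyword) {
        byte[] k = keyword.getBytes(StandardCharsets.US_ASCII);
        if (pos + k.length > data.length) {
            return false;
        }
        for (int i = 0; i < k.length; i++) {
            if (data[pos + i] != k[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Tiny PDF-like fragment, just enough to exercise the check.
        String text = "%PDF-1.4\n1 0 obj\n<< >>\nendobj\nxref\n0 1\n";
        byte[] pdf = text.getBytes(StandardCharsets.ISO_8859_1);
        int xrefPos = text.indexOf("xref");

        System.out.println(isPlausiblePrev(pdf, 0));       // the value found in the broken trailer
        System.out.println(isPlausiblePrev(pdf, xrefPos)); // offset of the real xref table
    }
}
```

A parser could use a check of this kind to ignore an implausible /Prev entry instead of following it, which is one possible way to avoid reading a cross-reference section from a position that does not contain one.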