[ https://issues.apache.org/jira/browse/PDFBOX-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David KELLER updated PDFBOX-1845: --------------------------------- Priority: Blocker (was: Major) > PDDocument.load() give Error: Expected a long type at offset 1633 > ----------------------------------------------------------------- > > Key: PDFBOX-1845 > URL: https://issues.apache.org/jira/browse/PDFBOX-1845 > Project: PDFBox > Issue Type: Bug > Components: Parsing > Affects Versions: 1.8.0, 2.0.0 > Environment: Windows 8.1 > Reporter: David KELLER > Priority: Blocker > > I run this simple program with the file in attachment (scanned OCR document > from Nuance Omnipage 18) > public static void main(String[] args) > throws Exception { > System.out.println("Start SplitFileTest..."); > String path = > "D:\\test\\batch\\scan_manual\\courrier\\david.keller\\"; > String pdfFile = path + "14 01 2014.pdf"; > > FileInputStream pdfInputStream = new FileInputStream(pdfFile); > > PDDocument pdDocument = PDDocument.load(pdfInputStream); > List<PDPage> pages = > pdDocument.getDocumentCatalog().getAllPages(); > > pdfInputStream.close(); > } > And with the 1.8.0 version I have this error : > java.io.IOException: Error: Expected an integer type, actual='12977[373' > at > org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1622) > at > org.apache.pdfbox.pdfparser.PDFObjectStreamParser.parse(PDFObjectStreamParser.java:100) > at > org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:604) > at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:226) > at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1187) > And I have just builded the 2.0.0 from the last code source and I have this > error : > java.io.IOException: Error: Expected a long type at offset 1633 > at org.apache.pdfbox.pdfparser.BaseParser.readLong(BaseParser.java:1682) > at > org.apache.pdfbox.pdfparser.PDFObjectStreamParser.parse(PDFObjectStreamParser.java:100) > at > org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:663) > at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:244) > at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1101) > at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1069) -- This message was sent by Atlassian JIRA (v6.1.5#6160)