[ https://issues.apache.org/jira/browse/PDFBOX-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved PDFBOX-813.
----------------------------------

       Resolution: Fixed
    Fix Version/s: 1.3.0
         Assignee: Jukka Zitting

Fixed in revision 1022444 by explicitly checking the type of the Root object.

I think this is a better solution than dropping (or deprecating) the forceParsing option, as there are quite a few malformed PDFs out there that can still be processed reasonably well even with relaxed parsing rules. Sometimes this results in PDDocuments with unexpected internal structures, but it's IMHO better to try degrading gracefully in such cases.

> ClassCastException: COSInteger cannot be cast to COSDictionary
> --------------------------------------------------------------
>
>                 Key: PDFBOX-813
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-813
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.2.1, 1.3.0
>         Environment: Windows XP
> java version "1.6.0_12"
> Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
> Java HotSpot(TM) Client VM (build 11.2-b01, mixed mode, sharing)
>            Reporter: CP
>            Assignee: Jukka Zitting
>            Priority: Critical
>             Fix For: 1.3.0
>
>         Attachments: CancerSummReport_34914.pdf, PDFBoxBug.java
>
>
> I get the below exceptions when calling
> pdfDoc.getDocumentCatalog().getAllPages(). The code continues after the first
> exception because I've called
> PDDocument.load("C:/CancerSummReport_34914.pdf", true) setting the load
> "force" param to true. The second exception causes the code to abort.
> (I will try uploading the PDF that causes this problem)
>
> 2010-09-02 16:47:47,521 [main] WARN (PDFParser.java:189) - Parsing Error, Skipping Object
> java.io.IOException: Error: Expected an integer type, actual='bj'
>     at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1310)
>     at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:497)
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:878)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:843)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:768)
>     at com.xyz.framework.functionalTests.PDFBoxBug.main(PDFBoxBug.java:16)
> 2010-09-02 16:47:47,552 [main] WARN (BaseParser.java:215) - Invalid dictionary, found:? but expected:''
> Exception in thread "main" java.lang.ClassCastException: org.apache.pdfbox.cos.COSInteger cannot be cast to org.apache.pdfbox.cos.COSDictionary
>     at org.apache.pdfbox.pdmodel.PDDocument.getDocumentCatalog(PDDocument.java:414)
>     at com.xyz.framework.functionalTests.PDFBoxBug.main(PDFBoxBug.java:18)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
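
For readers of the archive, the idea behind the resolution (checking the type of the Root object before casting it) can be illustrated roughly as follows. This is only a sketch, not the code from revision 1022444: the CosBase, CosInteger and CosDictionary classes are simplified, hypothetical stand-ins for the PDFBox COS model, and getRootDictionary and its null fallback are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class RootTypeCheckSketch {

    // Hypothetical, simplified stand-ins for the PDFBox COS object model.
    static class CosBase {}

    static class CosInteger extends CosBase {
        final long value;
        CosInteger(long value) { this.value = value; }
    }

    static class CosDictionary extends CosBase {
        private final Map<String, CosBase> items = new HashMap<>();
        void setItem(String key, CosBase value) { items.put(key, value); }
        CosBase getItem(String key) { return items.get(key); }
    }

    /**
     * Returns the trailer's Root entry only if it really is a dictionary.
     * A blind cast here is what would throw ClassCastException when a
     * leniently parsed file leaves some other object type under Root.
     * Returning null (an assumption of this sketch) lets the caller
     * decide how to degrade.
     */
    static CosDictionary getRootDictionary(CosDictionary trailer) {
        CosBase root = trailer.getItem("Root");
        if (root instanceof CosDictionary) {
            return (CosDictionary) root;
        }
        return null;
    }

    public static void main(String[] args) {
        // Malformed case: Root holds an integer instead of a dictionary.
        CosDictionary malformedTrailer = new CosDictionary();
        malformedTrailer.setItem("Root", new CosInteger(0));

        // Well-formed case: Root holds a dictionary.
        CosDictionary wellFormedTrailer = new CosDictionary();
        wellFormedTrailer.setItem("Root", new CosDictionary());

        System.out.println("malformed root usable: "
                + (getRootDictionary(malformedTrailer) != null));
        System.out.println("well-formed root usable: "
                + (getRootDictionary(wellFormedTrailer) != null));
    }
}
```

With a check of this kind, a caller that loaded a damaged file with the force flag set can handle a missing or unusable catalog explicitly instead of aborting on the cast.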