[ https://issues.apache.org/jira/browse/PDFBOX-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved PDFBOX-813.
----------------------------------
Resolution: Fixed
Fix Version/s: 1.3.0
Assignee: Jukka Zitting
Fixed in revision 1022444 by explicitly checking the type of the Root object.
I think this is a better solution than dropping (or deprecating) the
forceParsing option, as quite a few malformed PDFs out there can still be
processed reasonably well with relaxed parsing rules. Sometimes this results
in PDDocuments with unexpected internal structures, but it's IMHO better to
try to degrade gracefully in such cases.
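
For illustration, the kind of type check described above might look roughly
like the sketch below. It is not the code from revision 1022444. CosBase,
CosInteger, CosDictionary and the rootUnchecked/rootChecked helpers are
hypothetical stand-ins for the PDFBox COS classes so that the example runs
without the library, and falling back to an empty dictionary is an assumption
about what "degrading gracefully" could mean here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the PDFBox COS object model (not the real classes).
class RootTypeCheckSketch {
    interface CosBase {}

    static final class CosInteger implements CosBase {
        final long value;
        CosInteger(long value) { this.value = value; }
    }

    static final class CosDictionary implements CosBase {
        final Map<String, CosBase> items = new HashMap<>();
    }

    // Blind cast: throws ClassCastException when a malformed file leaves
    // something other than a dictionary under the Root key.
    static CosDictionary rootUnchecked(CosDictionary trailer) {
        return (CosDictionary) trailer.items.get("Root");
    }

    // Explicit type check: only cast when Root really is a dictionary.
    static CosDictionary rootChecked(CosDictionary trailer) {
        CosBase root = trailer.items.get("Root");
        if (root instanceof CosDictionary) {
            return (CosDictionary) root;
        }
        // Assumption: a missing or malformed Root degrades to an empty catalog.
        return new CosDictionary();
    }

    public static void main(String[] args) {
        CosDictionary trailer = new CosDictionary();
        // Simulate the malformed document: Root resolves to an integer.
        trailer.items.put("Root", new CosInteger(7));

        CosDictionary catalog = rootChecked(trailer);
        System.out.println("entries in fallback catalog: " + catalog.items.size());

        try {
            rootUnchecked(trailer);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getClass().getSimpleName());
        }
    }
}
```

With a check like this, callers get a usable (if empty) catalog instead of an
exception, which fits keeping the forceParsing option available.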
> ClassCastException: COSInteger cannot be cast to COSDictionary
> --------------------------------------------------------------
>
> Key: PDFBOX-813
> URL: https://issues.apache.org/jira/browse/PDFBOX-813
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.2.1, 1.3.0
> Environment: Windows XP
> java version "1.6.0_12"
> Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
> Java HotSpot(TM) Client VM (build 11.2-b01, mixed mode, sharing)
> Reporter: CP
> Assignee: Jukka Zitting
> Priority: Critical
> Fix For: 1.3.0
>
> Attachments: CancerSummReport_34914.pdf, PDFBoxBug.java
>
>
> I get the exceptions below when calling
> pdfDoc.getDocumentCatalog().getAllPages(). The code continues after the first
> exception because I've called
> PDDocument.load("C:/CancerSummReport_34914.pdf", true), which sets the load
> "force" param to true. The second exception causes the code to abort.
> (I will try uploading the PDF that causes this problem)
> 2010-09-02 16:47:47,521 [main] WARN (PDFParser.java:189) - Parsing Error, Skipping Object
> java.io.IOException: Error: Expected an integer type, actual='bj'
>     at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1310)
>     at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:497)
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:878)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:843)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:768)
>     at com.xyz.framework.functionalTests.PDFBoxBug.main(PDFBoxBug.java:16)
> 2010-09-02 16:47:47,552 [main] WARN (BaseParser.java:215) - Invalid dictionary, found:? but expected:''
> Exception in thread "main" java.lang.ClassCastException: org.apache.pdfbox.cos.COSInteger cannot be cast to org.apache.pdfbox.cos.COSDictionary
>     at org.apache.pdfbox.pdmodel.PDDocument.getDocumentCatalog(PDDocument.java:414)
>     at com.xyz.framework.functionalTests.PDFBoxBug.main(PDFBoxBug.java:18)