[
https://issues.apache.org/jira/browse/PDFBOX-813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905681#action_12905681
]
Adam Nichols commented on PDFBOX-813:
-------------------------------------
This PDF does not conform to the Adobe PDF specification. If you open the PDF
in a text editor and scroll to the bottom, you'll see a stray "bj" after
the %%EOF marker. That's invalid, but PDFBox tolerates it: it only logs a
warning, and the stray data doesn't cause any serious problem. The second
message is also merely a warning.
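If you'd rather not inspect the file by hand, the check described above can be
scripted. What follows is only a minimal sketch in plain Java. It is not part of
PDFBox and makes no PDFBox calls. It assumes Java 7 or later (for
java.nio.file), and the class name TrailingBytesCheck and the method
bytesAfterLastEof are made up for this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not a PDFBox class): reports what follows the last %%EOF.
public class TrailingBytesCheck {
    private static final byte[] EOF_MARKER =
            "%%EOF".getBytes(StandardCharsets.US_ASCII);

    /** Returns the bytes after the last %%EOF marker, or null if there is no marker. */
    public static byte[] bytesAfterLastEof(byte[] data) {
        // Scan backwards so that the last marker in the file wins.
        for (int i = data.length - EOF_MARKER.length; i >= 0; i--) {
            if (matchesAt(data, i)) {
                return Arrays.copyOfRange(data, i + EOF_MARKER.length, data.length);
            }
        }
        return null;
    }

    private static boolean matchesAt(byte[] data, int offset) {
        for (int j = 0; j < EOF_MARKER.length; j++) {
            if (data[offset + j] != EOF_MARKER[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        byte[] tail = bytesAfterLastEof(data);
        if (tail == null) {
            System.out.println("No %%EOF marker found");
            return;
        }
        // trim() drops line endings and other whitespace after the marker.
        String text = new String(tail, StandardCharsets.ISO_8859_1).trim();
        System.out.println(text.isEmpty()
                ? "Nothing but whitespace after %%EOF"
                : "Unexpected data after %%EOF: " + text);
    }
}
```

Run it with the path to the PDF as the only argument. For a file like the one
attached here, it should report the stray "bj".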
Having said that, I tested with the latest code from SVN and this PDF loaded
properly. I tested with both PDDocument.load(inputpath); and
PDDocument.load(inputpath, true);
What version of PDFBox are you using? I'd suggest trying the latest from SVN
if you can. The logs you posted do not contain any stack traces which caused
the code to abort. There should be some other stack trace which has more
information.
> ClassCastException: COSInteger cannot be cast to COSDictionary
> --------------------------------------------------------------
>
> Key: PDFBOX-813
> URL: https://issues.apache.org/jira/browse/PDFBOX-813
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.2.1
> Environment: Windows XP
> java version "1.6.0_12"
> Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
> Java HotSpot(TM) Client VM (build 11.2-b01, mixed mode, sharing)
> Reporter: CP
> Priority: Critical
> Attachments: CancerSummReport_34914.pdf, PDFBoxBug.java
>
>
> I get the below exceptions when calling
> pdfDoc.getDocumentCatalog().getAllPages(). The code continues after the first
> exception because I've called
> PDDocument.load("C:/CancerSummReport_34914.pdf", true) setting the load
> "force" param to true. The second exception causes the code to abort.
> (I will try uploading the PDF that causes this problem)
> 2010-09-02 16:47:47,521 [main] WARN (PDFParser.java:189) - Parsing Error,
> Skipping Object
> java.io.IOException: Error: Expected an integer type, actual='bj'
> at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1310)
> at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:497)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:878)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:843)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:768)
> at com.xyz.framework.functionalTests.PDFBoxBug.main(PDFBoxBug.java:16)
> 2010-09-02 16:47:47,552 [main] WARN (BaseParser.java:215) - Invalid
> dictionary, found:? but expected:''
> Exception in thread "main" java.lang.ClassCastException:
> org.apache.pdfbox.cos.COSInteger cannot be cast to
> org.apache.pdfbox.cos.COSDictionary
> at
> org.apache.pdfbox.pdmodel.PDDocument.getDocumentCatalog(PDDocument.java:414)
> at com.xyz.framework.functionalTests.PDFBoxBug.main(PDFBoxBug.java:18)