[ https://issues.apache.org/jira/browse/PDFBOX-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708594#action_12708594 ]
Sean Bridges commented on PDFBOX-466:
-------------------------------------
Looking at one of the PDFs, it ends with:
<<
/Producer (Powered By Crystal)
/Creator (Crystal Reports)
>>
endobj
xref
0 36
0000000000 65535 f
0000000017 00000 n
0000037961 00000 n
0000038060 00000 n
0000038094 00000 n
0000000194 00000 n
0000038128 00000 n
0000038250 00000 n
0000038308 00000 n
0000038400 00000 n
0000055457 00000 n
0000055511 00000 n
0000056340 00000 n
0000056516 00000 n
0000056692 00000 n
0000056868 00000 n
0000057217 00000 n
0000000823 00000 n
0000057256 00000 n
0000057524 00000 n
0000001348 00000 n
0000057567 00000 n
0000057891 00000 n
0000009425 00000 n
0000057924 00000 n
0000058191 00000 n
0000009867 00000 n
0000058234 00000 n
0000058603 00000 n
0000021478 00000 n
0000058641 00000 n
0000058908 00000 n
0000022076 00000 n
0000058951 00000 n
0000058991 00000 n
0000059028 00000 n
trailer
<<
/Size 36
/Root 1 0 R
/Info 35 0 R
>>
startxref
59116
%%EOF
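The exception is thrown just after the parser reads the "0 36" subsection header that follows xref. The line,
objectKey = readString( 3 );
reads "000" (the first three characters of the first xref entry) instead of "obj", so the exception is thrown. It looks as though the parser takes "0 36" for an object number and generation number and then expects the obj keyword.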
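For context, a classic cross-reference section is the xref keyword followed by one or more subsections. Each subsection starts with a header line (first object number, then entry count) and is followed by that many entries of the form "offset generation n|f". The sketch below reads that layout from a list of lines, which shows why "0 36" is followed by "0000000000 65535 f" rather than by obj. The names XrefSketch, Entry and parseXref are hypothetical and exist only for illustration. This is not PDFBox code and not necessarily how PDFParser should be fixed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; none of these names come from PDFBox. */
final class XrefSketch {

    /** One cross-reference entry. */
    static final class Entry {
        final int objectNumber;
        final long offset;      // byte offset of the object (in-use entries)
        final int generation;
        final boolean inUse;    // 'n' = in use, 'f' = free

        Entry(int objectNumber, long offset, int generation, boolean inUse) {
            this.objectNumber = objectNumber;
            this.offset = offset;
            this.generation = generation;
            this.inUse = inUse;
        }
    }

    /** Parses the lines that follow the "xref" keyword, stopping at "trailer". */
    static List<Entry> parseXref(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        int i = 0;
        while (i < lines.size() && !lines.get(i).trim().equals("trailer")) {
            // Subsection header, e.g. "0 36": first object number and entry count.
            String[] header = lines.get(i++).trim().split("\\s+");
            if (header.length != 2) {
                throw new IllegalArgumentException("bad xref subsection header");
            }
            int first = Integer.parseInt(header[0]);
            int count = Integer.parseInt(header[1]);
            for (int n = 0; n < count; n++) {
                if (i >= lines.size()) {
                    throw new IllegalArgumentException("xref section is truncated");
                }
                // Entry, e.g. "0000000017 00000 n".
                String[] f = lines.get(i++).trim().split("\\s+");
                if (f.length != 3) {
                    throw new IllegalArgumentException("bad xref entry");
                }
                entries.add(new Entry(first + n, Long.parseLong(f[0]),
                        Integer.parseInt(f[1]), f[2].equals("n")));
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // First three entries of the table quoted above, with the count adjusted.
        List<String> lines = Arrays.asList(
                "0 3",
                "0000000000 65535 f",
                "0000000017 00000 n",
                "0000037961 00000 n",
                "trailer");
        for (Entry e : parseXref(lines)) {
            System.out.println(e.objectNumber + " -> offset " + e.offset
                    + (e.inUse ? " (in use)" : " (free)"));
        }
    }
}
```

A parser that treats the two integers in "0 36" as the start of an indirect object would next look for obj and find the first digits of an entry instead, which matches the reported message expected='obj' actual='000'.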
> error parsing files generated by crystal reports
> ------------------------------------------------
>
> Key: PDFBOX-466
> URL: https://issues.apache.org/jira/browse/PDFBOX-466
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox
> Reporter: Sean Bridges
> Fix For: 0.8.0-incubator
>
>
> This is with the latest from svn, Revision: 773978
> From a sample of 13304 pdf documents generated in a very wide variety of
> ways, I got 200 exceptions with the stack trace,
> Caused by: java.io.IOException: expected='obj' actual='000'
> org.apache.pdfbox.io.pushbackinputstr...@1049d3
> at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:471)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:169)
> at message_analyzer.extractor.PDFExtractor.getContent(PDFExtractor.java:32)
> ... 2 more
> I can't give an example file, but the PDFs are all generated by Crystal
> Reports.