[ 
https://issues.apache.org/jira/browse/PDFBOX-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708594#action_12708594
 ] 

Sean Bridges commented on PDFBOX-466:
-------------------------------------

Looking at one of the PDFs, it ends with:

<< 
/Producer (Powered By Crystal)  
/Creator (Crystal Reports)  
>> 
endobj 
xref 
0 36 
0000000000 65535 f 
0000000017 00000 n 
0000037961 00000 n 
0000038060 00000 n 
0000038094 00000 n 
0000000194 00000 n 
0000038128 00000 n 
0000038250 00000 n 
0000038308 00000 n 
0000038400 00000 n 
0000055457 00000 n 
0000055511 00000 n 
0000056340 00000 n 
0000056516 00000 n 
0000056692 00000 n 
0000056868 00000 n 
0000057217 00000 n 
0000000823 00000 n 
0000057256 00000 n 
0000057524 00000 n 
0000001348 00000 n 
0000057567 00000 n 
0000057891 00000 n 
0000009425 00000 n 
0000057924 00000 n 
0000058191 00000 n 
0000009867 00000 n 
0000058234 00000 n 
0000058603 00000 n 
0000021478 00000 n 
0000058641 00000 n 
0000058908 00000 n 
0000022076 00000 n 
0000058951 00000 n 
0000058991 00000 n 
0000059028 00000 n 
trailer 
<< 
/Size 36 
/Root 1 0 R 
/Info 35 0 R 
>> 
startxref 
59116 
%%EOF 


The exception is thrown right after the parser reads the "0 36" that follows xref.  The line,

objectKey = readString( 3 );

then reads "000" (the first three characters of the first xref entry, 0000000000), which is not "obj", so the exception is thrown.
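
To make the failure easier to see, here is a minimal standalone sketch of what seems to be happening. It is not the PDFParser code: the class ObjHeaderSketch and its helpers readDigits, readString and keywordAfterHeader are made up for illustration, and the whitespace skipping is an assumption about how the real parser behaves. It treats the input as an object header (two integers followed by a 3-character keyword) and shows that the xref subsection header "0 36" gets consumed as the two integers, leaving "000" where "obj" is expected.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative only: hypothetical names, not the PDFBox implementation.
class ObjHeaderSketch {

    // Skips leading whitespace, then reads a run of ASCII digits.
    static String readDigits(PushbackInputStream in) throws IOException {
        int c;
        while ((c = in.read()) != -1 && Character.isWhitespace(c)) {
            // skip whitespace
        }
        StringBuilder sb = new StringBuilder();
        while (c != -1 && c >= '0' && c <= '9') {
            sb.append((char) c);
            c = in.read();
        }
        if (c != -1) {
            in.unread(c); // put back the first non-digit
        }
        return sb.toString();
    }

    // Skips leading whitespace (an assumption), then reads up to n characters.
    static String readString(PushbackInputStream in, int n) throws IOException {
        int c;
        while ((c = in.read()) != -1 && Character.isWhitespace(c)) {
            // skip whitespace
        }
        StringBuilder sb = new StringBuilder();
        while (c != -1 && sb.length() < n) {
            sb.append((char) c);
            if (sb.length() < n) {
                c = in.read();
            }
        }
        return sb.toString();
    }

    // Reads "<number> <generation>" and returns the 3 characters that follow,
    // which an object parser would expect to be "obj".
    static String keywordAfterHeader(String data) {
        byte[] bytes = data.getBytes(StandardCharsets.ISO_8859_1);
        try (PushbackInputStream in =
                 new PushbackInputStream(new ByteArrayInputStream(bytes))) {
            readDigits(in);            // object number (or xref start: "0")
            readDigits(in);            // generation (or xref count: "36")
            return readString(in, 3);  // the objectKey
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A real object header, then the xref subsection from the file above.
        System.out.println(keywordAfterHeader("35 0 obj\n<< >>"));
        System.out.println(keywordAfterHeader("0 36 \n0000000000 65535 f \n"));
    }
}
```

If that is what is going on, the parser has not recognised the xref keyword and is still in "parse an object" mode when it reaches the table, so the likely place to look is wherever xref is detected rather than the readString call itself.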

> error parsing files generated by crystal reports
> ------------------------------------------------
>
>                 Key: PDFBOX-466
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-466
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>            Reporter: Sean Bridges
>             Fix For: 0.8.0-incubator
>
>
> This is with the latest from svn, Revision: 773978
> From a sample of 13304 pdf documents generated in a very wide variety of 
> ways, I got 200 exceptions with the stack trace,
> Caused by: java.io.IOException: expected='obj' actual='000' 
> org.apache.pdfbox.io.pushbackinputstr...@1049d3
>       at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:471)
>       at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:169)
>       at 
> message_analyzer.extractor.PDFExtractor.getContent(PDFExtractor.java:32)
>       ... 2 more
> I can't give an example file, but the pdfs are all generated by crystal 
> reports.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.