[ https://issues.apache.org/jira/browse/PDFBOX-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639199#comment-14639199 ]

Andrea Vacondio commented on PDFBOX-2845:
-----------------------------------------

I took a quick look. The issue is that object 515 is a stream whose length is
an indirect object (554) that is defined in an object stream. Currently PDFBox
requires that the stream length not be defined in an object stream, as per PDF
spec chap. 7.5.7.
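
To make the failing check concrete, here is a minimal, self-contained sketch
of the rule described above. It is not the actual COSParser code: the
XrefEntry class, the locateLengthObject helper, the byte offset 1000 and the
object stream number 600 are all hypothetical, chosen only for illustration.
Only the object numbers 515 and 554 and the exception text come from the
report.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class StreamLengthCheck {

    /** Hypothetical xref entry describing where an object is stored. */
    static final class XrefEntry {
        final long byteOffset;     // >= 0 when the object is stored directly in the file
        final int objectStreamNr;  // >= 0 when the object is stored inside an object stream

        private XrefEntry(long byteOffset, int objectStreamNr) {
            this.byteOffset = byteOffset;
            this.objectStreamNr = objectStreamNr;
        }

        static XrefEntry direct(long byteOffset) {
            return new XrefEntry(byteOffset, -1);
        }

        static XrefEntry compressed(int objectStreamNr) {
            return new XrefEntry(-1L, objectStreamNr);
        }

        boolean isCompressed() {
            return objectStreamNr >= 0;
        }
    }

    /**
     * Returns the file offset of the object holding a stream's Length,
     * or throws if that object is missing or lives in an object stream.
     */
    static long locateLengthObject(Map<Integer, XrefEntry> xref, int objNr)
            throws IOException {
        XrefEntry entry = xref.get(objNr);
        if (entry == null || entry.isCompressed()) {
            throw new IOException(
                "Object must be defined and must not be compressed object: "
                + objNr + ":0");
        }
        return entry.byteOffset;
    }

    public static void main(String[] args) {
        Map<Integer, XrefEntry> xref = new HashMap<>();
        xref.put(515, XrefEntry.direct(1000L));    // the stream itself (offset made up)
        xref.put(554, XrefEntry.compressed(600));  // its Length (container number made up)
        try {
            locateLengthObject(xref, 554);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A more lenient parser could instead resolve object 554 by first decoding the
object stream that contains it. That would be a design choice, not something
this sketch implements.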

> Error parsing PDF
> -----------------
>
>                 Key: PDFBOX-2845
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2845
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.0
>            Reporter: Christopher Clark
>             Fix For: 2.0.0
>
>
> I get the following error when parsing this pdf:  
> http://jmlr.csail.mit.edu/proceedings/papers/v28/ranganath13.pdf
> java.io.IOException: Object must be defined and must not be compressed object: 554:0
> Stack trace:
> Exception in thread "main" java.io.IOException: Object must be defined and must not be compressed object: 554:0
>         at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:682)
>         at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:646)
>         at org.apache.pdfbox.pdfparser.COSParser.getLength(COSParser.java:847)
>         at org.apache.pdfbox.pdfparser.COSParser.parseCOSStream(COSParser.java:906)
>         at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:732)
>         at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:693)
>         at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:646)
>         at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:607)
>         at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:198)
>         at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:225)
>         at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:848)
>         at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:793)
>         at org.apache.pdfbox.tools.ExtractText.startExtraction(ExtractText.java:192)
>         at org.apache.pdfbox.tools.ExtractText.main(ExtractText.java:81)
>         at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:55)
> Note this problem does not occur in 1.8.9



