[ 
https://issues.apache.org/jira/browse/PDFBOX-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583804#comment-14583804
 ] 

Tilman Hausherr edited comment on PDFBOX-2829 at 6/12/15 6:05 PM:
------------------------------------------------------------------

The two error messages should be downgraded to warnings. They mean that a 
length parameter in a stream is incorrect, so PDFBox reads the stream until it 
finds "endstream" instead of relying on the length parameter. This happens 
because your PDF is malformed: somehow all newlines were converted to 0d 0a. 
Either the creator messed up, or the file was transferred in ASCII mode from a 
Unix to a non-Unix system. I will downgrade the messages and also work on 
their text, which is confusing.
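
To illustrate the fallback described above, here is a minimal sketch in plain 
Java. It is not PDFBox's actual code: the class StreamLengthSketch and its 
methods are hypothetical names made up for this example, and the input is a 
tiny hand-built byte array rather than a real PDF. It first trusts the declared 
length, and if "endstream" does not follow at that offset it scans forward for 
the keyword instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; this is not how PDFBox is implemented.
final class StreamLengthSketch {

    private static final byte[] ENDSTREAM =
            "endstream".getBytes(StandardCharsets.ISO_8859_1);

    /** Returns true if the "endstream" keyword starts exactly at pos. */
    static boolean keywordAt(byte[] data, int pos) {
        if (pos < 0 || pos + ENDSTREAM.length > data.length) {
            return false;
        }
        for (int i = 0; i < ENDSTREAM.length; i++) {
            if (data[pos + i] != ENDSTREAM[i]) {
                return false;
            }
        }
        return true;
    }

    /** Skips one optional end-of-line marker (CR, LF or CR LF) at pos. */
    static int skipEol(byte[] data, int pos) {
        if (pos < data.length && data[pos] == '\r') {
            pos++;
        }
        if (pos < data.length && data[pos] == '\n') {
            pos++;
        }
        return pos;
    }

    /**
     * Reads the stream body that begins at start. The declared length is used
     * only if "endstream" really follows it; otherwise the data is scanned
     * byte by byte until the keyword is found.
     */
    static byte[] readStream(byte[] data, int start, int declaredLength) {
        if (declaredLength >= 0 && declaredLength <= data.length - start) {
            int end = start + declaredLength;
            if (keywordAt(data, skipEol(data, end))) {
                return Arrays.copyOfRange(data, start, end);
            }
        }
        // Workaround path: the declared length does not lead to "endstream".
        for (int pos = start; pos + ENDSTREAM.length <= data.length; pos++) {
            if (keywordAt(data, pos)) {
                int stop = pos;
                // Drop the end-of-line marker that precedes the keyword.
                if (stop > start && data[stop - 1] == '\n') {
                    stop--;
                }
                if (stop > start && data[stop - 1] == '\r') {
                    stop--;
                }
                return Arrays.copyOfRange(data, start, stop);
            }
        }
        throw new IllegalArgumentException("no endstream keyword found");
    }

    public static void main(String[] args) {
        // Body is 13 bytes with LF line endings, so a declared length of 13 is right.
        byte[] lf = "stream\nBT\n(Hi) Tj\nET\nendstream"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readStream(lf, 7, 13).length);

        // Same data after LF -> CR LF conversion: the body grew to 15 bytes,
        // the declared length of 13 is now wrong, and the scan takes over.
        byte[] crlf = "stream\r\nBT\r\n(Hi) Tj\r\nET\r\nendstream"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readStream(crlf, 8, 13).length);
    }
}
```

A plain keyword scan like this can stop too early if binary stream data 
happens to contain the bytes "endstream", so treat it as a picture of the idea 
rather than a robust parser.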

About the exception - are you using the latest version from svn? I can display 
the file with the PDFReader command line utility.


was (Author: tilman):
The two error messages should be downgraded to warnings. They mean that a 
length parameter in a stream is incorrect, so PDFBox reads the file 
sequentially. This happens because your PDF is malformed: somehow all newlines 
were converted to 0d 0a. Either the creator messed up, or the file was 
transferred in ASCII mode from a Unix to a non-Unix system. I will downgrade 
the messages and also work on their text, which is confusing.

About the exception - are you using the latest version from svn? I can display 
the file with PDFReader command line utility.

> PDBox 2.0 Throws IndexOutOfBoundsException (severe offset errors as well)
> -------------------------------------------------------------------------
>
>                 Key: PDFBOX-2829
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2829
>             Project: PDFBox
>          Issue Type: Improvement
>    Affects Versions: 2.0.0
>            Reporter: Lori
>              Labels: PDFDocument
>         Attachments: Tracy_Party.pdf
>
>
> The PDF file opens fine in Adobe. It has an unusual font, and loading it 
> complains about incorrect offsets.
> Jun 12, 2015 12:27:10 PM org.apache.pdfbox.pdfparser.COSParser validateStreamLength
> SEVERE: The end of the stream doesn't point to the correct offset, using workaround to read the stream, found 576 but expected 6095
> Jun 12, 2015 12:27:10 PM org.apache.pdfbox.pdfparser.COSParser validateStreamLength
> SEVERE: The end of the stream doesn't point to the correct offset, using workaround to read the stream, found 6513 but expected 8951
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 10, Size: 10
>       at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>       at java.util.ArrayList.get(ArrayList.java:411)
>       at org.apache.pdfbox.io.RandomAccessBuffer.nextBuffer(RandomAccessBuffer.java:395)
>       at org.apache.pdfbox.io.RandomAccessBuffer.read(RandomAccessBuffer.java:260)
>       at org.apache.pdfbox.pdfparser.BaseParser.readUntilEndStream(BaseParser.java:412)
>       at org.apache.pdfbox.pdfparser.COSParser.parseCOSStream(COSParser.java:922)
>       at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:725)
>       at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:686)
>       at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:639)
>       at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:600)
>       at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:198)
>       at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:225)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:976)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:865)
>       at com.smartfin.backend.sfreceipts.offline.PdfToImage2_0RedoTesterFor.getImageFromPdf(PdfToImage2_0RedoTesterFor.java:47)
>       at com.smartfin.backend.sfreceipts.offline.PdfToImage2_0RedoTesterFor.getImageFromPdf(PdfToImage2_0RedoTesterFor.java:39)
>       at com.smartfin.backend.sfreceipts.offline.PdfToImage2_0RedoTesterFor.main(PdfToImage2_0RedoTesterFor.java:199)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
> Process finished with exit code 1


