[ https://issues.apache.org/jira/browse/PDFBOX-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212053#comment-14212053 ]
Tilman Hausherr commented on PDFBOX-2497:
-----------------------------------------

Indeed... sorry, I gave the wrong link

> GRAVE: FlateFilter: stop reading corrupt stream due to a DataFormatException
> ----------------------------------------------------------------------------
>
>                 Key: PDFBOX-2497
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2497
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.6, 1.8.7
>         Environment: java version "1.7.0_65"
> Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
> Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64 GNU/Linux
>            Reporter: Laurent Roger
>            Priority: Minor
>         Attachments: git-community-book-52.pdf, git-community-book-53.pdf,
> git-community-book.pdf, git-community-book.txt
>
>
> java -jar pdfbox-app-1.8.7.jar ExtractText git-community-book.pdf
> git-community-book.txt
> throws the error :
> nov. 13, 2014 5:34:31 PM org.apache.pdfbox.filter.FlateFilter decode
> GRAVE: FlateFilter: stop reading corrupt stream due to a DataFormatException
> The txt document is incomplete, stops at
> Chapter 7: Fonctionnement Interne et Plomberie
> 133
> Git Community Book
> 134
> PdfDebugger does not show any problem of structure of git-community-book.pdf.
> PDFSplit -split 1 fails at git-community-book-53.pdf (partially written)
> java -jar pdfbox-app-1.8.7.jar PDFDebugger git-community-book-53.pdf
> Xlib: extension "RANDR" missing on display ":0.0".
> nov. 13, 2014 5:30:32 PM org.apache.pdfbox.pdfparser.BaseParser
> parseCOSDictionary
> AVERTISSEMENT: Bad Dictionary Declaration
> org.apache.pdfbox.io.PushBackInputStream@78900845
> nov. 13, 2014 5:30:32 PM org.apache.pdfbox.pdfparser.BaseParser
> parseCOSDictionary
> AVERTISSEMENT: Invalid dictionary, found: '���' but expected: '/'
> nov. 13, 2014 5:30:32 PM org.apache.pdfbox.pdfparser.XrefTrailerResolver
> setStartxref
> AVERTISSEMENT: Did not found XRef object at specified startxref position 0

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)