[ 
https://issues.apache.org/jira/browse/PDFBOX-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876715#action_12876715
 ] 

Nicholas Blair commented on PDFBOX-606:
---------------------------------------

I suspect the infinite loop may not be a problem for PDFBox to resolve. 
Looking more closely at the stack trace and the method source, I wonder 
whether the loop was caused by a temporary loss of connectivity to the 
underlying filesystem while reading the contents of a valid file.
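
One way to separate the two possibilities would be to read the file 
completely into memory before handing it to the parser. The following is 
only a minimal sketch, not anything taken from PDFBox: the class name 
StreamBuffering and the method bufferFully are hypothetical, and it assumes 
the file fits in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of PDFBox.
final class StreamBuffering {

    private StreamBuffering() {
    }

    // Drains the source stream into memory and returns an in-memory stream
    // over the same bytes. All filesystem I/O happens inside this method, so
    // a stalled read would show up here in a thread dump rather than inside
    // the parser.
    static ByteArrayInputStream bufferFully(InputStream source) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read(byte[]) returns -1 only at end of stream
        while ((n = source.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new ByteArrayInputStream(buffer.toByteArray());
    }
}
```

The calling code would then become something like 
document = PDDocument.load(StreamBuffering.bufferFully(inputStream)); 
If the hang still shows up inside BaseParser.skipSpaces with an in-memory 
stream, the filesystem theory is probably wrong and the parser deserves a 
closer look.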

> infinite loop encountered in PushBackInputStream.read
> -----------------------------------------------------
>
>                 Key: PDFBOX-606
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-606
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Nicholas Blair
>         Attachments: ._pellochmar10.pdf
>
>
> While processing customer content for a Lucene index using PDFBox, I 
> encountered an infinite loop in PDDocument.load. Stack trace:
> java.io.FileInputStream.readBytes(Native Method)
> java.io.FileInputStream.read(FileInputStream.java:199)
> java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
> java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>    - locked java.io.bufferedinputstr...@f5ef5d
> java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>    - locked java.io.bufferedinputstr...@15b9c29
> java.io.FilterInputStream.read(FilterInputStream.java:66)
> java.io.PushbackInputStream.read(PushbackInputStream.java:122)
> org.apache.pdfbox.io.PushBackInputStream.read(PushBackInputStream.java:84)
> org.apache.pdfbox.pdfparser.BaseParser.skipSpaces(BaseParser.java:1190)
> org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:188)
> org.apache.pdfbox.pdfparser.PDFParser.parseTrailer(PDFParser.java:767)
> org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:456)
> org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
> org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:841)
> org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:808)
> edu.wisc.mywebspace.search.pdf.PdfDocumentContentParser.parse(PdfDocumentContentParser.java:47)
> Calling code looks like:
> document = PDDocument.load(inputStream);

