[ https://issues.apache.org/jira/browse/PDFBOX-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988671#comment-13988671 ]
Rogério Pereira Araújo commented on PDFBOX-1122:
------------------------------------------------

I can confirm the same error on version 1.8.4 while parsing PDFs with Tika during a Nutch parsing job.

> Parsing Error, Skipping Object
> ------------------------------
>
>                 Key: PDFBOX-1122
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1122
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.6.0
>         Environment: Working with Windows 7 in eclipse.
>            Reporter: Raihan Jamal
>            Assignee: Andreas Lehmkühler
>              Labels: pdfbox
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Parsing Error, Skipping Object
> java.io.IOException: expected='endstream' actual='' org.apache.pdfbox.io.PushBackInputStream@38011d45
>     at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:439)
>     at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:552)
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:184)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1088)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1053)
>     at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:74)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
>     at org.apache.tika.Tika.parseToString(Tika.java:357)
>     at edu.uci.ics.crawler4j.crawler.BinaryParser.parse(BinaryParser.java:37)
>     at edu.uci.ics.crawler4j.crawler.WebCrawler.handleBinary(WebCrawler.java:223)
>     at edu.uci.ics.crawler4j.crawler.WebCrawler.processPage(WebCrawler.java:462)
>     at edu.uci.ics.crawler4j.crawler.WebCrawler.run(WebCrawler.java:129)
>     at java.lang.Thread.run(Thread.java:662)
> Did not found XRef object at specified startxref position 0
> This is the sample URL where I am facing this problem:-
> http://www.qualcomm.com/documents/files/rev-b-enhanced-mobile-broadband-for-all.pdf
> Any suggestions why is it happening...!! Or its a bug??

--
This message was sent by Atlassian JIRA
(v6.2#6252)