Solr (trunk rev. 912116) suffers from PDFBOX-537 [Endless loop in
org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary()] fixed in PDFBox
1.0?
----------------------------------------------------------------------------------------------------------------------------------------------------
Key: SOLR-1786
URL: https://issues.apache.org/jira/browse/SOLR-1786
Project: Solr
Issue Type: Bug
Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.5
Environment: Ubuntu 9.10, 32bit
Reporter: Jan Iwaszkiewicz
Priority: Critical
I tried to index several thousand PDF documents, but the run could not finish
because Solr fell into an endless loop on some of them, for instance:
http://cdsweb.cern.ch/record/702585/files/sl-note-2000-019.pdf (the PDF itself
seems OK).
Can Solr start using PDFBox 1.0?
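As a stopgap until the dependency is upgraded, a batch indexer could bound each
extraction with a timeout, so that one looping document does not stall the
whole run. What follows is only a minimal Java sketch under assumptions:
GuardedExtraction and runWithTimeout are hypothetical names, not part of Solr,
Tika or PDFBox, and the extraction call is represented by a plain Callable.
Cancelling only requests an interrupt; a parser spinning in a loop that never
checks for interruption would keep its (daemon) thread busy.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: not part of Solr, Tika or PDFBox.
public class GuardedExtraction {

    // Daemon threads, so a task stuck in a non-interruptible loop
    // cannot keep the JVM alive after the batch has finished.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "guarded-extraction");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs the task on a worker thread. Returns its result, or
     * Optional.empty() if the task throws or exceeds the timeout.
     */
    public static <T> Optional<T> runWithTimeout(Callable<T> task, long timeoutMillis) {
        Future<T> future = POOL.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Only an interrupt request; the task must cooperate to actually stop.
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            // The task itself failed (for example, a parse error).
            return Optional.empty();
        } catch (InterruptedException e) {
            // The calling thread was interrupted while waiting.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a well-behaved extraction call.
        Callable<String> quick = () -> "extracted text";

        // Stand-in for a parser that never returns. This one sleeps,
        // so it does react to the interrupt sent by cancel(true).
        Callable<String> hung = () -> {
            while (true) {
                Thread.sleep(50);
            }
        };

        System.out.println("quick finished: " + runWithTimeout(quick, 1000).isPresent());
        System.out.println("hung finished: " + runWithTimeout(hung, 200).isPresent());
    }
}
```

A document that times out could then be logged and skipped, and the rest of
the batch indexed as usual.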