Create a document parser that doesn't HTMLify the results.
----------------------------------------------------------
Key: DROIDS-81
URL: https://issues.apache.org/jira/browse/DROIDS-81
Project: Droids
Issue Type: Bug
Components: tika
Affects Versions: 0.01
Reporter: Richard Frovarp
Priority: Minor
While the TikaHTMLParser can parse PDFs, docs, etc., it returns them in an
HTMLified format. Solr blows up on that format, and the HTML conversion step
isn't always necessary anyway.
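
To illustrate the request, here is a minimal sketch of the idea, not the
Droids implementation. Tika parsers report content as XHTML SAX events, so a
handler that keeps only character data and drops the markup yields plain text.
Tika ships a handler along those lines (BodyContentHandler); to stay
self-contained, the sketch below uses only the JDK's SAX classes and feeds the
handler an XHTML string instead of a real Tika parse. The names
PlainTextSketch, TextOnlyHandler, and extractText are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PlainTextSketch {

    /** Hypothetical handler: keeps character data and drops all markup. */
    static class TextOnlyHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Crude separator: add a space after every closing element so
            // adjacent paragraphs do not run together. This also splits
            // text at inline elements such as <b>.
            text.append(' ');
        }

        String getText() {
            // Collapse runs of whitespace into single spaces.
            return text.toString().trim().replaceAll("\\s+", " ");
        }
    }

    /** Parses a well-formed XHTML string and returns only its text. */
    static String extractText(String xhtml) throws Exception {
        TextOnlyHandler handler = new TextOnlyHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        return handler.getText();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the XHTML events a document parser would emit.
        String xhtml = "<html><body><p>Hello</p><p>world</p></body></html>";
        System.out.println(extractText(xhtml));
    }
}
```

A parser built this way would hand Solr the extracted text directly, so no
HTML ever reaches the index.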