Create a document parser that doesn't HTMLify the results.
----------------------------------------------------------

                 Key: DROIDS-81
                 URL: https://issues.apache.org/jira/browse/DROIDS-81
             Project: Droids
          Issue Type: Bug
          Components: tika
    Affects Versions: 0.01
            Reporter: Richard Frovarp
            Priority: Minor


While the TikaHTMLParser can parse PDFs, docs, etc., it returns them in an
HTMLified format. Solr blows up on that format, and the HTML conversion step
isn't always necessary anyway.
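
As a rough illustration of what a non-HTMLifying parser could do: Tika
parsers report their output as XHTML SAX events, so a handler that keeps only
the character data and ignores the element events yields plain text. The
sketch below is an assumption about one possible approach, not the Droids
implementation. PlainTextExtractor and TextOnlyHandler are hypothetical
names, and the JDK's own SAX parser stands in for Tika so the example runs
without Tika on the classpath. Tika's BodyContentHandler serves a similar
purpose for real documents.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class, for illustration only.
public class PlainTextExtractor {

    /** Keeps character data only; no markup is ever written to the output. */
    static class TextOnlyHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Simplification: add a space after every closing tag so that text
            // from adjacent block elements does not run together.
            text.append(' ');
        }

        String getText() {
            // Collapse runs of whitespace into single spaces.
            return text.toString().replaceAll("\\s+", " ").trim();
        }
    }

    /** Parses well-formed XHTML and returns its text content without tags. */
    public static String extractText(String xhtml) {
        TextOnlyHandler handler = new TextOnlyHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse input as XHTML", e);
        }
        return handler.getText();
    }

    public static void main(String[] args) {
        String xhtml =
            "<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>";
        System.out.println(extractText(xhtml));
    }
}
```

Text produced this way contains no tags, so it could be passed to Solr as an
ordinary field value.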

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
