Use Tika to a fuller extent
---------------------------

                 Key: DROIDS-157
                 URL: https://issues.apache.org/jira/browse/DROIDS-157
             Project: Droids
          Issue Type: New Feature
          Components: tika
    Affects Versions: 0.0.2
            Reporter: Richard Frovarp
            Assignee: Richard Frovarp


We should be using Tika to a greater extent. Newer versions of Tika can do some 
of the things we've written our own code for.
In addition, the newer content handlers can provide useful data. The 
BoilerpipeContentHandler tries to grab only the content that really matters.
The Metadata class can return values such as the title or the robots meta field, 
so we don't have to parse them out of the document ourselves.
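
As a rough illustration of the two points above, here is a minimal sketch, not 
a proposed patch. It assumes a Tika 1.x release (1.2 or later) with tika-parsers 
on the classpath, where BoilerpipeContentHandler lives in 
org.apache.tika.parser.html; in Tika 2.x the class is in a different package. 
The class name TikaExtractionSketch, the helper parse() and the sample HTML are 
hypothetical, and reading the robots value under the key "robots" assumes that 
HtmlParser copies <meta name="..."> tags into the Metadata under their name.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaCoreProperties;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.html.BoilerpipeContentHandler;
import org.apache.tika.parser.html.HtmlParser;
import org.apache.tika.sax.BodyContentHandler;

// Hypothetical sketch class; not part of Droids or Tika.
public class TikaExtractionSketch {

    // Parses an HTML string. Filtered body text is written to textSink,
    // and the Metadata object populated by the parser is returned.
    public static Metadata parse(String html, BodyContentHandler textSink) throws Exception {
        Metadata metadata = new Metadata();
        try (InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8))) {
            // BoilerpipeContentHandler wraps the real handler and forwards
            // only the blocks that boilerpipe classifies as main content.
            new HtmlParser().parse(in, new BoilerpipeContentHandler(textSink),
                    metadata, new ParseContext());
        }
        return metadata;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample page, for illustration only.
        String html = "<html><head><title>Sample page</title>"
                + "<meta name=\"robots\" content=\"noindex,nofollow\"/></head>"
                + "<body><p>This paragraph stands in for the main article text "
                + "of the page, which is what we would want to keep.</p></body></html>";

        BodyContentHandler text = new BodyContentHandler();
        Metadata metadata = parse(html, text);

        // Title as reported by Tika, no hand-written parsing needed.
        System.out.println("title:  " + metadata.get(TikaCoreProperties.TITLE));
        // Assumption: meta tags are stored under their name attribute.
        System.out.println("robots: " + metadata.get("robots"));
        System.out.println("text:   " + text.toString().trim());
    }
}
```

If this holds up against the Tika version we depend on, the droid could read 
the robots value from the Metadata instead of scanning the markup itself.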


        
