Use Tika to a fuller extent
---------------------------
Key: DROIDS-157
URL: https://issues.apache.org/jira/browse/DROIDS-157
Project: Droids
Issue Type: New Feature
Components: tika
Affects Versions: 0.0.2
Reporter: Richard Frovarp
Assignee: Richard Frovarp
We should be using Tika to a greater extent. Newer versions of Tika can do some
of the things we've written our own code for.
In addition, new content handlers can provide interesting data. The
BoilerpipeContentHandler will try to grab only the content that really matters.
The Metadata class can return values such as the title or the robots meta field
without our having to parse them out of the document ourselves.
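
As a rough illustration of the two points above, here is a minimal sketch that
runs an HTML string through Tika with a BoilerpipeContentHandler and then reads
the title and robots values from Metadata. Several things here are assumptions
rather than anything decided in this issue: the class name TikaMetadataSketch,
the Result holder, and the sample HTML are hypothetical; the import for
BoilerpipeContentHandler matches Tika 1.x (later releases moved it to
org.apache.tika.sax.boilerpipe); TikaCoreProperties.TITLE needs Tika 1.2 or
newer; and reading the robots value under the plain "robots" key relies on the
HTML parser copying meta tags into Metadata under their name attribute.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaCoreProperties;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.html.BoilerpipeContentHandler; // Tika 1.x package (assumption)
import org.apache.tika.parser.html.HtmlParser;
import org.apache.tika.sax.BodyContentHandler;

public class TikaMetadataSketch {

    // Hypothetical sample page, used only for illustration.
    static final String SAMPLE_HTML =
        "<html><head><title>Sample page</title>"
        + "<meta name=\"robots\" content=\"noindex, nofollow\"></head>"
        + "<body><div>Home | About | Contact</div>"
        + "<p>This paragraph stands in for the main article text that a "
        + "crawler would actually want to keep and index.</p>"
        + "</body></html>";

    // Hypothetical holder for the values pulled out of one document.
    static final class Result {
        final String title;
        final String robots;
        final String mainText;

        Result(String title, String robots, String mainText) {
            this.title = title;
            this.robots = robots;
            this.mainText = mainText;
        }
    }

    static Result extract(String html) throws Exception {
        Metadata metadata = new Metadata();

        // Collects body text; -1 disables the default write limit.
        BodyContentHandler text = new BodyContentHandler(-1);

        // Wraps the text handler so that only the blocks boilerpipe
        // classifies as main content are passed through to it.
        BoilerpipeContentHandler handler = new BoilerpipeContentHandler(text);

        byte[] bytes = html.getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            new HtmlParser().parse(in, handler, metadata, new ParseContext());
        }

        return new Result(
            metadata.get(TikaCoreProperties.TITLE), // from <title>
            metadata.get("robots"),                 // from <meta name="robots"> (assumed key)
            text.toString());
    }

    public static void main(String[] args) throws Exception {
        Result r = extract(SAMPLE_HTML);
        System.out.println("title:  " + r.title);
        System.out.println("robots: " + r.robots);
        System.out.println("text:   " + r.mainText.trim());
    }
}
```

What boilerpipe keeps for a page this small is hard to predict, so the text
output is best checked against real crawled pages. The metadata lookups are the
part that would let us drop hand-written parsing of the title and robots
directives.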