[ https://issues.apache.org/jira/browse/NUTCH-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124451#comment-13124451 ]
Andrzej Bialecki commented on NUTCH-1154:
------------------------------------------

The case for inclusion is here http://s.apache.org/vR :) that is, Tika 0.10 has several important improvements over 0.9.

With the attached patch all tests pass except TestRTFParser, due to an issue that has just been fixed in Tika trunk. The underlying problem is that our test document is malformed and Tika's new RTF parser wasn't robust enough to handle it. This means that for now we would have to disable this test, and re-enable it once we upgrade to Tika 1.0.

> Upgrade to Tika 0.10
> --------------------
>
> Key: NUTCH-1154
> URL: https://issues.apache.org/jira/browse/NUTCH-1154
> Project: Nutch
> Issue Type: Improvement
> Components: parser
> Affects Versions: 1.4
> Reporter: Andrzej Bialecki
> Attachments: NUTCH-1154.diff
>
>
> There have been significant improvements in Tika 0.10 and it would be nice to
> use the latest Tika in 1.4.