Hi Angela,

On Mon, Dec 8, 2014 at 12:02 AM, <[email protected]> wrote:
>
> Following tutorial
> https://svn.apache.org/repos/asf/nutch/trunk/src/plugin/parse-tika/howto_upgrade_tika.txt,
> I have downloaded the nutch trunk and built the Nutch to use a special tika
> (1.7-SNAPSHOT). However, the tika-parser cannot parse any document with the
> error that "Can't retrieve Tika parser for mime-type xxxx".
> If I change the tika version back to the default 1.6. Then the tika-parser
> works. Also, similar to posting
> http://www.mail-archive.com/user%40nutch.apache.org/msg12067.html, this
> problem could be avoided by running Nutch in the Eclipse instead of with
> shell. But anyone knows about the reasons of the problem? And maybe how to
> solve it? Many thanks.
>

The short answer is no. I don't know why this behavior occurs when we use SNAPSHOTs. It is puzzling.

http://mail-archives.apache.org/mod_mbox/nutch-user/201210.mbox/%[email protected]%3E

I've been aware of unpredictable results like what you are experiencing for a long time. This may even have something to do with how Ivy manages dependencies within and on behalf of Nutch. The artifacts we publish within Tika are Maven SNAPSHOTs, so there may be a mismatch there. If this were the case I would not be surprised, and it would not be the first time I've come across this.

We need to go DEEP here and DEBUG right down. That is all I can suggest, sorry.

Lewis
