Hmm, I'm not sure, but maybe we don't include all of the Tika parser dependencies in our build.xml?
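One quick way to test that guess is to look inside the jars that actually ship with the build and see which of them carry the CHM parser classes. Below is a minimal sketch, not something taken from Nutch itself. The function name `jars_containing` is made up for illustration. The package path `org/apache/tika/parser/chm/` and the plugin directory `plugins/parse-tika` are assumptions about where things live in this Tika/Nutch version, so adjust them to your checkout.

```python
import zipfile
from pathlib import Path


def jars_containing(root, entry_prefix):
    """Return paths of .jar files under `root` that hold at least one
    entry whose name starts with `entry_prefix`.

    Both arguments are supplied by the caller; nothing here is specific
    to Nutch or Tika.
    """
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                # Jar entries use forward slashes, e.g. org/apache/.../Foo.class
                if any(name.startswith(entry_prefix) for name in zf.namelist()):
                    hits.append(str(jar))
        except zipfile.BadZipFile:
            # Skip files that have a .jar suffix but are not valid archives.
            continue
    return hits
```

Calling `jars_containing("plugins/parse-tika", "org/apache/tika/parser/chm/")` (both values assumed, see above) lists the jars that contain classes in that package. An empty list would suggest the parser classes are missing from the build. A non-empty list would point towards a missing transitive dependency or a configuration problem instead.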
-----Original message-----
> From: Sebastian Nagel <wastl.na...@googlemail.com>
> Sent: Thu 09-Aug-2012 23:18
> To: user@nutch.apache.org
> Subject: Re: CHM Files and Tika
>
> Hi Jan,
>
> confirmed: Nutch cannot parse chm, while Tika (same version used by Nutch)
> can. The chm parsers are in tika-parser*.jar, which is contained
> in the Nutch package.
>
> Any ideas?
>
> Sebastian
>
> On 08/08/2012 12:03 PM, Jan Riewe wrote:
> > Hey there,
> >
> > I am trying to parse CHM (Microsoft Help Files) with Nutch, but I get:
> >
> > Can't retrieve Tika parser for mime-type application/vnd.ms-htmlhelp
> >
> > I've tried Nutch 1.4 (Tika 0.10) and Nutch 1.5.1 (Tika 1.1), which
> > should be able to parse those files:
> > https://issues.apache.org/jira/browse/TIKA-245
> >
> > In tika-mimetypes.xml I do find an entry for
> > application/vnd.ms-htmlhelp
> >
> > Has anyone run into the same issue and knows how to fix it?
> >
> > Bye
> > Jan