Thank you very much. This worked great and resolved the issue of
finding the parser.

One interesting thing: out of 10 PDF files, it crawled 2 files and
reported the other PDF files as unsuccessful. This has happened about 10
times so far.

I really need to debug this and add more detailed error messages than just
'unable to successfully parse content ..'
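
As a rough sketch of the kind of message I have in mind (the class
ParseFailureMessage, its methods, and the example URL are hypothetical and
not part of Nutch), the failure message could include the URL, the MIME
type, and the root cause of the exception:

```java
import java.io.IOException;

// Hypothetical helper, not part of Nutch: builds a more detailed
// parse-failure message than a bare "unable to parse" line.
final class ParseFailureMessage {

    private ParseFailureMessage() {}

    /** Walks the cause chain and returns the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    /** Formats the URL, the MIME type, and the root cause into one line. */
    static String describe(String url, String mimeType, Throwable error) {
        Throwable root = rootCause(error);
        return "Unable to parse " + url
            + " (type " + mimeType + "): "
            + root.getClass().getSimpleName() + ": " + root.getMessage();
    }

    public static void main(String[] args) {
        // Example failure: a wrapper exception hiding the real I/O problem.
        Exception failure = new RuntimeException("parse failed",
            new IOException("stream ended unexpectedly"));
        System.out.println(
            describe("http://example.com/a.pdf", "application/pdf", failure));
    }
}
```

Logging the root cause rather than only the outermost exception should make
it easier to see why some PDF files fail while others parse fine.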

Thanks again,
Kiran.

On Fri, Oct 26, 2012 at 4:16 AM, Julien Nioche <
lists.digitalpeb...@gmail.com> wrote:

> >
> > Is there anything wrong with my Eclipse configuration? I am looking to
> > debug some things in Nutch, so I am working with Eclipse and Nutch.
>
>
> It is easier to follow the steps in "Remote Debugging in Eclipse" from
> http://wiki.apache.org/nutch/RunNutchInEclipse
>
> It will save you all sorts of classpath issues. Note that this works
> in local mode only.
>
> HTH
>
> Julien
>
>
> On 25 October 2012 19:44, kiran chitturi <chitturikira...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I have built Nutch 2.x in Eclipse using this tutorial (
> > http://wiki.apache.org/nutch/RunNutchInEclipse), with some
> > modifications.
> >
> > It's able to parse HTML files successfully, but when it comes to PDF files
> > it says:
> >
> > 2012-10-25 14:37:05,071 ERROR tika.TikaParser - Can't retrieve Tika
> > parser for mime-type application/pdf
> >
> > Is there anything wrong with my Eclipse configuration? I am looking to
> > debug some things in Nutch, so I am working with Eclipse and Nutch.
> >
> > Do I need to point Eclipse to any libraries for it to recognize the Tika
> > parsers for the application/pdf type?
> >
> > What exactly is the reason for this type of error appearing only for PDF
> > files and not for HTML files? I am using a recent Nutch 2.x, which has Tika
> > upgraded to 1.2.
> >
> > I would like some help here, and would like to know if anyone has
> > encountered a similar problem with Eclipse, Nutch 2.x and parsing
> > application/pdf files.
> >
> > Many Thanks,
> > --
> > Kiran Chitturi
> >
>
>
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble
>



-- 
Kiran Chitturi
