Well, this is embarrassing....

If I execute 'ant runtime', then apply the patch and run 'ant runtime'
again, the runtime/local/plugins/lib-nekohtml/ folder contains both
nekohtml-0.9.5.jar and nekohtml-1.9.17.jar, and I suspect that the file
that's actually loaded is 0.9.5.
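
For anyone who wants to check for this situation without eyeballing the
folder, here is a minimal sketch in Python. It assumes the jars follow the
usual artifact-version.jar naming; the function name find_duplicate_jars is
made up for this example and is not part of Nutch.

```python
import re
from collections import defaultdict
from pathlib import Path

# Splits a name such as "nekohtml-1.9.17.jar" into artifact and version.
# Assumption: the version starts with a digit and follows the last
# "-<digit>" boundary in the file name.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.]*)\.jar$")


def find_duplicate_jars(plugin_dir):
    """Return {artifact: [versions]} for artifacts present in several versions."""
    versions = defaultdict(list)
    for jar in sorted(Path(plugin_dir).glob("*.jar")):
        match = JAR_NAME.match(jar.name)
        if match:
            versions[match["artifact"]].append(match["version"])
    # Keep only artifacts that appear more than once.
    return {name: found for name, found in versions.items() if len(found) > 1}


if __name__ == "__main__":
    # Path taken from this thread; adjust it to your checkout.
    print(find_duplicate_jars("runtime/local/plugins/lib-nekohtml"))
```

A non-empty result means more than one version of the same library is
sitting in the plugin folder.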


If I execute 'ant clean' followed by 'ant runtime' (the patch was already
applied), the only file that's present is nekohtml-1.9.17.jar and I'm not
experiencing the error.
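
To confirm which jar(s) in the folder actually provide the missing class, a
jar can be opened as a zip archive and searched for the class entry. This is
only a sketch; the helper name jars_containing is hypothetical, and the
class name is the one from the exception quoted below.

```python
import zipfile
from pathlib import Path


def jars_containing(plugin_dir, class_name):
    """Return the names of the jars in plugin_dir that contain class_name."""
    # A class a.b.C is stored in a jar under the entry "a/b/C.class".
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(plugin_dir).glob("*.jar")):
        with zipfile.ZipFile(jar) as archive:
            if entry in archive.namelist():
                hits.append(jar.name)
    return hits


if __name__ == "__main__":
    # Path and class name taken from this thread; adjust to your checkout.
    print(jars_containing(
        "runtime/local/plugins/lib-nekohtml",
        "org.cyberneko.html.parsers.DOMFragmentParser",
    ))
```

If a jar you expect to see is missing from the output, that jar does not
contain the class.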


I would be glad to know why the 'ant clean' step is necessary and why the
changes are not picked up by 'ant runtime' alone.


On Mon, Jan 20, 2014 at 1:54 PM, Lewis John Mcgibbney <
[email protected]> wrote:

> Hi d_k,
>
> On Mon, Jan 20, 2014 at 11:39 AM, <[email protected]>
> wrote:
>
> >
> > Posting back as promised. :-)
> >
>
> Great
>
>
> >
> > I just encountered the error "java.lang.NoClassDefFoundError:
> > org/cyberneko/html/parsers/DOMFragmentParser" and applied the patch
> > NUTCH-1253-2.x-v2.patch from NUTCH-1253 and executed 'ant runtime' and upon
> > running './nutch parse -all' (after injecting/generating/fetching) the
> > error did not go away and I still got the exception.
> >
>
> OK so a few things here please.
> I see that the patch introduces trace logging in HtmlParser.class, are you
> able to change this to debug, then also set
>
> log4j.logger.org.apache.nutch.parse.ParserJob=INFO,cmdstdout
>
> to
>
> log4j.logger.org.apache.nutch.parse.ParserJob=DEBUG,cmdstdout
>
> in log4j.properties, this should hopefully remove the likelihood of trace
> logging setting this off.
>
> Can you confirm if DOMFragmentParser actually exists within the new
> nekohtml artifact 1.9.17 and that the old version is not present and being
> loaded instead? If the old version is present then you will need to manually
> remove it or alternatively force its removal with the ant clean target prior
> to invoking the runtime target.
>
> Finally, it may be worth taking a look into hadoop.log to determine
> which URL(s) this error stems from. Can you post the relevant section of
> your log?
> Thank you
> Lewis
>
