Well, this is embarrassing... If I execute 'ant runtime', apply the patch, and run 'ant runtime' again, the runtime/local/plugins/lib-nekohtml/ folder contains both nekohtml-0.9.5.jar and nekohtml-1.9.17.jar, and I suspect that the file that's actually loaded is 0.9.5.

If I execute 'ant clean' followed by 'ant runtime' (with the patch already applied), the only file present is nekohtml-1.9.17.jar and I'm not experiencing the error.

I would be glad to know why the 'ant clean' step is necessary and why the changes are not picked up by 'ant runtime' alone.

On Mon, Jan 20, 2014 at 1:54 PM, Lewis John Mcgibbney <[email protected]> wrote:

> Hi d_k,
>
> On Mon, Jan 20, 2014 at 11:39 AM, <[email protected]> wrote:
>
> > Posting back as promised. :-)
>
> Great
>
> > I just encountered the error "java.lang.NoClassDefFoundError:
> > org/cyberneko/html/parsers/DOMFragmentParser" and applied the patch
> > NUTCH-1253-2.x-v2.patch from NUTCH-1253 and executed 'ant runtime' and
> > upon running './nutch parse -all' (after injecting/generating/fetching)
> > the error did not go away and I still got the exception.
>
> OK so a few things here please.
> I see that the patch introduces trace logging in HtmlParser.class, are you
> able to change this to debug, then also set
>
> log4j.logger.org.apache.nutch.parse.ParserJob=INFO,cmdstdout
>
> to
>
> log4j.logger.org.apache.nutch.parse.ParserJob=DEBUG,cmdstdout
>
> in log4j.properties, this should hopefully remove the likelihood of trace
> logging setting this off.
>
> Can you confirm if DOMFragmentParser actually exists within the new
> nekohtml artifact 1.9.17 and that the old version is not present and being
> loaded instead. If this is the case then you will need to manually remove
> it or alternatively force its removal with the ant clean target prior to
> invoking runtime target.
>
> Finally, it may be worth taking a look into the hadoop.log to determine
> which URL(s) this error stems from? Can you post the relevant section of
> your log?
> Thank you
> Lewis
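
For anyone hitting the same thing, here is a minimal sketch of how the duplicate-jar situation could be checked from the shell. It assumes the runtime/local/plugins/lib-nekohtml/ path mentioned above; the function name check_nekohtml_jars is made up for illustration and is not part of Nutch.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): list the nekohtml jars in a plugin
# directory and report failure when more than one version is present.
check_nekohtml_jars() {
    # Default path is the one from this thread; pass another directory as $1.
    dir="${1:-runtime/local/plugins/lib-nekohtml}"
    count=0
    for jar in "$dir"/nekohtml-*.jar; do
        # Skip the unexpanded pattern when no jar matches.
        [ -e "$jar" ] || continue
        echo "found: $(basename "$jar")"
        count=$((count + 1))
    done
    if [ "$count" -gt 1 ]; then
        echo "more than one nekohtml jar: try 'ant clean' before 'ant runtime'"
        return 1
    fi
    return 0
}
```

Calling check_nekohtml_jars from the Nutch source root after 'ant runtime' returns non-zero when both jars are there. To confirm that the class is inside the new jar, something like unzip -l on nekohtml-1.9.17.jar piped through grep DOMFragmentParser should list org/cyberneko/html/parsers/DOMFragmentParser.class.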

