I have applied the patch mentioned in that thread. I have not made any
changes to parse-plugins.xml, such as
<mimeType name="text/html">
<plugin id="parse-tika" />
</mimeType>
<mimeType name="application/xhtml+xml">
<plugin id="parse-tika" />
</mimeType>
...
which were also not part of the patch. I have rebuilt the source, but I am
still getting this error:
NoClassDefFoundError: org/cyberneko/html/parsers/DOMFragmentParser
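To narrow this down, a small check along these lines could confirm whether the class is visible at all. This is only a sketch: ClasspathCheck and isClassAvailable are hypothetical names, not part of Nutch, and it only inspects the classpath it is run with, which may differ from the class loader Nutch uses for its plugins.

```java
// Hypothetical helper, not part of Nutch: reports whether a class can be
// located on the classpath this program is started with.
public class ClasspathCheck {

    // Returns true if the named class is visible to this class's loader.
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the error; another name can be passed as an argument.
        String name = args.length > 0
                ? args[0]
                : "org.cyberneko.html.parsers.DOMFragmentParser";
        System.out.println(name + (isClassAvailable(name) ? " found" : " NOT found"));
    }
}
```

Running it with the same classpath as the failing command should show whether the jar that provides the class is being picked up there.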
Thanks,
Tony.
On Fri, Jun 14, 2013 at 9:03 PM, feng lu <[email protected]> wrote:
> Did you rebuild Nutch from source after applying the patch?
>
> Maybe this reference can help you.
>
>
> http://lucene.472066.n3.nabble.com/Parsing-error-java-lang-NoClassDefFoundError-org-cyberneko-html-LostText-td4029958.html
>
>
> On Fri, Jun 14, 2013 at 10:43 PM, Tony Mullins <[email protected]> wrote:
>
> > Hi,
> >
> > I was getting this error
> > "
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/cyberneko/html/parsers/DOMFragmentParser
> > at org.apache.nutch.parse.html.HtmlParser.parseNeko(HtmlParser.java:255)
> > at org.apache.nutch.parse.html.HtmlParser.parse(HtmlParser.java:238)
> > ....."
> >
> > and then I applied this patch:
> > https://issues.apache.org/jira/browse/NUTCH-1253
> >
> > And even after the patch I am still getting this error
> >
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/cyberneko/html/parsers/DOMFragmentParser
> > at org.apache.nutch.parse.html.HtmlParser.parseNeko(HtmlParser.java:257)
> > at org.apache.nutch.parse.html.HtmlParser.parse(HtmlParser.java:238)
> > at org.apache.nutch.parse.html.HtmlParser.getParse(HtmlParser.java:173)
> > at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:131)
> > at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:146)
> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:197)
> > Caused by: java.lang.ClassNotFoundException:
> > org.cyberneko.html.parsers.DOMFragmentParser
> >
> > Any idea how to resolve this issue?
> >
> > Thanks,
> > Tony.
> >
>
>
>
> --
> Don't Grow Old, Grow Up... :-)
>