Hi Lewis,

Thanks for your feedback. I went through the process step by step and I'm still getting the error.
My plugins folder looks like this:

[image: Inline image 1]

When I ran the parse job, it gave me this:

[image: Inline image 2]

When I look at the log file, I get this:

[image: Inline image 3]

My nutch-site.xml contains this:

<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
  <description>Regular expression naming plugin directory names to
  include. Any plugin not matching this expression is excluded.
  In any case you need at least include the nutch-extensionpoints plugin. By
  default Nutch includes crawling just HTML and plain text via HTTP,
  and basic indexing and search plugins. In order to use HTTPS please enable
  protocol-httpclient, but be aware of possible intermittent problems with the
  underlying commons-httpclient library.
  </description>
</property>

Am I missing something else?

Thanks for your precious help.

Arcondo.

On Thu, Jan 3, 2013 at 11:20 PM, Lewis John Mcgibbney <[email protected]> wrote:

> Hi Arcondo,
>
> The nekohtml jar should be version 0.9.5, and should reside in
> build/plugins/lib-nekohtml once you build Nutch from source.
> Once you use the default 'runtime' target, the corresponding plugins
> folders should be copied into runtime/local/plugins.
> Can you check that the jar is copied to this directory before attempting to
> parse the URLs in your segment(s), if using 1.x?
> I'm also assuming that you have parse-html included in the plugin.includes
> property within nutch-site.xml before building the source.
>
> Lewis
>
> On Thu, Jan 3, 2013 at 9:11 PM, Arcondo Dasilva
> <[email protected]> wrote:
>
> > Thanks for the explanation. I'm more a functional guy with no solid
> > background in Java.
> > Could you give some details on how to enforce it manually?
> >
> > Thanks in advance, Arcondo
> >
> > On Thu, Jan 3, 2013 at 2:49 PM, Lewis John Mcgibbney <
> > [email protected]> wrote:
> >
> > > the jar is not on the classpath
>
> --
> *Lewis*
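
For anyone following the thread, the check Lewis describes (is the nekohtml jar present under runtime/local/plugins/lib-nekohtml?) could be sketched in Python roughly as below. This is only a sketch under assumptions: the function name find_plugin_jars is made up for illustration, the Nutch directory is whatever path you pass in, and the jar is assumed to have a file name starting with "nekohtml" (the exact file name is not given in the thread).

```python
from pathlib import Path
import sys


def find_plugin_jars(nutch_home, plugin="lib-nekohtml", prefix="nekohtml"):
    """Return jar files in runtime/local/plugins/<plugin> whose names start with prefix.

    nutch_home is the top-level Nutch source directory (an assumption: adjust
    it to wherever you built Nutch). An empty list means the plugin folder is
    missing or contains no matching jar.
    """
    plugin_dir = Path(nutch_home) / "runtime" / "local" / "plugins" / plugin
    if not plugin_dir.is_dir():
        return []
    # Keep only jars whose file name starts with the assumed prefix.
    return sorted(p for p in plugin_dir.glob("*.jar") if p.name.startswith(prefix))


if __name__ == "__main__":
    # Usage: python check_jar.py /path/to/nutch
    home = sys.argv[1] if len(sys.argv) > 1 else "."
    jars = find_plugin_jars(home)
    if jars:
        for jar in jars:
            print("found:", jar)
    else:
        print("no nekohtml jar found under", Path(home) / "runtime" / "local" / "plugins")
```

If the script reports no jar, the plugin folder was probably not copied by the 'runtime' target, which would be consistent with the "jar is not on the classpath" diagnosis above. It only checks that the file exists; it does not verify the jar version.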

