Hi,

I'm interested in building a Nutch plugin, but I am having trouble getting the example "recommended" plugin to work. I followed all of the steps in http://wiki.apache.org/nutch/WritingPluginExample-0%2e9 and, after running the top-level ant, confirmed that build/plugins/recommended contained the plugin.xml and jar file for the 'recommended' plugin. I then crawled a single page from a local webserver that contains the test content from the example (with the "recommended" meta tag). The page was crawled and indexed, and I can search for it, but the "explain" search link shows no evidence of any rank boosting, and NUTCHDIR/logs/hadoop.log gives no indication that the recommended filter was loaded by the crawl.

If anyone has suggestions I'd appreciate hearing them.

Also, I noticed a couple of things on the example wiki page that I didn't understand or that looked odd:

1. In the section on "Getting Ant to Compile Your Plugin", it says to add the following line to NUTCHDIR/src/plugin/build.xml:
<ant dir="reccomended" target="deploy" />

There's an extra "c" in "reccomended" (a typo). I fixed my local copy before I ran the crawl; I'm mentioning it in case you want to update the wiki, since I don't want to edit it myself until I have actually gotten the plugin working.
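
For reference, this is the line with the spelling corrected so that the dir attribute matches the plugin's directory name. It is only the wiki's line with the extra "c" removed; nothing else is changed.

```xml
<!-- Wiki line with the "reccomended" typo corrected -->
<ant dir="recommended" target="deploy" />
```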

2. In the section on "Getting Nutch to Use Your Plugin", it says to add the plugin's id to the regex of included plugins, using the example:
<value>recommended|protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>

But the <description> just above this part says that you need to include at least the nutch-extensionpoints plugin, which is not present in this line. I noticed in the wiki edit history that you used to have the nutch-extensionpoints plugin in there and removed it, so I'm not sure which way it's supposed to be. Which is correct?

(I tried it both with and without nutch-extensionpoints, and neither way worked for me.)
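
For concreteness, the variant with nutch-extensionpoints looks roughly like the sketch below. The property name plugin.includes and the conf/nutch-site.xml location are my understanding of what the wiki section refers to, and the position of nutch-extensionpoints within the regex is my own guess, so treat all of these as assumptions rather than the confirmed setup.

```xml
<!-- Sketch of an override in conf/nutch-site.xml (assumed location) -->
<property>
  <!-- Property name assumed to be plugin.includes -->
  <name>plugin.includes</name>
  <!-- Same value as the wiki example, plus nutch-extensionpoints; its position in the regex is an assumption -->
  <value>recommended|nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```

The variant without nutch-extensionpoints is the same property with the wiki's value unchanged.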

Thanks
 - Mike Schwartz
