Hi All,
I am working through the example for building a Nutch plugin here:
http://wiki.apache.org/nutch/WritingPluginExample-0.9
I am using nutch-0.9 on Windows XP.
I have set everything up exactly as described, and some parts do work. The
HTML parser extension grabs the contents of the recommended meta tag and adds
them to the document being parsed. The indexing filter extension also works:
it adds the field 'recommended' to the Lucene index with the content of the
meta tag. I used Luke to confirm that the meta tag content is in the new
recommended field, and querying on that content in Luke works too.
However, the query filter does not do what it is supposed to. My query
filter code looks like this:
package org.apache.nutch.parse.recommended;

import org.apache.nutch.searcher.FieldQueryFilter;

// Commons imports
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import org.apache.hadoop.conf.Configuration;

public class RecommendedQueryFilter extends FieldQueryFilter {

  // Log under this class's own name rather than RecommendedParser's
  private static final Log LOG =
      LogFactory.getLog(RecommendedQueryFilter.class.getName());

  public RecommendedQueryFilter() {
    // Field name "recommended" with a boost of 5
    super("recommended", 5f);
    LOG.info("Added a recommended query");
  }

  public void setConf(Configuration conf) {
    super.setConf(conf);
  }
}
My nutch-site.xml looks like this:
<property>
  <name>plugin.includes</name>
  <value>recommended|nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url|recommendedSearcher)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
<property>
  <name>searcher.dir</name>
  <value>C:\nutch-0.9\yatry1\</value>
</property>
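One thing that can be checked in isolation is whether a given plugin id is actually matched by the plugin.includes value above. The sketch below assumes that Nutch treats this value as a regular expression that has to match a whole plugin id; that is my understanding of the plugin loader, not something confirmed here. The class name PluginIncludesCheck and the helper isIncluded are made up for illustration and are not part of Nutch.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: tests plugin ids against the
// plugin.includes value from nutch-site.xml, assuming whole-id regex matching.
public class PluginIncludesCheck {

    // Copied verbatim from the plugin.includes property above.
    static final String INCLUDES =
        "recommended|nutch-extensionpoints|protocol-http|urlfilter-regex|"
        + "parse-(text|html|js)|index-basic|"
        + "query-(basic|site|url|recommendedSearcher)|summary-basic|"
        + "scoring-opic|urlnormalizer-(pass|regex|basic)";

    private static final Pattern PATTERN = Pattern.compile(INCLUDES);

    // True if the whole plugin id matches the includes expression.
    static boolean isIncluded(String pluginId) {
        return PATTERN.matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        String[] ids = {
            "recommended",               // plugin id declared in plugin.xml
            "recommendedSearcher",       // last part of the extension id
            "query-recommendedSearcher", // id the query-(...) group would need
            "query-basic"
        };
        for (String id : ids) {
            System.out.println(id + " included: " + isIncluded(id));
        }
    }
}
```

Under that assumption, the plugin id "recommended" is already covered by the first alternative, while the "recommendedSearcher" entry inside query-(...) would only match a plugin whose id is literally "query-recommendedSearcher".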
And my plugin.xml file looks like:
<?xml version="1.0" encoding="UTF-8"?>
<plugin
   id="recommended"
   name="Recommended Parser/Filter"
   version="0.0.1"
   provider-name="nutch.org">

   <runtime>
      <!-- As defined in build.xml this plugin will end up bundled as
           recommended.jar -->
      <library name="recommended.jar">
         <export name="*"/>
      </library>
   </runtime>

   <!-- The RecommendedParser extends the HtmlParseFilter to grab the
        contents of any recommended meta tags -->
   <extension id="org.apache.nutch.parse.recommended.recommendedfilter"
              name="Recommended Parser"
              point="org.apache.nutch.parse.HtmlParseFilter">
      <implementation id="RecommendedParser"
                      class="org.apache.nutch.parse.recommended.RecommendedParser"/>
   </extension>

   <!-- The RecommendedIndexer extends the IndexingFilter in order to add
        the contents of the recommended meta tags (as found by the
        RecommendedParser) to the lucene index. -->
   <extension id="org.apache.nutch.parse.recommended.recommendedindexer"
              name="Recommended identifier filter"
              point="org.apache.nutch.indexer.IndexingFilter">
      <implementation id="RecommendedIndexer"
                      class="org.apache.nutch.parse.recommended.RecommendedIndexer"/>
   </extension>

   <!-- The RecommendedQueryFilter gets called when you perform a search.
        It runs a search for the user's query against the recommended
        fields. In order to add this to the list of filters that gets run
        by default, you have to use "fields=DEFAULT". -->
   <extension id="org.apache.nutch.parse.recommended.recommendedSearcher"
              name="Recommended Search Query Filter"
              point="org.apache.nutch.searcher.QueryFilter">
      <implementation id="RecommendedQueryFilter"
                      class="org.apache.nutch.parse.recommended.RecommendedQueryFilter">
         <parameter name="fields" value="recommended"/>
      </implementation>
   </extension>

</plugin>
After making my changes, I build Nutch with ant and deploy a new war file.
I am running out of ideas about what could be wrong. Any ideas or clues
worth exploring would be greatly appreciated.
Thanks,
Rahul