OK, so now the plugin is working: it changes the analyzer to the
SnowballAnalyzer. But when I parse the query, some letters end up being
stripped. For instance, if I search for "exchanges" it gets turned into
"exchang", and of course I don't get any results. What could be the
cause of this? As far as I can see, the SnowballAnalyzer is being loaded
and used for crawling. How can I make sure that this analyzer is used by
Nutch for both querying and crawling? Do I need to modify any Nutch
classes, or maybe I need something extra in my plugin code? This is
really confusing; I hope someone can help me.
Here's the code for the snowball analyzer plugin I'm using:
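import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.util.Version;
import org.apache.nutch.analysis.NutchAnalyzer;

public class SnowballAnalyzer extends NutchAnalyzer {

    // Copy Lucene's default English stop words into the String[] form
    // that the SnowballAnalyzer constructor below expects.
    private static final String[] stopWords;
    static {
        stopWords = new String[StopAnalyzer.ENGLISH_STOP_WORDS_SET.size()];
        int counter = 0;
        for (Object o : StopAnalyzer.ENGLISH_STOP_WORDS_SET) {
            stopWords[counter++] = o.toString();
        }
    }

    // Shared Lucene analyzer that does the stop-word removal and stemming.
    private static final Analyzer ANALYZER =
        new org.apache.lucene.analysis.snowball.SnowballAnalyzer(
            Version.LUCENE_CURRENT, "English", stopWords);

    /** Creates a new instance of SnowballAnalyzer */
    public SnowballAnalyzer() {
    }

    public TokenStream tokenStream(String fieldName, Reader reader) {
        return ANALYZER.tokenStream(fieldName, reader);
    }
}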
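
To make my suspicion concrete, here is a toy sketch in plain Java (no
Lucene or Nutch involved). The names StemMatchDemo, toyStem and buildIndex
are made up for illustration, and the suffix rule is only a crude stand-in
for what Snowball does. It shows what I think may be happening: a stemmed
query term only matches if the index terms were stemmed the same way.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: toyStem is a made-up suffix stripper,
// not the Snowball algorithm.
class StemMatchDemo {

    // Hypothetical stand-in for a stemmer: lowercases the word, then
    // drops a trailing "es", or else a trailing "s" or "e".
    static String toyStem(String word) {
        String w = word.toLowerCase();
        if (w.length() > 4 && w.endsWith("es")) {
            return w.substring(0, w.length() - 2);
        }
        if (w.length() > 3 && (w.endsWith("s") || w.endsWith("e"))) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    // Builds the set of index terms, optionally stemming each token first.
    static Set<String> buildIndex(List<String> tokens, boolean stem) {
        Set<String> terms = new HashSet<>();
        for (String t : tokens) {
            terms.add(stem ? toyStem(t) : t.toLowerCase());
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("stock", "exchange", "rates");
        Set<String> stemmedIndex = buildIndex(doc, true);
        Set<String> rawIndex = buildIndex(doc, false);

        // The query goes through the same toy stemmer as the stemmed index.
        String query = toyStem("exchanges");

        System.out.println(query);
        // Same analysis on both sides: the term is found.
        System.out.println(stemmedIndex.contains(query));
        // Stemmed query against an unstemmed index: no match.
        System.out.println(rawIndex.contains(query));
    }
}
```

If that is what is going on in my setup, then the index and the query are
not going through the same analyzer, but I don't know how to confirm that
in Nutch.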
Thanks.
On 19 August 2010 21:09, Roger Marin <[email protected]> wrote:
> Hello,
>
> Is it possible to change the Lucene analyzer that Nutch uses by default? I
> would like to use the Snowball analyzer to search and crawl. I tried
> creating a plugin based on the analysis-fr and analysis-de plugins, but it
> didn't work; I'm not sure if I need to create a plugin for querying too.
> I would also like to enable stemming, but I cannot find any info on this.
> Do I need to modify source code? Configuration files?
>
> I appreciate any help you can give me, thanks.
>
>
> Roger Marin
>