Hi, thanks for your replies.
I have tested the SnowballFilter, and it does not stem the term "support". That means the term "support" should be found in all the papers. However, when I add the SynonymFilter, "support" goes missing. I think I have to read the Lucene source code again.

Yours truly,
Jiang Xing

On 1/17/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
> On Jan 17, 2006, at 12:14 AM, jason wrote:
> > It is adding tokens into the same position as the original token.
> > And then,
> > I used the QueryParser for searching and the snowball analyzer for
> > parsing.
>
> Ok, so you're only using the SynonymAnalyzer for indexing, and the
> SnowballAnalyzer for QueryParser, correct? If so, that is reasonable.
>
> > public TokenStream tokenStream(String fieldName, Reader reader){
> >
> >     TokenStream result = new StandardTokenizer(reader);
> >     result = new StandardFilter(result);
> >     result = new LowerCaseFilter(result);
> >     if (stopword != null){
> >         result = new StopFilter(result, stopword);
> >     }
> >
> >     result = new SnowballFilter(result, "Lovins");
> >
> >     result = new SynonymFilter(result, engine);
> >
> >     return result;
> > }
> >
> > }
> > I wrote some code in the SnowballFilter (line 75-79). If I only
> > used the
> > SnowballFilter, the term "support" can be found in all the 17
> > documents.
> > However, if the code "result = new SynonymFilter(result, engine);"
> > is used,
> > the term "support" cannot be found in some documents.
>
>
> It looks like you borrowed SynonymAnalyzer from the Lucene in Action
> code. But you've tweaked some things. One thing that is clearly
> amiss is that you're looking up synonyms for stemmed words, which is
> not going to work (unless you stemmed the WordNet words beforehand,
> but I doubt you did that and it would be quite odd to do so). You're
> probably not injecting many synonyms at all.
>
> I encourage you to "analyze your analyzer" by running some utilities
> such as the Analyzer demo that comes with Lucene in Action's code.
> You'll have some more insight into this issue when trying this out in
> isolation from query parsing and other complexities.
>
> > /** Returns the next input Token, after being stemmed */
> > public final Token next() throws IOException {
> >     Token token = input.next();
> >     if (token == null)
> >         return null;
> >     stemmer.setCurrent(token.termText());
> >     try {
> >         stemMethod.invoke(stemmer, EMPTY_ARGS);
> >     } catch (Exception e) {
> >         throw new RuntimeException(e.toString());
> >     }
> >
> >     Token newToken = new Token(stemmer.getCurrent(),
> >             token.startOffset(), token.endOffset(),
> >             token.type());
> >     //check the tokens.
> >     if(newToken.termText().equals("support")){
> >         System.out.println("the term support is found");
> >     }
>
> I'm not sure what the exact solution to your dilemma is, but doing
> more testing with your analyzer will likely shed light on it for you.
>
> Erik
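
To illustrate the ordering problem Erik describes (synonyms being looked up for words that have already been stemmed), here is a minimal sketch in plain Java. It does not use Lucene at all: the stem method, the one-entry synonym map, and the class name FilterOrderSketch are hypothetical stand-ins for SnowballFilter and the WordNet-backed SynonymFilter, chosen only so the effect can be seen in isolation. It does not attempt to explain why "support" itself disappears from some documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FilterOrderSketch {

    // Hypothetical synonym dictionary keyed by UNSTEMMED words,
    // the way a WordNet-style lookup would be.
    static final Map<String, List<String>> SYNONYMS = new HashMap<>();
    static {
        SYNONYMS.put("supporting", Arrays.asList("backing"));
    }

    // Toy stemmer (an assumption, NOT the Lovins algorithm):
    // strips a trailing "ing" or a trailing "s".
    static String stem(String term) {
        if (term.endsWith("ing") && term.length() > 5) {
            return term.substring(0, term.length() - 3);
        }
        if (term.endsWith("s") && term.length() > 3) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    // Order used in the quoted analyzer: stem first, then look up synonyms.
    // The lookup sees the stem, which is usually not a dictionary key.
    static List<String> stemThenSynonyms(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String stemmed = stem(token);
            out.add(stemmed);
            out.addAll(SYNONYMS.getOrDefault(stemmed, Collections.<String>emptyList()));
        }
        return out;
    }

    // Alternative order: look up synonyms on the original word,
    // then stem the word and each injected synonym.
    static List<String> synonymsThenStem(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(stem(token));
            for (String synonym : SYNONYMS.getOrDefault(token, Collections.<String>emptyList())) {
                out.add(stem(synonym));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("supporting", "papers");
        System.out.println("stem, then synonyms: " + stemThenSynonyms(tokens));
        System.out.println("synonyms, then stem: " + synonymsThenStem(tokens));
    }
}
```

With the stem-first order, the toy dictionary is never hit and no synonym is injected; with the synonym-first order, the synonym is found and then stemmed along with the original term. In a real analyzer the same comparison can be made by printing the tokens each chain produces for a sample sentence.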