What is the appropriate way to get both stopword removal and stemming 
(including removal of stemmed stopwords) when the MoreLikeThis class is 
used? My analyzer (set via MoreLikeThis.setAnalyzer) applies the Snowball 
filter and is initialized with a stopwords set:

analyzer = new StandardAnalyzer(stopwords) {
    public TokenStream tokenStream(String fieldName, java.io.Reader reader) {
        return new SnowballFilter(super.tokenStream(fieldName, reader), "English");
    }
};



If I do NOT supply a separate stopwords list to the MoreLikeThis object 
(that is, if I never call MoreLikeThis.setStopWords), will "the right 
thing" happen? That is, will the input text I give to MoreLikeThis be 
stemmed, and will (stemmed) stopwords be removed, before a query is formed? 
It seems that MoreLikeThis.setStopWords does a simple lookup of words in 
the stopwords list, with no stemming, which is not what I want.
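
To make the concern concrete, here is a minimal sketch in plain Java (no Lucene on the classpath). Everything in it is hypothetical: StopStemSketch, toyStem, analyze and dropByLookup are illustration names, and toyStem is a toy stand-in that is NOT the Snowball algorithm. It assumes that a stop list applied after analysis is compared against already-stemmed tokens by exact match, which is how MoreLikeThis.setStopWords appears to behave.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopStemSketch {

    // Toy stand-in for a real stemmer (assumption, NOT Snowball):
    // strips a single trailing "s" from words longer than three letters.
    static String toyStem(String word) {
        return (word.length() > 3 && word.endsWith("s"))
                ? word.substring(0, word.length() - 1)
                : word;
    }

    // Mimics the analyzer chain in the question: lower-case the text,
    // drop stop words by exact match, and only then stem what is left.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String raw : text.toLowerCase().split("\\s+")) {
            if (raw.isEmpty() || stopWords.contains(raw)) {
                continue;
            }
            out.add(toyStem(raw));
        }
        return out;
    }

    // Mimics a plain post-analysis lookup: exact match, no stemming.
    static List<String> dropByLookup(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "others"));
        String text = "The others like cats";

        // Case 1: stop words removed inside the analyzer, before stemming.
        System.out.println(analyze(text, stop));

        // Case 2: analyzer only stems; the unstemmed stop list is applied
        // afterwards, so "others" (now stemmed to "other") slips through.
        List<String> stemmed = analyze(text, new HashSet<String>());
        System.out.println(dropByLookup(stemmed, stop));

        // Case 3: stem the stop list as well before the lookup.
        Set<String> stemmedStop = new HashSet<>();
        for (String word : stop) {
            stemmedStop.add(toyStem(word));
        }
        System.out.println(dropByLookup(stemmed, stemmedStop));
    }
}
```

Under these assumptions, case 1 corresponds to relying on the analyzer alone, case 2 is the behavior I want to avoid, and case 3 is the workaround I would need if a separate stop list were required.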

Thanks in advance
Donna


Donna L. Gresh
Services Research, Mathematical Sciences Department
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh
[EMAIL PROTECTED]
