It depends on the order of the filters in your Analyzer. You would want
to be sure you put the StopWord filter before the Stemming filter. The
reason that the MoreLikeThis class does not do as you want is that first
it applies the Analyzer (which stems) and THEN it applies its custom
stop word
I wasn't sure that this:
> Instead add the stopwords to the analyzer that
> you pass to MoreLikeThis. That way you can ensure that the analyzer
> applies the stopword list before stemming
would work, because I don't want to provide all the variants of the
stopword list; if I do this, only the one pr
Sounds right to me.
The other option I think you have is to not use the MoreLikeThis
stopword functionality. Instead add the stopwords to the analyzer that
you pass to MoreLikeThis. That way you can ensure that the analyzer
applies the stopword list before stemming (The MoreLikeThis stopword
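
To make the ordering point concrete, here is a minimal, self-contained sketch. It does not use Lucene at all: the class name, the helper methods, and the toy suffix-stripping stemmer are all made up for illustration (the stemmer is deliberately crude and is not the Snowball algorithm). It only shows why a stop list of unstemmed words stops matching once it is applied after stemming.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical demo class; nothing here is part of Lucene.
class StopBeforeStemDemo {

    // Toy stemmer for illustration only: strips a trailing "ing" or "s".
    static String toyStem(String token) {
        if (token.length() > 4 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Lowercase and split on runs of non-word characters.
    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Stem first, then check the stop list (the order described for MoreLikeThis).
    static List<String> stemThenStop(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String t : tokenize(text)) {
            String stemmed = toyStem(t);
            if (!stopWords.contains(stemmed)) {
                out.add(stemmed);
            }
        }
        return out;
    }

    // Check the stop list on the raw token, then stem what survives.
    static List<String> stopThenStem(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String t : tokenize(text)) {
            if (!stopWords.contains(t)) {
                out.add(toyStem(t));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("meetings"));
        String text = "Meetings about quarterly budgets";

        // "meetings" is stemmed to "meeting" before the lookup, so it slips through.
        System.out.println(stemThenStop(text, stopWords)); // [meeting, about, quarterly, budget]

        // The raw token "meetings" matches the stop list and is dropped.
        System.out.println(stopThenStem(text, stopWords)); // [about, quarterly, budget]
    }
}
```

Note that in this toy model the stop-then-stem order only removes the exact surface forms in the list: with the stop list above, the singular "meeting" would still get through.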
Could those "in the know" comment on my current understanding of stemming
and stopwords using the Snowball analyzer?
In my application, I am using the MoreLikeThis class to find similar
documents to an input "text blob". There are words in the input text blob
which are "uninteresting" for my ap