I'd make it easy for myself. Generate (programmatically) a list like the one you showed, but for a _lot_ more terms, send it to your customer, and let _them_ pick. Unfortunately, the customer has no idea what "aggressive" means (for that matter, I don't know how Porter handles specific words either; I always have to try it). By putting concrete examples in front of them, and framing it as "all the words that reduce to the same stem will be considered matches and be returned," you can give them enough info to make a choice.
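A minimal sketch of that framing, in Python for brevity. It does not call Lucene: the stems are copied by hand from the comparison table quoted below in this thread, and the names `STEMS`, `STEMMERS`, `match_groups` and `report` are made up for illustration. In practice you would fill the table by running each analyzer over a much larger word list.

```python
from collections import defaultdict

# Stems copied from the table quoted below in this thread (not computed here).
# Each entry: original word -> (porter, kstem, minStem).
STEMS = {
    "country":     ("countri", "country",     "country"),
    "run":         ("run",     "run",         "run"),
    "runs":        ("run",     "runs",        "run"),
    "running":     ("run",     "running",     "running"),
    "read":        ("read",    "read",        "read"),
    "reading":     ("read",    "reading",     "reading"),
    "reader":      ("reader",  "reader",      "reader"),
    "association": ("associ",  "association", "association"),
    "associate":   ("associ",  "associate",   "associate"),
    "listing":     ("list",    "list",        "listing"),
    "water":       ("water",   "water",       "water"),
    "watered":     ("water",   "water",       "watered"),
    "sure":        ("sure",    "sure",        "sure"),
    "surely":      ("sure",    "surely",      "surely"),
    "fred's":      ("fred'",   "fred's",      "fred'"),
    "roses":       ("rose",    "rose",        "rose"),
}
STEMMERS = ("porter", "kstem", "minStem")


def match_groups(stems, column):
    """Group words by their stem in one column; keep groups of 2+ words.

    Words in the same group would match each other in a search.
    """
    groups = defaultdict(list)
    for word, row in stems.items():
        groups[row[column]].append(word)
    return {stem: words for stem, words in groups.items() if len(words) > 1}


def report(stems):
    """Build a plain-text summary, one section per stemmer."""
    lines = []
    for column, name in enumerate(STEMMERS):
        lines.append(name + ":")
        for stem, words in sorted(match_groups(stems, column).items()):
            lines.append("  %s <- %s" % (stem, ", ".join(words)))
    return "\n".join(lines)


if __name__ == "__main__":
    print(report(STEMS))
```

A report grouped this way shows the customer which words end up matching each other under each stemmer, which is probably easier to judge than the raw stems.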
FWIW,
Erick

On Wed, Nov 14, 2012 at 9:11 PM, Jack Krupansky <j...@basetechnology.com> wrote:

> Another word set to try: invest, investing, investment, investments,
> invests, investor, invester, investors, investers.
>
> Also, take a look at EnglishMinimalStemmer
> (EnglishMinimalStemFilterFactory) for minimal stemming.
>
> See:
> http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/en/EnglishMinimalStemFilterFactory.html
> http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/en/EnglishMinimalStemmer.html
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Scott Smith
> Sent: Wednesday, November 14, 2012 5:17 PM
> To: java-user@lucene.apache.org
> Subject: RE: Which stemmer?
>
> Unfortunately, my "use case" is a customer who wants stemming, but has
> very little knowledge of what that means except they think they want it.
>
> I agree with your last comment. So, here's my contribution:
>
> Original      porter    kstem         minStem
> -----------   -------   -----------   -----------
> country       countri   country       country
> run           run       run           run
> runs          run       runs          run
> running       run       running       running
> read          read      read          read
> reading       read      reading       reading
> reader        reader    reader        reader
> association   associ    association   association
> associate     associ    associate     associate
> listing       list      list          listing
> water         water     water         water
> watered       water     water         watered
> sure          sure      sure          sure
> surely        sure      surely        surely
> fred's        fred'     fred's        fred'
> roses         rose      rose          rose
>
> Still not sure which one to pick. Porter is more aggressive. Min stemmer
> is pretty minimal. Perhaps the kstemmer is "just right" :-)
>
> Cheers
>
> Scott
>
> -----Original Message-----
> From: Jack Krupansky [mailto:jack@basetechnology.com]
> Sent: Wednesday, November 14, 2012 4:14 PM
> To: java-user@lucene.apache.org
> Subject: Re: Which stemmer?
>
> What is your use case? If you don't have a specific use case in mind, try
> each of them with some common words that you expect will or won't be
> stemmed. If you have Solr, you can experiment interactively using the Solr
> Admin Analysis web page.
>
> It would be nice if the javadoc for each stemmer gave a handful of
> examples that illustrated how some common words are stemmed.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Scott Smith
> Sent: Wednesday, November 14, 2012 10:55 AM
> To: java-user@lucene.apache.org
> Subject: Which stemmer?
>
> Does anyone have any experience with the stemmers? I know that Porter is
> what "everyone" uses. Am I better off with KStemFilter (better
> performance) or ?? Does anyone understand the differences between the
> various stemmers and how to choose one over another?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org