Does Solr support Soundex? (Soundex was originally developed to assist with alternate spellings of names.)
Keith

On Mon, Jun 13, 2011 at 8:08 PM, Jonathan Rochkind <rochk...@jhu.edu> wrote:
> In a Solr-based search, stemming is done at indexing time, into fields with
> stemmed tokens.
>
> It seems typical in library-catalog type applications based on Solr to have
> the default (or even only) searches be over these stemmed fields, thus
> 'auto-stemming' to the user. (Search for 'monkey', find 'monkeys' too, and
> vice versa).
>
> I am curious how many people, who have Solr-based catalogs (that is, I'm
> interested in people who have search engines with majority or only content
> originally from MARC), use such stemmed fields ('auto-stemming') over their
> _author_ fields as well.
>
> There are pros and cons to this. There are certainly some things in an
> author field that would benefit from stemming (mostly various kinds of
> corporate authors, some of whose endings end up looking like English-language
> phrases). There are also very many things in an author field that would not
> benefit from stemming, and thus when stemming is done it sometimes(/often?)
> results in false matches, "pluralizing" an author's last name in an
> inappropriate way for instance.
>
> So, wanna say on the list, if you are using a Solr-based catalog, are you
> using stemmed fields for your author searches? Curious what people end up
> doing. If there are any other more complicated clever things you've done
> than just stem-or-not, let us know that too!
>
> Jonathan