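The word delimiter filter also does other things: it treats ' as
punctuation by default. So it normally splits on ', except when the
' is part of a trailing 's (in that case it removes the 's completely
if you enable stemEnglishPossessive). That is why contractions such
as isn't are affected: they get split at the apostrophe.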

There are a couple of approaches you can use:
1. You can keep WordDelimiterFilter with this option on, but disable
splitting on ' by customizing its type table. In this case specify
types=mycustomtypes.txt, and in that file specify ' to be treated as
ALPHANUM or similar. See
https://issues.apache.org/jira/browse/SOLR-2059 for some examples of
this. I would only do this if you want WordDelimiterFilter for other
purposes. If you just want to remove possessives and don't need
WordDelimiterFilter's other features, use option 2 instead.
2. You can instead use EnglishPossessiveFilterFactory, which does only
this one thing (remove 's) and nothing else.
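
As a rough sketch, the two options might look something like this in
schema.xml. The field type names and the tokenizer choices here are
just placeholders for illustration, and the exact line in
mycustomtypes.txt should be checked against the examples in SOLR-2059
before relying on it.

```xml
<!-- Option 1: keep WordDelimiterFilter, but stop it splitting on '.
     mycustomtypes.txt would contain a mapping along the lines of:
         ' => ALPHANUM
     (assumed syntax, following the examples in SOLR-2059) -->
<fieldType name="text_wdf_possessive" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            stemEnglishPossessive="1"
            types="mycustomtypes.txt"/>
  </analyzer>
</fieldType>

<!-- Option 2: only strip trailing 's, nothing else -->
<fieldType name="text_possessive_only" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
  </analyzer>
</fieldType>
```

With either one, it is worth running a few possessives and
contractions (dog's, isn't, we'll) through the analysis page in the
Solr admin UI to confirm the tokens come out the way you expect.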

On Wed, Oct 19, 2011 at 5:30 PM, Herman Kiefus <herm...@angieslist.com> wrote:
> We utilize a comprehensive dictionary of English words, place names, 
> surnames, male and female first names, ... you get the point.  As such, the 
> possessive plural forms of these words are recognized as 'misspelled'.
>
> I simply thought that 'turning on' this option for the WordDelimiterFactory 
> would address my concerns; however, I also got an unintended consequence: 
> Contractions (isn't, wouldn't, shouldn't, he'll, we'll...) also seem to be 
> affected.  Is this intended behavior?  When I read 'English possessive' I 
> hear 'apostrophe s' and not 'apostrophe anything'.  Is there something I'm 
> missing here?
>



-- 
lucidimagination.com