Which kind of highlighter are you using? Either way, this behaviour comes from your analysis chain. It is a heavy analysis chain, and I can see "solr.HunspellStemFilterFactory" in it.
If you are using term vectors for your field, to be used by your highlighter, then for each document the term vector holds a mini inverted index produced by your index-time analysis chain (i.e. the token "office" will be there with the offset and position of the original word in the text).
If you are not storing term vectors and you are using a highlighter that doesn't need them, each document field will be analysed at runtime (with the index-time analysis chain).
So your guess is correct: with such a heavily analysed field, highlighting will work in that way.

Cheers

2015-07-14 14:58 GMT+01:00 Mike Thomsen <mikerthom...@gmail.com>:

> For the query "police office" our users are getting back highlighted
> results for "police office*r*" (and "police office*rs*") I get why a search
> for police officers would include just "office" since the stemmer would
> cause that behavior. However I don't understand why "office" is matching
> "officer" here when no fuzzy matching is being done. Is that also a result
> of our stemmer?
>
> Here's the text field we're using:
>
> <fieldType name="text_en_splitting" class="solr.TextField"
>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.ManagedStopFilterFactory" managed="english"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.HunspellStemFilterFactory"
>             dictionary="en_US.dic"
>             affix="en_US.aff"
>             ignoreCase="false"
>             longestOnly="false" />
>     <filter class="solr.PhoneticFilterFactory" encoder="RefinedSoundex"
>             inject="true"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.ManagedSynonymFilterFactory" managed="english" />
>     <filter class="solr.ManagedStopFilterFactory" managed="english"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.HunspellStemFilterFactory"
>             dictionary="en_US.dic"
>             affix="en_US.aff"
>             ignoreCase="false"
>             longestOnly="false" />
>     <filter class="solr.PhoneticFilterFactory" encoder="RefinedSoundex"
>             inject="true"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> Thanks,
>
> Mike
>

--
--------------------------

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience - 1794 England
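
P.S. In case it helps to see the term vector case concretely, here is a rough sketch of a field declaration that stores term vectors with positions and offsets, so that a term-vector-based highlighter can read them instead of re-analysing the text at runtime. The field name "body" is hypothetical (your schema will have its own), and the other attribute values are only assumptions about your setup; the part that matters is the three termVectors / termPositions / termOffsets attributes.

```xml
<!-- Sketch only: "body" is a made-up field name, not taken from your schema. -->
<!-- termVectors/termPositions/termOffsets store the per-document mini index -->
<!-- (analysed tokens plus their original offsets) that the highlighter reads. -->
<field name="body" type="text_en_splitting" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

With or without this, the highlighted span follows the tokens your index-time chain produces, so the stored offsets will still cover the whole original word that was reduced to the matching token. A full reindex would be needed for the term vectors to be written.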