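What I think is happening here is that WordDelimiterFilterFactory is throwing away your non-alphanumeric characters. The whitespace tokenizer emits "(Work" as a single token, the delimiter filter then strips the parenthesis, and "Street" and "Work" end up in adjacent positions, which is exactly what a phrase query with qs=0 will match. You can see this in admin/analysis, which I've found *extremely* helpful when faced with this kind of question.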
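To make that concrete, here is a rough Python sketch of what I believe the chain does to your text. It is only an approximation for illustration, not Solr's actual implementation: the function names are made up, and it leaves out synonyms, stop words, case-change splitting and stemming, none of which matter for this particular example.

```python
import re

def analyze(text):
    """Very rough stand-in for the analysis chain in the quoted config."""
    tokens = []
    # WhitespaceTokenizerFactory: split on whitespace only,
    # so "(Work" and "Experience)" survive as single tokens.
    for chunk in text.split():
        # WordDelimiterFilterFactory (approximation): keep runs of letters
        # or digits and drop every other character, e.g. the parentheses.
        for part in re.findall(r"[A-Za-z]+|[0-9]+", chunk):
            # LowerCaseFilterFactory
            tokens.append(part.lower())
    return tokens

def phrase_matches(phrase, text):
    """True if the analyzed phrase occurs as consecutive tokens (slop 0)."""
    query = analyze(phrase)
    doc = analyze(text)
    return any(doc[i:i + len(query)] == query
               for i in range(len(doc) - len(query) + 1))

print(analyze("Oxford Street (Work Experience)"))
print(phrase_matches("street work", "Oxford Street (Work Experience)"))
```

Once the parenthesis is gone, nothing separates the two words, so the exact phrase matches even with no slop allowed.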
Best
Erick

On Tue, Dec 13, 2011 at 10:37 AM, Robert Brown <r...@intelcompute.com> wrote:
> I have a field which is indexed and queried as follows:
>
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>
> <filter class="solr.SynonymFilterFactory" synonyms="text-synonyms.txt"
> ignoreCase="true" expand="true"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true" />
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> catenateAll="0" splitOnCaseChange="1"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt"/>
>
> When searching for "street work" (with quotes), i'm getting matches and
> highlighting on things like...
>
> "...Oxford <em>Street</em> (<em>Work</em> Experience)..."
>
> why is this happening, and what can I do to stop it?
>
> I've set <int name="qs">0</int> in my config to try and avert this sort of
> behaviour, am I correct in thinking that this is used to ensure there are no
> words in-between the phrase words?