If your use-case is limited to this, why don't you encapsulate all queries in double quotes?
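
A minimal sketch of what I mean, on the client side. As far as I know, the standard query parser splits an unquoted query on whitespace before the field's analyzer runs, so the ShingleFilter only ever sees one word at a time; quoting the input hands the whole string to the query analyzer in one piece. The host, the field name `keyword`, and both helper functions below are assumptions for illustration, not something from your setup, and I have not verified that the quoted form matches against your index.

```python
from urllib.parse import urlencode


def quote_as_phrase(user_query: str) -> str:
    """Wrap raw user input in double quotes so it is parsed as one phrase.

    Backslashes and embedded double quotes are escaped so they cannot
    terminate the phrase early.
    """
    escaped = user_query.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_select_url(base_url: str, field: str, user_query: str) -> str:
    """Build a /select URL that queries `field` with the quoted phrase.

    `base_url` and `field` are hypothetical; substitute your own values.
    """
    params = {"q": f"{field}:{quote_as_phrase(user_query)}"}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local Solr instance and field name.
    print(build_select_url("http://localhost:8983/solr", "keyword", "apple pie"))
```

Adding `debugQuery=on` to the request should show how the quoted string is actually parsed, which you can compare against what the analysis tool reports.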

On Wednesday 01 September 2010 14:21:47 Jeff Rose wrote:
> Hi,
> We are using SOLR to match query strings with a keyword database, where
> some of the keywords are actually more than one word. For example a
> keyword might be "apple pie" and we only want it to match for a query
> containing that word pair, but not one only containing "apple". Here is
> the relevant piece of the schema.xml, defining the index and query
> pipelines:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.PatternTokenizerFactory" pattern=";"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.TrimFilterFactory" />
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.ShingleFilterFactory" />
>   </analyzer>
> </fieldType>
>
> In the analysis tool this schema looks like it works correctly. Our
> multi-word keywords are indexed as a single entry, and then when a search
> phrase contains one of these multi-word keywords it is shingled and
> matched. Unfortunately, when we do the same queries on top of the actual
> index it responds with zero matches. I can see in the index histogram
> that the terms are correctly indexed from our mysql datasource containing
> the keywords, but somehow the shingling doesn't appear to work on this
> live data. Does anyone have experience with shingling that might have
> some tips for us, or otherwise advice for debugging the issue?
>
> Thanks,
> Jeff
>

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350