I have an analyzer set up in my schema like so:

  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="2"/>
  </analyzer>

The problem is that if I index a term like "toys and dolls" and then search
for "to", I get no matches. The debug output from Solr gives me:

<str name="rawquerystring">to</str>
<str name="querystring">to</str>
<str name="parsedquery">PhraseQuery(autocomplete:"t o to")</str>
<str name="parsedquery_toString">autocomplete:"t o to"</str>

So it looks like the Lucene query parser is turning the query into a
PhraseQuery for some reason. The explain output seems to confirm that this
PhraseQuery is what's causing my document not to match:

0.0 = (NON-MATCH) weight(autocomplete:"t o to" in 82), product of:
  1.0 = queryWeight(autocomplete:"t o to"), product of:
    6.684934 = idf(autocomplete: t=60 o=68 to=14)
    0.1495901 = queryNorm
  0.0 = fieldWeight(autocomplete:"t o to" in 82), product of:
    0.0 = tf(phraseFreq=0.0)
    6.684934 = idf(autocomplete: t=60 o=68 to=14)
    0.1875 = fieldNorm(field=autocomplete, doc=82)

But why? It seems to me that this should match, and indeed the Solr analysis
tool highlights the matches (see the screenshot linked below), so something
isn't lining up.

http://lucene.472066.n3.nabble.com/file/n3116288/Screen_shot_2011-06-27_at_7.55.49_PM.png
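
To spell out why I expected a hit, here is a rough Python sketch of what I
think the analysis chain produces. It is not Solr code: the function name
`analyze` is made up for illustration, and I'm assuming the grams come out
grouped by size, shortest first.

```python
def analyze(text, min_gram=1, max_gram=2):
    """Rough imitation of KeywordTokenizer -> LowerCaseFilter -> NGramFilter.

    Assumption (not taken from Solr source): grams are emitted grouped by
    size, shortest first.
    """
    # The keyword tokenizer keeps the whole input as a single token,
    # and the lowercase filter lowercases it.
    token = text.lower()
    grams = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


index_tokens = analyze("toys and dolls")
query_tokens = analyze("to")

print(query_tokens)  # ['t', 'o', 'to']
print(all(tok in index_tokens for tok in query_tokens))  # True
```

Every token the query produces is also in the indexed token set, which is why
I expected a match. Note that this sketch ignores token positions, which a
PhraseQuery does take into account.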
 

In case you're wondering, I'm trying to implement a semi-advanced
autocomplete feature that goes beyond what a simple EdgeNGram analyzer
can do.


