If I index the following text: "I live in Dublin Ireland where
Guinness is brewed"

Then search for: duvlin

Should Solr return a match?

In the admin interface under the analysis section, Solr highlights
some NGram matches?

When I enter the following query string into my browser address bar, I
get 0 results?

http://localhost:8983/solr/select/?q=duvlin&debugQuery=true

Nor do I get results for dub, dubli, ublin, dublin (du does return a result).

I also notice when I use debugQuery=true, the parsed query is a
PhraseQuery. This doesn't make sense to me, as surely the point of the
NGram is to use a Boolean OR between each Gram??

However, if I don't use an NGramFilterFactory at query time, I can get
results for: dub, ublin, du, but not duvlin.

<fieldType name="text" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.NGramFilterFactory" minGramSize="2"
maxGramSize="15"/>
      </analyzer>
</fieldType>

Can someone please clarify what the purpose of the
NGramFilter/tokenizer is, if not to allow for
misspellings/morphological variation and also, what the correct
configuration is in terms of use at index/query time.

Any help appreciated!

Aodh.

Solr 1.3, JDK 1.6

Reply via email to