Well, I fixed my own problem in the end. For the record, this is the schema I ended up going with:

  <fieldType name="text_bigram" class="solr.TextField" omitNorms="true">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory" />
      <filter class="solr.LowerCaseFilterFactory" />
      <filter class="solr.NGramFilterFactory" maxGramSize="2" minGramSize="2" />
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory" />
      <filter class="solr.LowerCaseFilterFactory" />
      <filter class="solr.NGramFilterFactory" maxGramSize="2" minGramSize="2"/>
    </analyzer>
  </fieldType>

I could have left it a trigram but went with a bigram because with this setup, I can get queries to properly hit as long as the min/max gram size is met. In other words, for any queries two or more characters long, this works for me. Less than two characters and it fails.

I don't know exactly why that is, but I'll take it anyway!

- Charlie

-----Original Message-----
From: Charlie Jackson [mailto:charlie.jack...@cision.com]
Sent: Friday, October 23, 2009 10:00 AM
To: solr-user@lucene.apache.org
Subject: NGram query failing

I have a requirement to be able to find hits within words in a free-form id field. The field can have any type of alphanumeric data - it's as likely it will be something like "123456" as it is to be "SUN-123-ABC". I thought of using NGrams to accomplish the task, but I'm having a problem.

I set up a field like this:

  <fieldType name="text_trigram" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.NGramTokenizerFactory" minGramSize="1" maxGramSize="3"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

After indexing a field like this, the analysis page indicates my queries should work. If I give it a sample field value of "ABC-123456-SUN" and a query value of "45" it shows hits in several places, which is what I expected. However, when I actually query the field with something like "45" I get no hits back.

Looking at the debugQuery output, it looks like it's taking my analyzed query text and putting it into a phrase query. So, for a query of "45" it turns into a phrase query of <field>:"4 5 45" which then doesn't hit on anything in my index.

What am I missing to make this work?

- Charlie
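
For anyone reading this later, here is a rough Python sketch of what may be going on in both setups. It is not Solr code. The helper names `char_ngrams` and `phrase_matches` are made up for illustration, and the sketch assumes that grams are emitted shortest size first and that each gram takes its own consecutive position, which is consistent with the "4 5 45" phrase in the debugQuery output but is not confirmed anywhere in this thread.

```python
def char_ngrams(text, min_gram, max_gram):
    """Return character n-grams of `text`, shortest sizes first (assumed order)."""
    grams = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


def phrase_matches(indexed, query):
    """True if the query tokens occur at consecutive positions in `indexed`.

    An empty query never matches, mirroring a query that analyzes to nothing.
    """
    if not query:
        return False
    n = len(query)
    return any(indexed[i:i + n] == query for i in range(len(indexed) - n + 1))


# Lowercased sample field value from the original message.
value = "abc-123456-sun"

# Original setup: 1- to 3-grams at index and query time.
print(char_ngrams("45", 1, 3))
print(phrase_matches(char_ngrams(value, 1, 3), char_ngrams("45", 1, 3)))

# Bigram-only setup: a two-character query yields a single gram,
# and a longer query yields grams that sit next to each other in the index.
print(char_ngrams("45", 2, 2))
print(phrase_matches(char_ngrams(value, 2, 2), char_ngrams("456", 2, 2)))

# A one-character query has no 2-grams at all, so there is nothing to search for.
print(char_ngrams("4", 2, 2))
```

Under those assumptions, the three tokens "4", "5", "45" never sit side by side in the indexed stream, so the phrase cannot match, whereas with a fixed gram size the query grams line up with the indexed ones. A query shorter than the minimum gram size produces no grams, which would explain why one-character queries fail with the bigram field.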