Note that N-grams are limited to specific string lengths. I presume that you need to search for arbitrary strings, not just three-letter ones.

wunder

On Nov 5, 2009, at 3:23 PM, Bernadette Houghton wrote:

Hi Steve, a query such as *abc* would need the NGramFilterFactory, hence the doubleedgytext field type, and would be retrievable by a query such as contains:abc. Note that you can set the maximum and minimum size of the strings that get indexed.

bern
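
[Editor's sketch, not part of the original message.] To illustrate how an n-gram index could answer *abc* against xxxxabcxxxx, and why the gram size limits matter, here is a rough Python imitation of the idea. The function names `ngrams` and `contains_match` are hypothetical, and this is not Solr's implementation. It assumes the index side lowercases the whole value and stores every substring whose length lies between the minimum and maximum gram size, while the query side lowercases the term and looks for an exact gram.

```python
def ngrams(text, min_size, max_size):
    """Return every substring of text whose length is in [min_size, max_size].

    Rough stand-in for keyword tokenizing + lowercasing + n-gram filtering.
    """
    text = text.lower()
    grams = set()
    for size in range(min_size, max_size + 1):
        # If size exceeds len(text), this range is empty and nothing is added.
        for start in range(len(text) - size + 1):
            grams.add(text[start:start + size])
    return grams


def contains_match(query, indexed_grams):
    """Query side: lowercase the whole query and look for an exact gram."""
    return query.lower() in indexed_grams


if __name__ == "__main__":
    value = "xxxxabcxxxx"
    # With a minimum gram size of 3, the three-letter gram "abc" is indexed.
    print(contains_match("abc", ngrams(value, 3, 25)))
    # With a minimum gram size of 5, no three-letter gram exists in the index.
    print(contains_match("abc", ngrams(value, 5, 25)))
    # A five-letter query still matches under the larger minimum.
    print(contains_match("xabcx", ngrams(value, 5, 25)))
```

Under these assumptions, a query only matches when its length falls inside the configured gram range, so a three-letter query such as abc would presumably need a minimum gram size of 3 or less. Lower minimums produce more grams per value, so the index grows accordingly.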

-----Original Message-----
From: A. Steven Anderson [mailto:a.steven.ander...@gmail.com]
Sent: Friday, 6 November 2009 10:08 AM
To: solr-user@lucene.apache.org
Subject: Re: leading and trailing wildcard query

Thanks for the solution, but could you elaborate on how it would find
something like *abc* in a field that contains xxxxabcxxxx?

Steve

On Thu, Nov 5, 2009 at 5:25 PM, Bernadette Houghton <
bernadette.hough...@deakin.edu.au> wrote:

I've just set up something similar (many thanks to Avesh!):

<fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="5" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<fieldType name="doubleedgytext" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="5" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
.
.
<field name="beginswith" type="edgytext" indexed="true" stored="false" multiValued="true"/>
<field name="contains" type="doubleedgytext" indexed="true" stored="false" multiValued="true"/>
.
.
 <!-- Copy for BEGINSWITH search -->
 <copyField source="content" dest="beginswith"/>
 <copyField source="*_t" dest="beginswith"/>
 <copyField source="*_mt" dest="beginswith"/>

 <!-- Copy for CONTAINS search -->
 <copyField source="content" dest="contains"/>
 <copyField source="*_t" dest="contains"/>
 <copyField source="*_mt" dest="contains"/>

bern
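
[Editor's sketch, not part of the original message.] The two field types above differ only in the index-time filter: edge n-grams for beginswith, full n-grams for contains. The Python below is a rough illustration of that difference, not Solr code. The helper names `edge_ngrams` and `all_ngrams` are hypothetical, and it assumes edge grams are taken from the front of the value and that the gram sizes are the 5 and 25 used in the config.

```python
def edge_ngrams(text, min_size, max_size):
    """Prefixes of text with length in [min_size, max_size], lowercased."""
    text = text.lower()
    longest = min(max_size, len(text))
    return {text[:size] for size in range(min_size, longest + 1)}


def all_ngrams(text, min_size, max_size):
    """Every substring of text with length in [min_size, max_size], lowercased."""
    text = text.lower()
    return {
        text[start:start + size]
        for size in range(min_size, max_size + 1)
        for start in range(len(text) - size + 1)
    }


if __name__ == "__main__":
    value = "Wildcard Query"
    beginswith = edge_ngrams(value, 5, 25)  # imitates the edgytext field
    contains = all_ngrams(value, 5, 25)     # imitates the doubleedgytext field

    # "wildcard" is a prefix of the value, so both fields would match it.
    print("wildcard" in beginswith, "wildcard" in contains)
    # "query" appears only in the middle or end, so only contains has it.
    print("query" in beginswith, "query" in contains)
```

The trade-off in this sketch is size: the edge variant stores at most one gram per length, while the full variant stores one gram per length and start position.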

