Re: leading and trailing wildcard query

Walter Underwood Thu, 05 Nov 2009 15:26:18 -0800

Note that N-grams are limited to specific string lengths. I presume that you need to search for arbitrary strings, not just three-letter ones.


wunder


On Nov 5, 2009, at 3:23 PM, Bernadette Houghton wrote:

Hi Steve, a query such as *abc* would need the NGramFilterFactory, hence the doubleedgytext field type, and would be retrievable by a query such as contains:abc. Note that you can set the maximum and minimum size of the strings that get indexed.
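
To make the matching concrete, here is a rough Python sketch of the idea. It is a simplified model, not Solr's actual implementation: the function names `ngrams` and `contains_match` are made up for illustration, and it assumes the whole field value is a single lowercased token (as KeywordTokenizerFactory plus LowerCaseFilterFactory would give) and that the query term is looked up as an exact indexed gram.

```python
def ngrams(token: str, min_size: int, max_size: int) -> set:
    """Return every substring of `token` whose length is in [min_size, max_size].

    Hypothetical helper modelling what an n-gram filter would index.
    """
    token = token.lower()  # stands in for the lowercase filter
    return {
        token[i:i + n]
        for n in range(min_size, max_size + 1)
        for i in range(len(token) - n + 1)  # empty when n exceeds the token length
    }


def contains_match(field_value: str, query: str, min_size: int, max_size: int) -> bool:
    """True if the lowercased query is one of the grams indexed for the field."""
    return query.lower() in ngrams(field_value, min_size, max_size)


# With a minimum gram size of 3, "abc" is one of the indexed grams.
print(contains_match("xxxxabcxxxx", "abc", 3, 25))  # True

# With a minimum gram size of 5, no three-letter gram is indexed at all.
print(contains_match("xxxxabcxxxx", "abc", 5, 25))  # False
```

In this model a query term only matches if its length falls between the minimum and maximum gram sizes, so a three-letter term such as abc would presumably need minGramSize set to 3 or lower.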


bern

-----Original Message-----
From: A. Steven Anderson [mailto:a.steven.ander...@gmail.com]
Sent: Friday, 6 November 2009 10:08 AM
To: solr-user@lucene.apache.org
Subject: Re: leading and trailing wildcard query

Thanks for the solution, but could you elaborate on how it would find
something like *abc* in a field that contains xxxxabcxxxx.

Steve

On Thu, Nov 5, 2009 at 5:25 PM, Bernadette Houghton <
bernadette.hough...@deakin.edu.au> wrote:

I've just set up something similar (much thanks to Avesh!)-

<fieldType name="edgytext" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.LowerCaseFilterFactory"/>
 <filter class="solr.EdgeNGramFilterFactory" minGramSize="5"
maxGramSize="25" />
</analyzer>
<analyzer type="query">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>

<fieldType name="doubleedgytext" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.LowerCaseFilterFactory"/>
 <filter class="solr.NGramFilterFactory" minGramSize="5" maxGramSize="25"/>
</analyzer>
<analyzer type="query">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
.
.

 <field name="beginswith" type="edgytext" indexed="true" stored="false"
multiValued="true"/>
 <field name="contains" type="doubleedgytext" indexed="true"
stored="false" multiValued="true"/>
.
.
 <!-- Copy for BEGINSWITH search -->
 <copyField source="content" dest="beginswith"/>
 <copyField source="*_t" dest="beginswith"/>
 <copyField source="*_mt" dest="beginswith"/>

 <!-- Copy for CONTAINS search -->
 <copyField source="content" dest="contains"/>
 <copyField source="*_t" dest="contains"/>
 <copyField source="*_mt" dest="contains"/>

bern
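
For comparison, the difference between the beginswith and contains fields can be sketched in Python. This is only an illustration under assumptions: `edge_ngrams` and `all_ngrams` are hypothetical helpers, edge grams are assumed to be taken from the front of the token, and the field value is treated as one lowercased token.

```python
def edge_ngrams(token: str, min_size: int, max_size: int) -> set:
    """Prefixes of `token` with length in [min_size, max_size] (front edge assumed)."""
    token = token.lower()
    longest = min(max_size, len(token))
    return {token[:n] for n in range(min_size, longest + 1)}


def all_ngrams(token: str, min_size: int, max_size: int) -> set:
    """Substrings of `token` at any offset with length in [min_size, max_size]."""
    token = token.lower()
    return {
        token[i:i + n]
        for n in range(min_size, max_size + 1)
        for i in range(len(token) - n + 1)
    }


value = "xxxxabcxxxx"

# A prefix of the value is found by both kinds of gram.
print("xxxxa" in edge_ngrams(value, 5, 25))  # True
print("xxxxa" in all_ngrams(value, 5, 25))   # True

# A substring from the middle is only found among the full n-grams.
print("abcxx" in edge_ngrams(value, 5, 25))  # False
print("abcxx" in all_ngrams(value, 5, 25))   # True
```

The trade-off is index size: the full n-gram set grows much faster with field length than the prefix-only set, which is presumably why the two searches are kept in separate fields.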
