Note that N-grams are limited to specific string lengths. I presume
that you need to search for arbitrary strings, not just three-letter
ones.
wunder
On Nov 5, 2009, at 3:23 PM, Bernadette Houghton wrote:
Hi Steve, a query such as *abc* would need the NGramFilterFactory,
hence the doubleedgytext field type, and would be retrievable by a query
such as contains:abc. Note that you can set the minimum and maximum size
of the strings that get indexed (minGramSize and maxGramSize).
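
To make the mechanics concrete, here is a rough Python sketch of what the two filters index for a value like xxxxabcxxxx. It is only a model, not Solr code: the function names ngrams and edge_ngrams are made up for illustration, and it assumes the whole field value arrives as one lowercased token (as with KeywordTokenizerFactory plus LowerCaseFilterFactory). Since the query analyzer does no n-gramming, a query term matches only if it is exactly equal to one of the indexed grams.

```python
def ngrams(text, min_size, max_size):
    """Model of NGramFilterFactory: every substring of the token
    whose length lies in [min_size, max_size]."""
    text = text.lower()
    return {
        text[i:i + n]
        for n in range(min_size, max_size + 1)
        for i in range(len(text) - n + 1)
    }


def edge_ngrams(text, min_size, max_size):
    """Model of EdgeNGramFilterFactory: only the prefixes of the token
    whose length lies in [min_size, max_size]."""
    text = text.lower()
    return {text[:n] for n in range(min_size, min(max_size, len(text)) + 1)}


value = "xxxxabcxxxx"

# With minGramSize="3" the gram "abc" is indexed, so contains:abc matches.
print("abc" in ngrams(value, 3, 25))         # True

# With minGramSize="5" no 3-letter gram exists, so contains:abc finds nothing,
# while a 5-letter term such as "xabcx" still matches.
print("abc" in ngrams(value, 5, 25))         # False
print("xabcx" in ngrams(value, 5, 25))       # True

# The edge variant only indexes prefixes, hence "begins with" searches.
print("xxxxa" in edge_ngrams(value, 5, 25))  # True
print("xabcx" in edge_ngrams(value, 5, 25))  # False
```

So for a three-letter term like abc to match, minGramSize on the contains field would need to be 3 or lower, at the cost of a larger index.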
bern
-----Original Message-----
From: A. Steven Anderson [mailto:a.steven.ander...@gmail.com]
Sent: Friday, 6 November 2009 10:08 AM
To: solr-user@lucene.apache.org
Subject: Re: leading and trailing wildcard query
Thanks for the solution, but could you elaborate on how it would find
something like *abc* in a field that contains xxxxabcxxxx.
Steve
On Thu, Nov 5, 2009 at 5:25 PM, Bernadette Houghton <
bernadette.hough...@deakin.edu.au> wrote:
I've just set up something similar (many thanks to Avesh!):
<fieldType name="edgytext" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="5"
maxGramSize="25" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="doubleedgytext" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.NGramFilterFactory" minGramSize="5"
maxGramSize="25" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
.
.
<field name="beginswith" type="edgytext" indexed="true"
stored="false" multiValued="true"/>
<field name="contains" type="doubleedgytext" indexed="true"
stored="false" multiValued="true"/>
.
.
<!-- Copy for BEGINSWITH search -->
<copyField source="content" dest="beginswith"/>
<copyField source="*_t" dest="beginswith"/>
<copyField source="*_mt" dest="beginswith"/>
<!-- Copy for CONTAINS search -->
<copyField source="content" dest="contains"/>
<copyField source="*_t" dest="contains"/>
<copyField source="*_mt" dest="contains"/>
bern