Re: case sensitivity

Michael Kimsal Thu, 26 Apr 2007 14:56:37 -0700

I was just writing a followup.

I'm using the default text field type:


   <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
     <analyzer type="index">
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <!-- in this example, we will only use synonyms at query time
       <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
               ignoreCase="true" expand="false"/>
       -->
       <filter class="solr.StopFilterFactory" ignoreCase="true"
               words="stopwords.txt"/>
       <filter class="solr.WordDelimiterFilterFactory"
               generateWordParts="1" generateNumberParts="1" catenateWords="1"
               catenateNumbers="1" catenateAll="0"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.EnglishPorterFilterFactory"
               protected="protwords.txt"/>
       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
     </analyzer>
     <analyzer type="query">
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
               ignoreCase="true" expand="true"/>
       <filter class="solr.StopFilterFactory" ignoreCase="true"
               words="stopwords.txt"/>
       <filter class="solr.WordDelimiterFilterFactory"
               generateWordParts="1" generateNumberParts="1" catenateWords="0"
               catenateNumbers="0" catenateAll="0"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.EnglishPorterFilterFactory"
               protected="protwords.txt"/>
       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
     </analyzer>
   </fieldtype>


That looks to me like it's got LowerCaseFilterFactory in the query analyzer
and the index analyzer.

I'm still digging into this, but is there anything else I should look for
that anyone can point me to?  (Thanks Erik!)




On 4/26/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:

On Apr 26, 2007, at 5:43 PM, Michael Kimsal wrote:
> I've looked through the mailing lists and can't find much of anything
> regarding case sensitivity.  It
> seems SOLR is case sensitive by default - I'm using the default
> settings
> with a very basic schema - just text fields.

It all depends on the analysis you have set up for the fields.  If
you're indexing "string"-type fields in the default example schema,
there is effectively no analysis, so searches must be exact matches,
case and all.
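
To make that distinction concrete, here is a rough sketch of two field
declarations. It assumes a schema where "string" is an unanalyzed type and
"text" is an analyzed type that lowercases, as in the example schema; the
field names are hypothetical and not taken from anyone's actual schema.

```xml
<!-- Hypothetical field names, for illustration only. -->

<!-- Declared with the unanalyzed "string" type: the value is indexed as a
     single verbatim term, so a query has to match it exactly, case included. -->
<field name="title_exact" type="string" indexed="true" stored="true"/>

<!-- Declared with the analyzed "text" type: index and query terms both pass
     through the analyzer chain, which includes lowercasing. -->
<field name="title" type="text" indexed="true" stored="true"/>
```

So the thing to check is which type each field is actually declared with,
not just how the "text" type itself is defined.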

> Is there any way to tell the query parser to be case insensitive
> during a
> query?  Or do I have to reindex
> all my data again with lowercase values?

Terms are indexed in a case-sensitive manner, so if you need case
insensitivity you need to lowercase on the way in and on querying.
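
As a minimal sketch of "lowercase on the way in and on querying", a field
type along these lines might work. The type name "string_lc" is made up for
illustration; the tokenizer and filter classes are standard Solr factories.
A single analyzer element with no type attribute is used for both indexing
and querying.

```xml
<!-- "string_lc" is a hypothetical name, not part of the example schema. -->
<fieldtype name="string_lc" class="solr.TextField">
  <!-- No type attribute: the same chain runs at index time and query time. -->
  <analyzer>
    <!-- Keep the whole value as one token, like a string field would. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase that token so matching ignores case. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>
```

Documents indexed before a change like this keep their original terms, so
they would need to be reindexed for the lowercasing to take effect.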

        Erik



--
Michael Kimsal
http://webdevradio.com
