Here's what we use for this in our Solr schema:

    <fieldType name="caseInsensitiveString" class="solr.TextField"
               indexed="true" stored="true" omitNorms="true"
               sortMissingLast="true" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

    <field name="someField" type="caseInsensitiveString"
           omitTermFreqAndPositions="true"/>

As far as I know, StringField is indexed verbatim as a single token and does 
not go through an analyzer at all, so any analyzer you configure is simply 
ignored for that field.

KeywordTokenizerFactory emits the whole field value as a single token, which 
gives you the "exact phrase" behaviour, and LowerCaseFilterFactory lowercases 
that token.
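
To make the effect concrete, here is a rough plain-Java sketch of what that chain does to a value, next to a tokenizing chain for comparison. It deliberately uses no Lucene or Solr classes: the class and method names (KeywordLowercaseSketch, keywordLowercase, whitespaceLowercase) are made up for illustration, and String.toLowerCase(Locale.ROOT) is only an approximation of what the real lowercase filter does. In plain Lucene the corresponding pieces should be a TextField analyzed with KeywordTokenizer followed by LowerCaseFilter, but check the constructors against the version you are running.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: mimics the token output of the two analysis chains
// discussed above. None of this is Lucene or Solr API.
public class KeywordLowercaseSketch {

    // Keyword tokenizer + lowercase filter: the entire value becomes
    // exactly one token, and that token is lowercased.
    public static List<String> keywordLowercase(String value) {
        List<String> tokens = new ArrayList<>();
        tokens.add(value.toLowerCase(Locale.ROOT));
        return tokens;
    }

    // Whitespace tokenizer + lowercase filter: the value is split into
    // one token per word, and each token is lowercased.
    public static List<String> whitespaceLowercase(String value) {
        List<String> tokens = new ArrayList<>();
        for (String part : value.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // One token for the whole value: a query for "new york" (in any
        // case) matches, while a query for just "york" does not.
        System.out.println(keywordLowercase("NEW YORK"));    // [new york]

        // Two tokens: "new" or "york" alone would match.
        System.out.println(whitespaceLowercase("NEW YORK")); // [new, york]
    }
}
```

The point is that the same chain has to run on both the index side and the query side, so that "NEW YORK", "New York" and "new york" all end up as the single term "new york".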

-Michael

-----Original Message-----
From: Shahak Nagiel [mailto:snag...@yahoo.com] 
Sent: Tuesday, May 21, 2013 10:06 AM
To: java-user@lucene.apache.org
Subject: Case insensitive StringField?

It appears that StringField instances are treated as literals, even though my 
analyzer lower-cases (on both write and read sides).  So, for example, I can 
match with a term query (e.g. "NEW YORK"), but only if the case matches.  If I 
use a QueryParser (or MultiFieldQueryParser), it never works because these 
query values are lowercased and don't match.

I've found that using a TextField instead works, presumably because it's 
tokenized and processed correctly by the write analyzer.  However, I would 
prefer that queries match against the entire/exact phrase ("NEW YORK"), rather 
than among the tokens ("NEW" or "YORK").

What's the solution here?

Thanks in advance.
