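To be clear, analysis is not supported on StringField (or any non-tokenized field): the value is indexed verbatim as a single term, and the analyzer is never applied to it. The good news is that you can get the same single-term effect, with analysis, by using the keyword tokenizer (KeywordTokenizer) on a TextField.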

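That will preserve the entire input as a single token. Follow the tokenizer with a lower-casing filter to get the case-insensitive matching you want; you may also want to include filters to trim exterior white space and normalize interior white space.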
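
As a rough illustration of the effect (not Lucene code), the sketch below models in plain Java what such a chain would do to a field value: keep the whole input as one token, trim it, collapse interior white space, and lower-case it. The class name KeywordNormalizer and the method normalize are hypothetical names chosen for this example. In an actual index, a KeywordTokenizer followed by the corresponding token filters would play this role, applied by the same analyzer at both index time and query time.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; it is not part of Lucene.
// It models the output of a keyword tokenizer followed by trim,
// whitespace-normalizing, and lower-casing filters.
final class KeywordNormalizer {

    // Returns the single "token" that the whole field value would become.
    static String normalize(String value) {
        String trimmed = value.trim();                      // drop exterior white space
        String collapsed = trimmed.replaceAll("\\s+", " "); // normalize interior white space
        return collapsed.toLowerCase(Locale.ROOT);          // make matching case-insensitive
    }

    public static void main(String[] args) {
        String indexed = normalize("  NEW   York ");
        String queried = normalize("new york");

        // The whole phrase matches regardless of case or spacing.
        System.out.println(indexed.equals(queried));          // prints "true"

        // A single word does not match, because the phrase is one token.
        System.out.println(normalize("NEW").equals(indexed)); // prints "false"
    }
}
```

Because the value stays a single token, a query for "NEW" alone would not match a field containing "NEW YORK", which is the exact-phrase behavior asked for in the original question.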

-- Jack Krupansky

-----Original Message-----
From: Shahak Nagiel
Sent: Tuesday, May 21, 2013 10:06 AM
To: java-user@lucene.apache.org
Subject: Case insensitive StringField?

It appears that StringField instances are treated as literals, even though my analyzer lower-cases (on both write and read sides). So, for example, I can match with a term query (e.g. "NEW YORK"), but only if the case matches. If I use a QueryParser (or MultiFieldQueryParser), it never works because these query values are lowercased and don't match.

I've found that using a TextField instead works, presumably because it's tokenized and processed correctly by the write analyzer. However, I would prefer that queries match against the entire/exact phrase ("NEW YORK"), rather than among the tokens ("NEW" or "YORK").

What's the solution here?

Thanks in advance.
