Re: Case Sensitivity

Brian Goetz Mon, 21 Jan 2002 15:18:00 -0800

> Wildcard queries are case sensitive, while other queries depend on the
> analyzer used for the field searched.  The standard analyzer lowercases, so
> lowercased terms are indexed.  Thus your "SPINAL CORD" query is lowercased
> and matches the indexed terms "spinal" and "cord".  However, since prefixes
> should not be stemmed they are not run through an analyzer and are hence
> case sensitive.  Your index contains no terms starting with "SPI" or "COR",
> since all terms were lowercased when indexed.
> 
> This question is frequent enough that we should probably fix it.  Perhaps a
> method should be added Analyzer:
>   public boolean isLowercased(String fieldName);
> When this is true, the query parser could lowercase prefix and range query
> terms.  Fellow Lucene developers, what do you think of that?


Something should be done, but I'm not sure this is the best way to do
this.  Perhaps extend Analyzer to work in two modes;
"tokenization-only" and "tokenization + term normalization".



--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

Re: Case Sensitivity

Reply via email to