Mixing Case and Case-Insensitive Searching

Walt Stoneburner Tue, 17 Apr 2007 09:23:28 -0700

I've run into a case where we want to search for the acronym 'LET';
however, the ordinary three-letter word 'let' occurs very frequently
in quite a number of documents.


What I'm looking to do is a query that's case insensitive _except_ for
that specific term.

And it appears that, to do so, things get very ugly very quickly.

According to 
http://www.gossamer-threads.com/lists/lucene/java-user/28131?page=last
(I'm doing my homework first), it appears that one must keep a
case-sensitive and case-insensitive version in the index, if not two
separate indexes.

While this makes sense up to a point, since it lets me search one field
or the other, things seem to get far more complicated once I care about
distances between words.

Suppose I store both versions and know that the tokens are the same and
that their positions are identical. Is there a way to do things like:

"LET organization"  (where LET is case sensitive, but part of the phrase)

"company LET"~10  (again, where LET is case sensitive, near the term
company which is case insensitive)
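
To make the question concrete, here is a toy sketch in plain Python (not
Lucene code) of the behavior being asked about. It assumes whitespace
tokenization and keeps exact-case and lowercased tokens at identical
positions. The names build_index, positions, phrase, and within are made
up for illustration, and the distance check is a simple position gap, not
Lucene's definition of slop.

```python
from collections import defaultdict


def build_index(text):
    """Index one document twice: exact-case and lowercased, same positions."""
    exact = defaultdict(list)
    folded = defaultdict(list)
    for pos, token in enumerate(text.split()):  # assumption: whitespace tokens
        exact[token].append(pos)
        folded[token.lower()].append(pos)
    return exact, folded


def positions(index, term, case_sensitive):
    """Look the term up in whichever version of the index applies."""
    exact, folded = index
    if case_sensitive:
        return exact.get(term, [])
    return folded.get(term.lower(), [])


def phrase(index, a, b):
    """True if term b occurs immediately after term a.

    a and b are (term, case_sensitive) pairs.
    """
    after = set(positions(index, *b))
    return any(p + 1 in after for p in positions(index, *a))


def within(index, a, b, max_gap):
    """True if a and b occur at most max_gap positions apart, in any order."""
    return any(
        i != j and abs(i - j) <= max_gap
        for i in positions(index, *a)
        for j in positions(index, *b)
    )


doc = build_index("Please let the LET organization know the company hired staff")

# "LET organization" with LET case sensitive
print(phrase(doc, ("LET", True), ("organization", False)))

# "company LET"~10 with LET case sensitive, company case insensitive
print(within(doc, ("company", False), ("LET", True), 10))
```

Because both versions share token positions, each term can be resolved
against its own version and the position lists compared directly. Whether
Lucene can do the equivalent across two fields is exactly the open
question.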

Would love to get some thoughts on how to go about approaching this.

Thanks,
-Walt Stoneburner
