On Aug 22, 2005, at 11:43 AM, Rajesh Munavalli wrote:
At the query time I was thinking of two queries ORed toegether. One
with
user entered query and the other case insensitive query. For example:
The user query "Java Virtual machine" would be translated into
"Java Virtual machine" OR "java virtual machine".
Eventhough the user mistyped the case ("machine" instead of
"Machine"),
the query would retrieve documents. I am not sure about the
performance
though. Erik would be the right person to help us understand
performance
constraints in doing so.
Performance issues aside (as they aren't really a factor in this
equation), I still don't understand what good ORing queries with
various cases does in the scenario you describe. If you're going to
index lower case, and then OR in the lowercase, then there is really
no need to even have the case-sensitive parts there in the index or
in the query.
Erik
Rajesh Munavalli
-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Monday, August 22, 2005 10:20 AM
To: java-user@lucene.apache.org
Subject: Re: Case-sensitive search
On Aug 22, 2005, at 11:10 AM, Rajesh Munavalli wrote:
You could also treat the case-sensitive and case-insensitive as
Synonyms and index them at the same position. This would be
helpful in
phrase queries.
You wouldn't be able to selectively toggle between
case-sensitive and -insensitive searching this way, though,
so I'm not sure if there is merit in doing both cases in the
same position or not.
Erik
Rajesh Munavalli
-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Monday, August 22, 2005 10:04 AM
To: java-user@lucene.apache.org
Subject: Re: Case-sensitive search
On Aug 22, 2005, at 10:40 AM, [EMAIL PROTECTED] wrote:
Is there any way to index as case-sensitive and then, while
searching,
making the search case-sensitive and case-insensitive using
the same
index as needed?
Not really. Terms in the index are ordered lexicographically,
including case. It certainly would be possible to write
customized
Query subclasses to do this sort of thing at the expense of
performance.
The only techniques I'm aware of are to either build
separate indexes
or index the same information into separate fields of the same
documents using different analyzers per field.
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]