Re: Case Sensitivity

2002-01-21 Thread Brian Goetz
> I have noticed that I cannot search using capital letters for some reason. > If I try to do a search on "SPINAL CORD" and if I use a query like SPI* AND > COR*, I get no results back. If I use lowercase (spi* AND cor*) however, I > get the results back. I am using a standard analyzer. Does anyo

RE: Case Sensitivity

2002-01-21 Thread Doug Cutting
Wildcard queries are case sensitive, while other queries depend on the analyzer used for the field searched. The standard analyzer lowercases, so lowercased terms are indexed. Thus your "SPINAL CORD" query is lowercased and matches the indexed terms "spinal" and "cord". However, since prefixes
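
The mismatch described above can be illustrated without Lucene. The following is a minimal sketch in plain Java, not Lucene's implementation: the class `PrefixCaseDemo`, the method `matchPrefix`, and the in-memory term list are all hypothetical stand-ins. It assumes the index holds the lowercased terms "spinal" and "cord" and that the prefix is compared to them verbatim, without being run through an analyzer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class PrefixCaseDemo {
    // Hypothetical stand-in for the index: terms as a lowercasing analyzer would store them.
    static final List<String> INDEXED_TERMS = Arrays.asList("spinal", "cord");

    // Compare the prefix to the stored terms verbatim (no analysis of the prefix).
    static List<String> matchPrefix(String prefix) {
        return INDEXED_TERMS.stream()
                .filter(term -> term.startsWith(prefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Uppercase prefix: nothing in the lowercased term list starts with "SPI".
        System.out.println(matchPrefix("SPI"));                           // prints []
        // Lowercasing the prefix first makes it comparable to the stored terms.
        System.out.println(matchPrefix("SPI".toLowerCase(Locale.ROOT)));  // prints [spinal]
    }
}
```

Under these assumptions, `SPI*` finds nothing while `spi*` finds "spinal", which matches the behaviour reported in the original question.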

Re: Case Sensitivity

2002-01-21 Thread Brian Goetz
> Wildcard queries are case sensitive, while other queries depend on the > analyzer used for the field searched. The standard analyzer lowercases, so > lowercased terms are indexed. Thus your "SPINAL CORD" query is lowercased > and matches the indexed terms "spinal" and "cord". However, since p

RE: Case Sensitivity

2002-04-03 Thread Aruna Raghavan
Hi, I worked around the problem by converting everything to lowercase in my code prior to indexing into Lucene and also prior to searching for a string. Of course, I also had to use pattern matching to change boolean operators such as ANDs and ORs to uppercase again because Lucene expects those to be
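
A workaround of this kind might look roughly like the sketch below. It is not the poster's code: the class `QueryLowercaser` and the method `lowercaseTerms` are hypothetical, and instead of lowercasing everything and then restoring the operators with pattern matching, it lowercases each whitespace-separated token except the operators. The message mentions only AND and OR; including NOT is an assumption. Quoted phrases, field prefixes, and parentheses are not handled specially.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class QueryLowercaser {
    // Operators left in uppercase; AND and OR come from the message, NOT is an assumption.
    static final Set<String> OPERATORS = new HashSet<>(Arrays.asList("AND", "OR", "NOT"));

    // Lowercase every whitespace-separated token except the boolean operators.
    static String lowercaseTerms(String query) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(OPERATORS.contains(token) ? token : token.toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowercaseTerms("SPI* AND COR*")); // prints spi* AND cor*
    }
}
```

The result would then be handed to the query parser in place of the user's raw input. Skipping the operator tokens avoids a second pass to restore them, at the cost of treating a user-typed uppercase "AND" as an operator even when it was meant as a word.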

RE: Case Sensitivity

2002-04-03 Thread Joshua O'Madadhain
Alan, Aruna: The built-in solution is to use LowerCaseFilter in your Analyzer. (The SimpleAnalyzer, StopAnalyzer, and StandardAnalyzer classes already do this; see the Lucene API docs to see which filters each uses.) The FAQ includes an example implementation of an Analyzer if you want to build

RE: Case Sensitivity

2002-04-03 Thread Aruna Raghavan
dhain [mailto:[EMAIL PROTECTED]] Sent: Wednesday, April 03, 2002 1:40 PM To: Lucene Users List Subject: RE: Case Sensitivity Alan, Aruna: The built-in solution is to use LowerCaseFilter in your Analyzer. (The SimpleAnalyzer, StopAnalyzer, and StandardAnalyzer classes already do this; see the Lucene API

Re: Case Sensitivity

2002-04-03 Thread Peter Carlson
You can use the standard analyzer. This lowercases all the words (it uses the LowerCaseFilter). Note this also uses the stop word filter so your results may vary. Also when you index, be sure to use text instead of keyword as the field type since the keyword doesn't go through the filter. --Pe

Re: case sensitivity confusion

2003-02-04 Thread Otis Gospodnetic
Searches are case insensitive. Showing your indexing and your searching code would help. Prefix queries are case sensitive. Otis --- Stephen Eaton <[EMAIL PROTECTED]> wrote: > G'day One and all, > > I'm have some problems getting lucene to do case-insensitive > searches. I > have looked throu

RE: case sensitivity confusion

2003-02-04 Thread Sale, Doug
are you running the query through the standard analyzer? -doug > -Original Message- > From: Stephen Eaton [mailto:[EMAIL PROTECTED]] > Sent: Tuesday, February 04, 2003 8:14 AM > To: [EMAIL PROTECTED] > Subject: case sensitivity confusion > > > G'day One and all, > > I'm have some probl

Re: Case Sensitivity - and more

2002-01-22 Thread Michal Plechawski
plications. Regards, Michal - Original Message - From: "Brian Goetz" <[EMAIL PROTECTED]> To: "Lucene Users List" <[EMAIL PROTECTED]> Sent: Tuesday, January 22, 2002 12:12 AM Subject: Re: Case Sensitivity > > Wildcard queries are case sensitive, w

RE: Case Sensitivity - and more

2002-01-24 Thread Doug Cutting
> From: Michal Plechawski > > I think that Brian's idea is more flexible and extendable. In my > application, I need three or more kinds of analyzers: for > counting tfidf > statistics, for indexing (compute more, e.g. summaries) and > for document > classification (compute document-to-class ass

Re: Case Sensitivity - and more

2002-01-25 Thread Michal Plechawski
> Currently it is easy to use different analyzers for different purposes, no? > I'm not sure how Brian's proposal (bi-modal analyzers: tokenize only & > tokenize+normalize) addresses your needs. Ok, maybe I misled a point a bit. But Brian's proposal as I see it was to _group_ two tokenizers that

Re: Case Sensitivity - and more

2002-01-25 Thread Brian Goetz
> Ok, maybe I misled a point a bit. But Brian's proposal as I see it was to > _group_ two tokenizers that differ in a single thing. I don't think that's what I was proposing... I was recognizing that sometimes the analysis process is a composite one, and I was advocating that the composition be

Re: Case Sensitivity - and more

2002-01-25 Thread Michal Plechawski
different token streams in 1.2, that was the real problem in 1.0. Michal - Original Message - From: "Brian Goetz" <[EMAIL PROTECTED]> To: "Lucene Users List" <[EMAIL PROTECTED]> Sent: Friday, January 25, 2002 11:24 AM Subject: Re: Case Sensitivity - and more