Re: Lowercasing wildcards - why?

2003-05-31 Thread David_Birthwell
Hi Les, We ended up modifying the QueryParser to pass prefix and suffix queries through the Analyzer. For us, it was about stemming. If you decide to use an analyzer that incorporates stemming, there are cases where wildcard queries will not return the expected results. Example: searcher

Re: Search for similar terms

2003-05-31 Thread David_Birthwell
Perform the Lucene search. If you get no or few hits, send the query term to a spell checker, like ispell. Echo the alternative spelling(s) to the user. DaveB Dario

Re: Lowercasing wildcards - why?

2003-05-31 Thread Leo Galambos
I'm sorry, I did not read the complete thread. Do you mean - analyzer == stemmer? Does it really work? If I were a stemmer, I would leave searche intact. ;-) -g- [EMAIL PROTECTED] wrote: Hi Les, We ended up modifying the QueryParser to pass prefix and suffix queries through the Analyzer. For

Re: Lowercasing wildcards - why?

2003-05-31 Thread David_Birthwell
Your analyzers can optionally incorporate stemming, along with the other things that analyzers do (lowercasing, etc.). The stemming algorithms are all different. This searcher example was made up, but there are instances where stemming at index time and not stemming wildcard searches will
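The mismatch described above can be illustrated with a small sketch. This is not Lucene code: `toy_stem`, `build_index`, and `prefix_search` are hypothetical helpers, and the suffix rules are invented for the example (real stemmers use different rules). It only shows why a prefix typed by the user may miss terms that were stemmed at index time, and how passing the prefix through the same stemmer changes the result.

```python
def toy_stem(word: str) -> str:
    """Hypothetical suffix stripper; real stemmers use different rules."""
    word = word.lower()
    for suffix in ("ers", "er", "ing", "es", "ed", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stemmed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for token in text.split():
            index.setdefault(toy_stem(token), set()).add(doc_id)
    return index


def prefix_search(index, prefix, analyze=False):
    """Match indexed terms by prefix; optionally stem the prefix first."""
    prefix = toy_stem(prefix) if analyze else prefix.lower()
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(prefix):
            hits |= doc_ids
    return hits


docs = ["the searcher found it", "searching the index", "unrelated text"]
index = build_index(docs)

# The index only contains the stem "search", so the raw prefix finds nothing.
raw_hits = prefix_search(index, "searcher")
# Stemming the prefix first matches both documents that contain the stem.
analyzed_hits = prefix_search(index, "searcher", analyze=True)
```

In this toy setup the analyzed prefix also matches the document that only contains "searching", which is the kind of wider, less precise match that stemming a wildcard prefix can produce.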

Re: Lowercasing wildcards - why?

2003-05-31 Thread Leo Galambos
Ah, I got it. THX. In the good old days, the wildcards were used as a fix for a missing stemming module. I am not sure if you can combine these two opposite approaches successfully. I see the following drawbacks of your solution. Example: built* (-built) could be changed to build* (no built, but

Re: Lowercasing wildcards - why?

2003-05-31 Thread David_Birthwell
True enough. We're supporting search of a product database, so, for us, it made sense to increase coverage and accept the loss of precision. Our solution is definitely not globally applicable. DaveB

Re: Search for similar terms

2003-05-31 Thread Dario Dentale
Thanks for the answer. I was searching for a solution not based on a dictionary, but on the list of terms (with relative frequency) contained in the Lucene index. In this way (I think) I can obtain more significant results, I can use this method on multiple languages (without relative
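A rough sketch of the idea of suggesting terms from the index's own term list, ranked by frequency, might look like the following. It is an illustration under assumptions, not the method discussed elsewhere in this thread: the term list and counts are made up, `suggest_terms` is a hypothetical helper, and Python's `difflib` similarity stands in for whatever string-matching technique is actually used.

```python
import difflib


def suggest_terms(query, term_freqs, limit=3, cutoff=0.7):
    """Return indexed terms similar to `query`, most frequent first.

    `term_freqs` maps each indexed term to its frequency (assumed to
    have been extracted from the index beforehand).
    """
    if not term_freqs:
        return []
    # Collect every term whose similarity ratio reaches the cutoff.
    candidates = difflib.get_close_matches(
        query, list(term_freqs), n=len(term_freqs), cutoff=cutoff
    )
    # Do not suggest the query back to the user.
    candidates = [term for term in candidates if term != query]
    # Prefer terms that occur more often in the index.
    candidates.sort(key=lambda term: term_freqs[term], reverse=True)
    return candidates[:limit]


# Made-up term frequencies standing in for an index's term list.
term_freqs = {"lucene": 120, "lucent": 4, "index": 300, "analyzer": 80}
suggestions = suggest_terms("lucenne", term_freqs)
```

Because the candidates come from the index itself, every suggestion is guaranteed to produce at least one hit, and no per-language dictionary is needed.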

Re: Search for similar terms

2003-05-31 Thread Dario Dentale
Hi, can you suggest a link to an overview document of this method? I couldn't find one. Thanks, Dario - Original Message - From: Leo Galambos [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, May 30, 2003 4:25 PM Subject: Re: Search for similar terms You need

Re: Search for similar terms

2003-05-31 Thread Leo Galambos
http://cs.felk.cvut.cz/psc/members.html http://cs.felk.cvut.cz/psc/event/1998/p13.html or contact prof. Melichar for more details: http://webis.felk.cvut.cz/people/melichar.html -g- Dario Dentale wrote: Hi, can you suggest a link to an overview document of this method? I couldn't find one.

AW: Search for similar terms

2003-05-31 Thread Karsten Konrad
Hi, please have a look at the FuzzyTermEnum class in Lucene. There is an impressive implementation of Levenshtein distance there that you can use; simply set the fuzzy distance higher than 0.5 (0.75 seems to work fine) and modify the termCompare method such that the last term produced is
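For readers unfamiliar with Levenshtein distance, a minimal standalone sketch follows. It is not the FuzzyTermEnum implementation: the normalization by the longer string's length is an assumption made for the example and may differ from how Lucene computes its similarity score. Only the 0.75 threshold is taken from the message above; `closest_terms` is a hypothetical helper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Distance scaled to [0, 1]; dividing by the longer length is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def closest_terms(query, terms, min_similarity=0.75):
    """Return terms at or above the threshold, most similar first."""
    scored = sorted(((similarity(query, term), term) for term in terms), reverse=True)
    return [term for score, term in scored if score >= min_similarity]


matches = closest_terms("lucenne", ["lucene", "lucent", "index"])
```

Raising the threshold keeps only near-identical terms, which is why a higher value tends to suit spelling suggestions better than a loose fuzzy search.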

search item with '-' in it

2003-05-31 Thread Lixin Meng
I have a field, 'PartNumber', that has '-' in its value (e.g. SG-XRRH-C1M0-A). After indexing, I can perform certain queries. However, I am having trouble explaining the behavior. - if searching for PartNumber:SG it will return multiple hits. I assume the analyzer might take out '-'. - if
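The poster's guess can be made concrete with a toy comparison. This sketch does not reproduce any particular Lucene analyzer, and how hyphens are really handled depends on the analyzer in use. `split_tokens` and `whole_token` are hypothetical helpers that show only the difference between splitting a value on punctuation and keeping it as a single term.

```python
import re


def split_tokens(value: str):
    """Split on any non-alphanumeric character and lowercase (assumed behavior)."""
    return [token.lower() for token in re.split(r"[^A-Za-z0-9]+", value) if token]


def whole_token(value: str):
    """Keep the value as one lowercased term, as an untokenized field would."""
    return [value.lower()]


part_number = "SG-XRRH-C1M0-A"
split = split_tokens(part_number)
whole = whole_token(part_number)

# A query for the single term "sg" matches only when the value was split.
matches_when_split = "sg" in split
matches_when_whole = "sg" in whole
```

If the value is split this way, every part number beginning with SG shares the term "sg", which would be consistent with PartNumber:SG returning multiple hits.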

Re: Lowercasing wildcards - why?

2003-05-31 Thread Tatu Saloranta
On Friday 30 May 2003 09:55, Leo Galambos wrote: Ah, I got it. THX. In the good old days, the wildcards were used as a fix for a missing stemming module. I am not sure if you can combine these two opposite approaches successfully. I see the following drawbacks of your solution. Example:

DBDirectory

2003-05-31 Thread Anthony Eden
I found some references to an SQLDirectory class in the mailing list archives but I was unable to actually locate the package anywhere in the CVS (I looked in both the primary and the sandbox) nor could I find it in Google. Anyhow, I have written my own implementation of a database-backed

API to perform full-text search

2003-05-31 Thread Venkatraman, Shiv
Which API do I use to perform full-text search? The APIs in Javadoc also refer to a field name. In my case, I would like to do a search (e.g. bob) across all documents, irrespective of the field name.