Re: Lowercasing wildcards - why?

2003-05-31 Thread David_Birthwell
True enough. We're supporting search of a product database, so, for us, it made sense to increase coverage and accept the loss of precision. Our solution is definitely not globally applicable. DaveB

Re: Lowercasing wildcards - why?

2003-05-31 Thread David_Birthwell
Your analyzers can optionally incorporate stemming, along with the other things that analyzers do (lowercasing, etc.). The stemming algorithms are all different. This "searcher" example was made up, but there are instances where stemming at index time and not stemming wildcard searches will r

Re: Search for similar terms

2003-05-30 Thread David_Birthwell
Perform the Lucene search. If you get no or few hits, send the query term to a spell checker, like ispell. Echo the alternative spelling(s) to the user. DaveB "Dario
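
A minimal sketch of the fallback described above, under loud assumptions: the dictionary `index` is a hypothetical stand-in for a Lucene index, and Python's `difflib.get_close_matches` stands in for an external spell checker such as ispell. The function name and the `min_hits` threshold are made up for illustration.

```python
import difflib

def search_with_suggestions(query, index, vocabulary, min_hits=1):
    """Return (hits, suggestions); suggestions are offered only when hits are scarce."""
    term = query.lower()
    hits = index.get(term, [])
    if len(hits) >= min_hits:
        return hits, []
    # Stand-in for a spell checker such as ispell: closest known terms.
    suggestions = difflib.get_close_matches(term, vocabulary, n=3, cutoff=0.75)
    return hits, suggestions

# Hypothetical toy index: term -> list of document ids.
index = {"lucene": [1, 2], "search": [2, 3], "wildcard": [4]}
vocabulary = sorted(index)

# A misspelled query finds nothing, so alternatives are echoed to the user.
hits, suggestions = search_with_suggestions("wildcrad", index, vocabulary)
print(hits, suggestions)
```

The threshold for "few hits" is a judgment call; a product search might prefer a higher `min_hits` than a document search.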

Re: Lowercasing wildcards - why?

2003-05-30 Thread David_Birthwell
Hi Les, We ended up modifying the QueryParser to pass prefix and suffix queries through the Analyzer. For us, it was about stemming. If you decide to use an analyzer that incorporates stemming, there are cases where wildcard queries will not return the expected results. Example: "searcher" wi
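
The example in the message is cut off in the archive, so the following is a separate illustration, not a reconstruction of it. It is a rough sketch assuming a toy suffix-stripping stemmer (`toy_stem`, not a real stemming algorithm and not Lucene's) and a set of strings in place of an index. It shows how a prefix query built from the raw user input can miss terms that were stemmed at index time, and how stemming the literal part of the query first brings them back.

```python
import fnmatch

def toy_stem(term):
    """Toy suffix stripper for illustration only (not a real stemming algorithm)."""
    for suffix in ("ers", "er", "ing", "es", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term

def analyze(text):
    """Lowercase, split on whitespace, then stem: a stand-in for an Analyzer."""
    return [toy_stem(tok) for tok in text.lower().split()]

def wildcard_match(pattern, terms):
    """Match a wildcard pattern against the indexed terms."""
    return sorted(t for t in terms if fnmatch.fnmatchcase(t, pattern))

def analyzed_prefix_query(query):
    """Stem the literal part of a trailing-* query before matching."""
    literal = query.rstrip("*").lower()
    return toy_stem(literal) + "*"

# Index-time analysis stores stems, not the original words.
indexed_terms = set(analyze("Searcher searching indexes"))

raw = wildcard_match("searcher*", indexed_terms)
analyzed = wildcard_match(analyzed_prefix_query("searcher*"), indexed_terms)
print(raw, analyzed)
```

As noted elsewhere in the thread, this trades precision for coverage: the stemmed prefix matches more terms than the user literally typed.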

Re: not fetching all results that has same file name

2003-05-29 Thread David_Birthwell
It all depends on how you are indexing these documents. What gets put in the DocName field? What gets put in the Type field? Are you using third-party code to perform the indexing? DaveB

Re: too many hits - OutOfMemoryError

2003-05-29 Thread David_Birthwell
Unfortunately, no. The modifications are not very extreme, though. If you're interested in seeing our approach, let me know. DaveB "Eric Jain"

Re: too many hits - OutOfMemoryError

2003-05-29 Thread David_Birthwell
Cory, When performing wildcard queries, the bulk of the memory is used during wildcard term expansion. The memory requirement is proportional to the number of matching terms, not the number of hits. You should make sure you are using the latest Lucene. There was a fix in 1.3 to reduce the memo
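
To make the terms-versus-hits distinction concrete, here is a toy model of wildcard expansion. It is only a sketch: the `postings` dictionary and `expand_prefix` helper are hypothetical and do not reflect Lucene's internals. The point is that the list built during expansion grows with the number of matching terms, even when all of those terms occur in the same single document.

```python
from bisect import bisect_left

def expand_prefix(prefix, sorted_terms):
    """Return every dictionary term that starts with prefix."""
    start = bisect_left(sorted_terms, prefix)
    expanded = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        expanded.append(term)
    return expanded

# Hypothetical postings: term -> ids of documents containing it.
postings = {
    "part-0001": [7],
    "part-0002": [7],
    "part-0003": [7],
    "pump": [1, 2, 3, 4, 5],
}
sorted_terms = sorted(postings)

# The expansion holds one entry per matching term...
expanded = expand_prefix("part-", sorted_terms)

# ...while the hit set only holds distinct documents.
hits = set()
for term in expanded:
    hits.update(postings[term])

print(len(expanded), len(hits))
```

In this toy data the prefix expands to three terms but yields one hit; a field full of unique identifiers would make the gap far larger.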