Fuzzy queries are simply not indexed.
If you want fast fuzzy search, you should index words together with their n-grams; this is the "did you mean" pattern.

You first select the indexed words which share n-grams with the query word, compute the distance to each of them with Levenshtein, and then use the closest word as a synonym.
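
The steps above could be sketched roughly as follows. This is a minimal in-memory illustration and not Lucene API: the class NgramSuggester and its methods are hypothetical names, the n-gram size is an arbitrary choice, and a real setup would keep the n-grams in a separate index instead of a HashMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the "did you mean" pattern; not part of Lucene.
class NgramSuggester {
    private final int n;
    // Maps each n-gram to the indexed words that contain it.
    private final Map<String, Set<String>> gramToWords = new HashMap<>();

    NgramSuggester(int n) {
        this.n = n;
    }

    // All substrings of length n of the given word.
    static Set<String> ngrams(String word, int n) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Index time: register a word under each of its n-grams.
    void add(String word) {
        for (String g : ngrams(word, n)) {
            gramToWords.computeIfAbsent(g, k -> new HashSet<>()).add(word);
        }
    }

    // Classic two-row dynamic-programming Levenshtein distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Query time: only words sharing at least one n-gram with the query
    // are compared; returns the closest one, or null if there is none.
    String suggest(String query) {
        Set<String> candidates = new HashSet<>();
        for (String g : ngrams(query, n)) {
            candidates.addAll(gramToWords.getOrDefault(g, Collections.emptySet()));
        }
        String best = null;
        int bestDist = Integer.MAX_VALUE;
        for (String c : candidates) {
            int d = levenshtein(query, c);
            // Ties are broken alphabetically to keep the result deterministic.
            if (d < bestDist || (d == bestDist && c.compareTo(best) < 0)) {
                best = c;
                bestDist = d;
            }
        }
        return best;
    }
}
```

The point of the n-gram lookup is that Levenshtein is only computed against a small candidate set instead of every term in the index. The suggested word can then be run as an ordinary exact query.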

M.

On 24 Nov 07 at 17:36, Timo Nentwig wrote:

Hi!

I search a 1.5 GB index and fuzzy queries are really slow; something like
avg. ~500ms (IndexSearcher.search(Query, HitCollector)).

When performing exact queries I achieve response times <25ms. What is it that makes fuzzy queries so slow? Increased index access due to more terms, i.e.
disk IO?

And no, my fuzzy queries (fuzzy factor 0.8) don't blow up into a boolean query
with hundreds of clauses, but maybe something like fewer than 10.

Thanks
Timo

P.S. Aren't there any "best practices" for Lucene? Does everybody have to find out on their own (over and over again) and spend a lot of time reading and
understanding Lucene's code base?

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




