Re: Error tolerant text search with Lucene?

Mathieu Lecarme Fri, 04 Apr 2008 07:00:12 -0700

Marjan Celikik wrote:

Mathieu Lecarme wrote:
However, I don't fully understand what you mean by "iterate over your query". I would like a conceptual answer on how this is done with Lucene, not a technical one.
Your query is a tree, with BooleanQuery nodes as branches and other queries as leaves. If you want to transform a query into a "tolerant query", you have to replace each TermQuery (a leaf) with an "OR" branch whose leaves are the variant terms.
To find the variants of a term, you take the list of your terms and apply a filter to it to group them. Common filters for that are stemming, n-gram + Levenshtein distance, phonetic encoding, ...
M.
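
The tree rewrite described above can be sketched as follows. This is only an illustration under stated assumptions: Node, Term and Bool are hypothetical stand-ins for Lucene's query classes (they are not Lucene API), and the variants map is assumed to have been built beforehand by whatever filter you chose (stemming, n-grams, phonetic codes).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical mini query tree for illustration; these are NOT Lucene classes.
final class TolerantRewrite {
    interface Node {}

    // A leaf: a single term.
    static final class Term implements Node {
        final String text;
        Term(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    // A branch: children combined with "AND" or "OR".
    static final class Bool implements Node {
        final String op;
        final List<Node> children;
        Bool(String op, List<Node> children) { this.op = op; this.children = children; }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(' ').append(op).append(' ');
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    // Walk the tree and replace every Term leaf that has known variants
    // with an OR branch containing the original term plus its variants.
    static Node rewrite(Node node, Map<String, List<String>> variants) {
        if (node instanceof Term) {
            List<String> alts = variants.get(((Term) node).text);
            if (alts == null || alts.isEmpty()) return node;
            List<Node> leaves = new ArrayList<>();
            leaves.add(node);
            for (String alt : alts) leaves.add(new Term(alt));
            return new Bool("OR", leaves);
        }
        Bool branch = (Bool) node;
        List<Node> rewritten = new ArrayList<>();
        for (Node child : branch.children) rewritten.add(rewrite(child, variants));
        return new Bool(branch.op, rewritten);
    }
}
```

With a real Lucene query the same recursion would run over the clauses of each BooleanQuery, building a new query rather than printing a string.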
OK, now it's clearer. My final question is: when is this filter information incorporated, at index time or at search time?

Both. You have two indexes, one for your data and one for your terms. The second (dictionary, lexicon, ...) uses one Document per term, with n Fields for information such as n-grams or phonetic codes. When you search for a near word, you build the same data from the word, build a query with that data, and sort the results by Levenshtein distance. You get an ordered list of suggestions.
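
To make the second index concrete, here is a minimal in-memory sketch of the candidate-selection half. It is an assumption-laden stand-in: a plain map from trigram to terms plays the role of the term index, and the class name NgramLexicon, the trigram size and the minShared threshold are all hypothetical choices, not anything prescribed by Lucene.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the "term index": trigram -> terms containing it.
final class NgramLexicon {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // All distinct character n-grams of a word, in order of first appearance.
    static Set<String> ngrams(String word, int n) {
        Set<String> grams = new LinkedHashSet<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Index time: one entry per term, registered under each of its trigrams.
    void add(String term) {
        for (String gram : ngrams(term, 3)) {
            postings.computeIfAbsent(gram, k -> new HashSet<>()).add(term);
        }
    }

    // Search time: terms sharing at least minShared trigrams with the word,
    // returned in alphabetical order (ranking is a separate step).
    List<String> candidates(String word, int minShared) {
        Map<String, Integer> shared = new HashMap<>();
        for (String gram : ngrams(word, 3)) {
            for (String term : postings.getOrDefault(gram, Collections.<String>emptySet())) {
                shared.merge(term, 1, Integer::sum);
            }
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : shared.entrySet()) {
            if (e.getValue() >= minShared) out.add(e.getKey());
        }
        Collections.sort(out);
        return out;
    }
}
```

For example, after adding "lucene", "lucent", "lucid" and "search", the misspelling "lucine" shares the trigram "luc" with the first three terms and none with "search", so only those three come back as candidates.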

I.e., I want to know whether the Levenshtein distance is computed at query time or whether this information is precomputed in the index.

First Lucene selects the candidates, then you pick the best from this list. The Levenshtein distance is only applied to a few words.
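
As a sketch of that second step, the code below ranks a short candidate list by edit distance. The class name CandidateRanker and the alphabetical tie-break are assumptions made for the example; the distance itself is the standard dynamic-programming Levenshtein computation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: ranks the few candidates returned by the index.
final class CandidateRanker {
    // Classic Levenshtein distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Sort candidates by distance to the word; ties are broken alphabetically.
    static List<String> rank(String word, Collection<String> candidates) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt((String c) -> levenshtein(word, c))
                .thenComparing(Comparator.<String>naturalOrder()));
        return sorted;
    }
}
```

Because the index has already narrowed the lexicon down to a handful of terms, the quadratic cost of each distance computation is paid only a few times per query.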


M.
