Marjan Celikik wrote:
Mathieu Lecarme wrote:

However, I don't fully understand what you mean by "iterate over your query". I would like a conceptual answer on how this is done with Lucene, not a technical one.
Your query is a tree, with BooleanQuery as the branches and other queries as the leaves. If you want to transform a query into a "tolerant query", you have to replace each TermQuery (a leaf) with an "OR" branch whose leaves are the variant terms.
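
As a conceptual illustration only, the rewrite could be sketched like this in plain Python. The Term and Bool classes and the make_tolerant function are hypothetical stand-ins for the query tree, not Lucene's API, and the variants table is assumed to come from some separate lookup.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Term:
    """Leaf of the query tree: a single word to match (hypothetical class)."""
    text: str


@dataclass(frozen=True)
class Bool:
    """Branch of the query tree: children combined with AND or OR (hypothetical class)."""
    op: str
    children: tuple


Query = Union[Term, Bool]


def make_tolerant(query: Query, variants: Dict[str, List[str]]) -> Query:
    """Return a copy of the tree in which every leaf that has known
    variants becomes an OR branch over the original term and its variants."""
    if isinstance(query, Bool):
        # Recurse into branches and rebuild them with rewritten children.
        return Bool(query.op, tuple(make_tolerant(c, variants) for c in query.children))
    alternatives = variants.get(query.text, [])
    if not alternatives:
        return query  # nothing to expand, keep the leaf as it is
    return Bool("OR", (query,) + tuple(Term(v) for v in alternatives))


# Assumed example data: the variants table is made up for illustration.
original = Bool("AND", (Term("lucene"), Term("serch")))
tolerant = make_tolerant(original, {"serch": ["search", "perch"]})
```

Here the leaf "serch" is replaced by an OR branch over "serch", "search" and "perch", while "lucene" stays a plain leaf.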

To find the variants of a term, you take the list of your Terms and apply a filter to it to group them. Common filters for that are stemming, n-gram + Levenshtein distance, phonetic ...
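
A minimal sketch of that grouping step, assuming a toy phonetic-style key. The crude_key function is invented for illustration and is neither Soundex nor Metaphone; a real setup would plug in a proper stemmer or phonetic encoder as the key function.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def crude_key(term: str) -> str:
    """Toy phonetic-style key (an assumption, not a real algorithm):
    keep the first letter, drop later vowels, collapse repeated letters."""
    term = term.lower()
    if not term:
        return ""
    key = term[0]
    for ch in term[1:]:
        if ch in "aeiouy":
            continue  # vowels after the first letter are ignored
        if ch != key[-1]:
            key += ch  # skip a letter that repeats the previous kept one
    return key


def group_terms(terms: Iterable[str], key: Callable[[str], str]) -> Dict[str, List[str]]:
    """Group terms that share the same filter key; each group is a set of variants."""
    groups = defaultdict(list)
    for term in terms:
        groups[key(term)].append(term)
    return dict(groups)


groups = group_terms(["color", "colour", "collar", "search"], crude_key)
```

With this toy key, "color", "colour" and "collar" fall into one group and "search" into another, which also shows why such filters only produce candidates: "collar" is grouped with "color" although it is a different word.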

M.

OK, now it's clearer. My final question is when this filter information is incorporated: at index time or at search time?
Both. You've got two indexes, one for your data, one for your Terms. The second (dictionary, lexicon ...) uses one Document per Term, and n Fields for information like n-grams or phonetic codes. When you search for a near word, you build the same data from the word, build a request with this data, and sort the results by Levenshtein distance. You get an ordered list of suggestions.
i.e. I want to know whether the Levenshtein distance is computed at query time or whether this information is precomputed in the index?
First Lucene selects the candidates, then you pick the best from this list. Levenshtein distance is only applied to a few words.
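
The two-step lookup described above can be sketched in plain Python, under stated assumptions: the lexicon is modelled as an in-memory map from bigram to terms rather than a real Lucene index, and build_lexicon and suggest are hypothetical helper names. The n-grams are computed at index time; the edit distance is computed at query time, and only for the selected candidates.

```python
from collections import defaultdict
from typing import Dict, List, Set


def ngrams(word: str, n: int = 2) -> Set[str]:
    """All character n-grams of a word (bigrams by default)."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def build_lexicon(terms: List[str]) -> Dict[str, Set[str]]:
    """Index time: map each n-gram to the terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        for gram in ngrams(term):
            index[gram].add(term)
    return index


def suggest(word: str, lexicon: Dict[str, Set[str]], limit: int = 3) -> List[str]:
    """Search time: select candidates sharing an n-gram with the word,
    then rank only those candidates by edit distance."""
    candidates: Set[str] = set()
    for gram in ngrams(word):
        candidates |= lexicon.get(gram, set())
    return sorted(candidates, key=lambda t: (levenshtein(word, t), t))[:limit]


# Assumed example vocabulary, made up for illustration.
lexicon = build_lexicon(["search", "starch", "lucene", "index"])
suggestions = suggest("serch", lexicon)
```

For the misspelling "serch", only "search" and "starch" share a bigram with it, so the distance is computed twice rather than once per term in the lexicon, and "search" is ranked first.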

M.
