The limit of 2 is hard-coded precisely because good performance for edit distances above 2 cannot be guaranteed.

-- Jack Krupansky

-----Original Message-----
From: Michael Tobias
Sent: Wednesday, August 14, 2013 1:00 AM
To: java-user@lucene.apache.org
Subject: Fuzzy Searching on Lucene / Solr

My first post so please be gentle with me.

I am about to start 'playing' with Solr to see if it will be the correct
tool for a new searchable database development.  One of my requirements is
the ability to do 'fuzzy' searches and I understand that the latest versions
of Lucene / Solr use an improved version of indexing and the Levenshtein
distance formula (or rather a modified version of Levenshtein if wished for,
treating letter transpositions as a single difference rather than 2).
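
To make the difference between the two metrics concrete, here is a minimal
sketch of both, written as a plain dynamic-programming table. It is only an
illustration of the distance definitions and not how Lucene computes fuzzy
matches; the class and method names are hypothetical.

```java
class EditDistanceSketch {

    /**
     * Edit distance between a and b. With transpositions == true, swapping two
     * adjacent characters counts as a single edit (the "optimal string
     * alignment" variant); with false this is classic Levenshtein.
     */
    static int distance(String a, String b, boolean transpositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i; // delete all of a
        for (int j = 0; j <= b.length(); j++) d[0][j] = j; // insert all of b

        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1,      // deletion
                                 d[i][j - 1] + 1),     // insertion
                        d[i - 1][j - 1] + cost);       // substitution or match
                if (transpositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    // adjacent characters swapped: one edit instead of two
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("lucene", "lucnee", false)); // prints 2
        System.out.println(distance("lucene", "lucnee", true));  // prints 1
    }
}
```

With transpositions counted as one edit, a swapped pair such as "lucnee" for
"lucene" falls within a distance of 1 rather than 2.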

Levenshtein is precisely what I need, but I also understand that the maximum
distance currently implemented is a distance of just TWO.  That is not
really adequate for my purposes.  I need to be able to handle at least a
distance of 3 and probably 4.

Is the current maximum distance of 2 hard-coded in the system?  Can it be
overridden?  How?

I understand that performance (both indexing and querying) may be impaired
significantly by doing this but that might be a price worth paying.  If it
IS possible to change the max distance to 3 or 4 does anybody have any idea
what the performance implications might be?

Many thanks for any/all assistance you can provide.

Regards

Michael


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
