minFuzzyLength is measured in bytes, which I think is wrong, because it should be measured in letters. In English the word "table" is 5 bytes, but the Russian word "книга" is 10 bytes in UTF-8, though it has only 5 letters. If one field holds both English and Russian words, I have to multiply minFuzzyLength by 2 whenever the current query contains Russian letters.
This hack works, but it is still wrong: you cannot swap or substitute individual bytes inside a Russian letter when trying to guess whether the input was a typo. Every arc in the FST should be a letter, not a byte.
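To illustrate the mismatch, here is a small standalone Java sketch. It does not call FuzzySuggester at all. The class name FuzzyLengthDemo and the helper substituteByte are made up for this example. It compares the UTF-8 byte length with the code point count for both words, and shows what happens when a single byte inside a Cyrillic letter is replaced.

```java
import java.nio.charset.StandardCharsets;

public class FuzzyLengthDemo {
    // "книга" written with Unicode escapes so the source compiles under any file encoding.
    static final String KNIGA = "\u043A\u043D\u0438\u0433\u0430";

    // Length of the string when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of Unicode code points, which matches the letter count for these two words.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Hypothetical helper (not a Lucene API): overwrite one byte of the
    // UTF-8 encoding and decode the result back into a String.
    static String substituteByte(String s, int index, byte replacement) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        bytes[index] = replacement;
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "table": 5 bytes, 5 code points.
        System.out.println(utf8Bytes("table") + " bytes, " + codePoints("table") + " code points");
        // "книга": 10 bytes, 5 code points.
        System.out.println(utf8Bytes(KNIGA) + " bytes, " + codePoints(KNIGA) + " code points");

        // Replacing only the first byte of the two-byte letter "к" leaves an
        // orphaned continuation byte, so the decoder emits U+FFFD instead of a letter.
        String broken = substituteByte(KNIGA, 0, (byte) 'x');
        System.out.println(broken.contains("\uFFFD"));
    }
}
```

So a one-byte edit in a Russian word does not give you another Russian word with one letter changed. It gives you a malformed byte sequence, which is why counting and editing per letter (code point) seems like the right unit to me.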
