[ https://issues.apache.org/jira/browse/LUCENE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676894#comment-13676894 ]
Michael McCandless commented on LUCENE-5033:
--------------------------------------------

bq. I, too, was hoping to avoid calcSimilarity if raw is true, but I think we need it to calculate the boost. Let me know if I'm missing something.

Ahh, you're right ... I missed that. OK.

bq. The bug in the original code was that FilteredTermsEnum sets minSimilarity to 0 when the user-specified minSimilarity is >= 1.0f. So, in SlowFuzzyTermsEnum, similarity (unless it was Float.NEGATIVE_INFINITY) was typically > minSimilarity no matter its value. In other words, when the client code made the call with minSimilarity >= 1.0f, that value was correctly recorded in maxEdits, but maxEdits wasn't the determining factor in whether SlowFuzzyTerms accepted a term.

Oh, I see: FuzzyTermsEnum does this in its ctor, and SlowFuzzyTermsEnum extends that. Now I understand the bug ... thanks.

bq. Doing an explicit levenshtein calculation here sort of defeats the entire purpose of having levenshtein automata at all!

But this fix only applies in cases (edit distance > 2) where automata don't, I think? (The fixes are to LinearFuzzyTermsEnum.)

> SlowFuzzyQuery appears to fail with edit distance >=3 in some cases
> -------------------------------------------------------------------
>
>                 Key: LUCENE-5033
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5033
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/other
>    Affects Versions: 4.3
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: LUCENE-5033.patch
>
>
> The Levenshtein edit distance between "monday" and "montugu" should be 4. The following
> shows a query with "sim" set to 3, and there is a hit.
>   public void testFuzzinessLong2() throws Exception {
>     Directory directory = newDirectory();
>     RandomIndexWriter writer = new RandomIndexWriter(random(), directory);
>     addDoc("monday", writer);
>
>     IndexReader reader = writer.getReader();
>     IndexSearcher searcher = newSearcher(reader);
>     writer.close();
>     SlowFuzzyQuery query;
>     query = new SlowFuzzyQuery(new Term("field", "montugu"), 3, 0);
>     ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
>     assertEquals(0, hits.length);
>   }
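
The distance claim in the report, and the mismatch described in the comment, can be checked without Lucene. The following is a minimal standalone sketch, not the code from LUCENE-5033.patch. The class and method names (EditDistanceCheck, acceptByMaxEdits, acceptBySimilarity) are hypothetical, and the similarity formula (1 - distance / length of the shorter term) is an assumption about how SlowFuzzyTermsEnum scores terms.

```java
class EditDistanceCheck {

    // Classic two-row dynamic-programming Levenshtein distance
    // (insertions, deletions, substitutions; no transpositions).
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical acceptance rule the reporter expects: compare against maxEdits.
    public static boolean acceptByMaxEdits(String term, String target, int maxEdits) {
        return levenshtein(term, target) <= maxEdits;
    }

    // Hypothetical acceptance rule resembling the buggy path: compare a scaled
    // similarity against minSimilarity. The formula is an assumption; both
    // strings are expected to be non-empty.
    public static boolean acceptBySimilarity(String term, String target, float minSimilarity) {
        int shorter = Math.min(term.length(), target.length());
        float similarity = 1.0f - (float) levenshtein(term, target) / shorter;
        return similarity > minSimilarity;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("monday", "montugu"));              // prints 4
        System.out.println(acceptByMaxEdits("monday", "montugu", 3));      // prints false
        System.out.println(acceptBySimilarity("monday", "montugu", 0.0f)); // prints true
    }
}
```

Under these assumptions, a minSimilarity that has been reset to 0 lets "monday" through (similarity 1 - 4/6 is above 0), while a check against maxEdits = 3 rejects it, which matches the behavior described in the comment.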