[ 
https://issues.apache.org/jira/browse/LUCENE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4282:
--------------------------------

    Attachment: LUCENE-4282.patch

A simpler patch, which I also benchmarked.

The problem is this comment in the legacy scoring (present in all previous Lucene 
versions):
{noformat}
      // this will return less than 0.0 when the edit distance is
      // greater than the number of characters in the shorter word.
      // but this was the formula that was previously used in FuzzyTermEnum,
      // so it has not been changed (even though minimumSimilarity must be
      // greater than 0.0)
{noformat}

Because of that, it's really impossible to fix until we remove that deprecated 
one completely :)
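
To make the quoted comment concrete, here is a minimal sketch of the kind of score it describes. It assumes the legacy similarity is 1 minus the edit distance divided by the length of the shorter word, which is what the comment implies; the class and method names are hypothetical and are not Lucene APIs.

```java
// Sketch only: illustrates the behaviour described in the quoted comment.
// Assumption: similarity = 1 - editDistance / min(queryLength, termLength).
final class LegacySimilaritySketch {

    // Hypothetical helper, not part of Lucene.
    static float legacySimilarity(int editDistance, int queryLength, int termLength) {
        int shorter = Math.min(queryLength, termLength);
        return 1.0f - ((float) editDistance / (float) shorter);
    }

    public static void main(String[] args) {
        // "WEBER" (5 chars) vs "WEB" (3 chars), edit distance 2: roughly 0.33.
        System.out.println(legacySimilarity(2, 5, 3));
        // "WEBER" (5 chars) vs "WE" (2 chars), edit distance 3: the distance
        // exceeds the shorter length, so the score drops below zero.
        System.out.println(legacySimilarity(3, 5, 2));
    }
}
```

Under that assumption the score depends on the shorter word's length, not only on the number of edits, so it cannot be compared directly with a plain edit-count limit.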

So I think this one is good to commit, and separately I will look at removing 
the deprecated one from trunk and cleaning all this up when I have time (I 
would port the math-proof tests from the automata package to run as queries so 
we are sure).

                
> Automaton Fuzzy Query doesn't deliver all results
> -------------------------------------------------
>
>                 Key: LUCENE-4282
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4282
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 4.0-ALPHA
>            Reporter: Johannes Christen
>            Assignee: Robert Muir
>              Labels: newbie
>         Attachments: LUCENE-4282-tests.patch, LUCENE-4282.patch, 
> LUCENE-4282.patch, ModifiedFuzzyTermsEnum.java, ModifiedFuzzyTermsEnum.java
>
>
> Given a small index with n documents, where each document has one of the 
> following terms:
> WEBER, WEBE, WEB, WBR, WE (and some more),
> the new FuzzyQuery (Automaton) with maxEdits=2 delivers only the expected 
> terms WEBER and WEBE in the rewritten query. The expected terms WEB and WBR, 
> which also have an edit distance of 2, are missing.
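
The edit distances in the report can be checked with a small sketch like the one below. It assumes the query term is WEBER (the report does not say so explicitly) and uses plain Levenshtein distance without transpositions; the class name is hypothetical and this is not the automaton-based code under discussion.

```java
// Sketch only: classic two-row dynamic-programming Levenshtein distance.
final class EditDistanceSketch {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String query = "WEBER"; // assumed query term
        for (String term : new String[] {"WEBER", "WEBE", "WEB", "WBR", "WE"}) {
            System.out.println(term + " -> " + levenshtein(query, term));
        }
    }
}
```

With that assumed query, WEB and WBR are both 2 edits away and WE is 3 edits away, which matches the expectation in the report that WEB and WBR should be returned for maxEdits=2.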
