[ 
https://issues.apache.org/jira/browse/LUCENE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221970#comment-13221970
 ] 

Robert Muir commented on LUCENE-3846:
-------------------------------------

{quote}
Next level for "fuzzy *" in Lucene is going into specifying separate costs for 
Inserts/deletes, swaps and transpositions at character(byte) level and 
optionally considering position of edit. This brings precision++ if used 
properly, like in 
{quote}

It's probably "good enough" for this suggester to allow someone to re-rank their 
top-N with a StringDistance: 
[http://svn.apache.org/repos/asf/lucene/dev/trunk/modules/suggest/src/java/org/apache/lucene/search/spell/StringDistance.java]

Such things are language- and domain-specific, and I think just adding this 
pluggability will work pretty well, rather than trying to complicate the actual 
intersection algorithm (which will ultimately never satisfy everyone anyway).

The default can be "internal metric", which means no re-ranking at all. This is 
how DirectSpellChecker works, for example.
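
To make the re-ranking idea concrete, here is a rough sketch of what the 
pluggability could look like. It is not taken from the patch. The nested 
StringDistance interface is a local stand-in that only mirrors the shape of 
Lucene's interface (a float score where higher means more similar), and 
ReRankSketch, POSITIONAL and rerank are hypothetical names chosen for 
illustration. Passing null stands for the "internal metric" default, so the 
suggester's own order is kept.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ReRankSketch {

  /** Local stand-in mirroring the shape of Lucene's StringDistance (assumption: higher = more similar). */
  interface StringDistance {
    float getDistance(String s1, String s2);
  }

  /** Toy metric (hypothetical): share of positions holding the same char, relative to the longer string. */
  static final StringDistance POSITIONAL = (a, b) -> {
    int max = Math.max(a.length(), b.length());
    if (max == 0) {
      return 1f; // two empty strings are identical
    }
    int same = 0;
    for (int i = 0; i < Math.min(a.length(), b.length()); i++) {
      if (a.charAt(i) == b.charAt(i)) {
        same++;
      }
    }
    return (float) same / max;
  };

  /**
   * Re-ranks the suggester's top-N by similarity to what the user typed.
   * A null metric means "internal metric": the input order is returned unchanged.
   */
  static List<String> rerank(String typed, List<String> topN, StringDistance metric) {
    List<String> out = new ArrayList<>(topN);
    if (metric == null) {
      return out;
    }
    // List.sort is stable, so equally scored suggestions keep the suggester's order.
    out.sort(Comparator.comparingDouble((String s) -> -metric.getDistance(typed, s)));
    return out;
  }
}
```

For example, rerank("lucene", topN, ReRankSketch.POSITIONAL) moves an exact 
match to the front of the list, while rerank("lucene", topN, null) leaves the 
order alone. A language- or domain-specific metric would plug in at the same 
spot without touching the intersection code.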
                
> Fuzzy suggester
> ---------------
>
>                 Key: LUCENE-3846
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3846
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 3.6, 4.0
>
>         Attachments: LUCENE-3846.patch
>
>
> Would be nice to have a suggester that can handle some fuzziness (like spell 
> correction) so that it's able to suggest completions that are "near" what you 
> typed.
> As a first go at this, I implemented 1T (ie up to 1 edit, including a 
> transposition), except the first letter must be correct.
> But there is a penalty, ie, the "corrected" suggestion needs to have a much 
> higher freq than the "exact match" suggestion before it can compete.
> Still tons of nocommits, and somehow we should merge this / make it work with 
> analyzing suggester too (LUCENE-3842).


        
