[ 
https://issues.apache.org/jira/browse/SOLR-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812912#comment-13812912
 ] 

Thomas Champagne commented on SOLR-2982:
----------------------------------------

I have also noticed the poor performance of the Beider-Morse encoder, so I 
opened issue CODEC-174 in the commons-codec project to improve it. 
So far, I have created two patches that cut the encoding time in half. 
If you want a faster Beider-Morse encoder, you are welcome to join us on 
CODEC-174 :)

> Upgrade Apache Commons Codec to version 1.6 in order to add new Beider-Morse 
> Phonetic Matching (BMPM) option
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2982
>                 URL: https://issues.apache.org/jira/browse/SOLR-2982
>             Project: Solr
>          Issue Type: Improvement
>          Components: Rules, Schema and Analysis, search
>            Reporter: Brooke Schreier Ganz
>              Labels: codec, commons, commons-codec, language, names, 
> phonetic, search, searching, soundalike
>             Fix For: 3.6, 4.0-ALPHA
>
>         Attachments: SOLR-2982.patch
>
>
> Apache Commons Codec released version 1.6 of their codec pack in November, 
> 2011.  Along with a few bug fixes, 1.6 contains a great new phonetic matching 
> system called Beider-Morse Phonetic Matching (BMPM) that is far superior to 
> the existing phonetic codecs, such as regular soundex, metaphone, caverphone, 
> and so on.  BMPM has actually been available for some time, but this is the 
> first port of it to Java, and its first commit in the Apache ecosystem.
> For a lot more information, see here: http://stevemorse.org/phoneticinfo.htm  
>  and  http://stevemorse.org/phonetics/bmpm.htm
> BMPM would be a fantastic "soundalike" tool to help search for personal names 
> (or just surnames) in a Solr/Lucene index, much better than Levenshtein 
> distance for this use case.


