[ https://issues.apache.org/jira/browse/CODEC-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yossi Tamari reopened CODEC-199:
--------------------------------

In the first patch I submitted, I tried to localize the changes to reduce risk. Having thought about it since, I have a better patch, which I think is more efficient (fewer map lookups), more correct (the HW rule is specific to the US English mapping but was implemented in the main code; I fixed this by defining a new mapping character, '#', that marks a silent letter, and mapping H and W to it), and I think results in simpler code.

Patch attached as better.patch.

> Bug in HW rule in Soundex
> -------------------------
>
>                 Key: CODEC-199
>                 URL: https://issues.apache.org/jira/browse/CODEC-199
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.10
>            Reporter: Yossi Tamari
>             Fix For: 1.11
>
>         Attachments: better.patch, soundex.patch
>
>
> The Soundex algorithm says that if two characters that map to the same code
> are separated by H or W, the second one is not encoded.
> However, in the implementation (in Soundex.getMappingCode() line 191), a
> character that is preceded by two characters that are either H or W is not
> encoded, regardless of what the last consonant was.
> Source: http://en.wikipedia.org/wiki/Soundex#American_Soundex
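
For readers following the thread, the HW rule and the silent-marker idea described in the comment can be illustrated with a minimal Java sketch. This is not the attached better.patch and not the Commons Codec source: the class name SoundexHwSketch, the method encode, and the exact mapping string are assumptions made for illustration only. The sketch assumes American Soundex as described on the Wikipedia page cited in the issue, with '0' marking vowels and '#' marking the silent letters H and W.

```java
import java.util.Locale;

// Hypothetical illustration class; not part of Commons Codec.
final class SoundexHwSketch {

    // One code per letter A..Z (assumed layout for this sketch):
    // '0' = vowel-like letter: not encoded, but it separates duplicate codes.
    // '#' = silent letter (H, W): not encoded and does NOT separate duplicates.
    //                                     ABCDEFGHIJKLMNOPQRSTUVWXYZ
    private static final String MAPPING = "0123012#02245501262301#202";

    private static final char SILENT = '#';

    static String encode(String name) {
        // Keep only the letters A..Z, upper-cased.
        StringBuilder letters = new StringBuilder();
        for (char c : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                letters.append(c);
            }
        }
        if (letters.length() == 0) {
            return "";
        }

        // The first letter is kept as-is; the rest is padded with zeros.
        char[] out = {letters.charAt(0), '0', '0', '0'};
        char last = MAPPING.charAt(letters.charAt(0) - 'A');
        int count = 1;

        for (int i = 1; i < letters.length() && count < out.length; i++) {
            char code = MAPPING.charAt(letters.charAt(i) - 'A');
            if (code == SILENT) {
                // H and W are skipped without touching 'last', so a following
                // letter is still compared with the last real code.
                continue;
            }
            if (code != '0' && code != last) {
                out[count++] = code;
            }
            last = code;
        }
        return new String(out);
    }
}
```

With this sketch, encode("Ashcraft") yields "A261": the C after the H is dropped because it shares a code with the S before the H. In contrast, encode("yhwdyt") yields "Y330": the D that follows H and W is still encoded, because the last real code before it belongs to a vowel-like letter and not to a letter with the same code. Because the silent behaviour lives in the mapping string, a mapping for another language could simply omit the '#' entries.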