[
https://issues.apache.org/jira/browse/CODEC-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18058216#comment-18058216
]
Gary D. Gregory commented on CODEC-249:
---------------------------------------
[~arturobernalg]
Your use cases compute a 5-character code; by default, results are truncated to
4 characters. You need to use {{setMaxCodeLen()}} to get longer results.
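
A minimal sketch of what that looks like, assuming commons-codec is on the classpath. The class name {{MaxCodeLenDemo}}, the helper {{encode}}, and the sample word are made up for illustration; only {{Metaphone}}, {{getMaxCodeLen()}}, {{setMaxCodeLen()}} and {{metaphone()}} come from the library.

```java
import org.apache.commons.codec.language.Metaphone;

public class MaxCodeLenDemo {

    // Hypothetical helper: encodes a word with an explicit maximum code length.
    static String encode(String word, int maxCodeLen) {
        Metaphone metaphone = new Metaphone();
        metaphone.setMaxCodeLen(maxCodeLen); // the default is 4
        return metaphone.metaphone(word);
    }

    public static void main(String[] args) {
        String word = "PROGRAMMING"; // arbitrary sample with many consonants

        // The default instance stops once the code reaches 4 characters.
        Metaphone defaults = new Metaphone();
        System.out.println("default max length: " + defaults.getMaxCodeLen());
        System.out.println("default code:       " + defaults.metaphone(word));

        // Raising the limit lets the encoder keep going past 4 characters.
        System.out.println("longer code:        " + encode(word, 10));
    }
}
```

When comparing expected and actual codes in a test, it is probably safest to set the limit explicitly on both sides so that truncation does not mask or mimic a rule difference.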
> Incorrect transform of CH digraph according Metaphone basic rules
> -----------------------------------------------------------------
>
> Key: CODEC-249
> URL: https://issues.apache.org/jira/browse/CODEC-249
> Project: Commons Codec
> Issue Type: Bug
> Reporter: Andrey
> Priority: Major
> Fix For: 1.22.0
>
>
> I detected an incorrect transformation of the CH digraph by the Metaphone algorithm.
> According to _Lawrence Philips_, CH should be transformed to 'X':
> {code:java}
> 'C' transforms to 'X' if followed by 'IA' or 'H' (unless in latter case, it
> is part of '-SCH-', in which case it transforms to 'K'). 'C' transforms to
> 'S' if followed by 'I', 'E', or 'Y'. Otherwise, 'C' transforms to 'K'.
> {code}
> But in the Apache implementation I see:
> {code:java}
> if (isNextChar(local, n, 'H')) { // detect CH
>     if (n == 0 &&
>         wdsz >= 3 &&
>         isVowel(local, 2)) { // CH consonant -> K consonant
>         code.append('K');
>     } else {
>         code.append('X'); // CHvowel -> X
>     }
> {code}
> So after the transformation I get 'K' instead of 'X'.
> *Example*: CHERI should be transformed to 'XR', but I get 'KR', which is wrong.
> This bug has major priority because of its large impact on the results of the
> Metaphone algorithm.
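
For reference, the 'C' rule quoted in the issue description can be sketched on its own as below. This is a stand-alone illustration of the quoted text only, not the Commons Codec implementation; the class {{CRuleSketch}} and the method {{encodeC}} are hypothetical names, and the input is assumed to be upper case.

```java
public class CRuleSketch {

    // Returns the code for the 'C' at index i, following only the quoted rule.
    static char encodeC(String word, int i) {
        boolean hasNext = i + 1 < word.length();
        boolean nextIsH = hasNext && word.charAt(i + 1) == 'H';
        boolean afterS = i > 0 && word.charAt(i - 1) == 'S';

        if (word.startsWith("IA", i + 1)) {
            return 'X';                       // CIA -> X
        }
        if (nextIsH) {
            return afterS ? 'K' : 'X';        // SCH -> K, any other CH -> X
        }
        if (hasNext && "IEY".indexOf(word.charAt(i + 1)) >= 0) {
            return 'S';                       // CI, CE, CY -> S
        }
        return 'K';                           // everything else -> K
    }

    public static void main(String[] args) {
        System.out.println(encodeC("CHERI", 0));  // X
        System.out.println(encodeC("SCHOOL", 1)); // K
        System.out.println(encodeC("CITY", 0));   // S
        System.out.println(encodeC("CAT", 0));    // K
    }
}
```

Under this reading of the rule, the 'C' in CHERI maps to 'X', which is the result the reporter expects.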