M. Steiger created LANG-1199:
--------------------------------
Summary: Incorrect implementation of
StringUtils.getJaroWinklerDistance()
Key: LANG-1199
URL: https://issues.apache.org/jira/browse/LANG-1199
Project: Commons Lang
Issue Type: Bug
Components: lang.*
Affects Versions: 3.4
Reporter: M. Steiger
The current implementation of StringUtils.getJaroWinklerDistance() does not
compute the correct result in some cases. See #LANG-944 for the initial code
contribution.
StringUtils.getJaroWinklerDistance("Haus Ingeborg", "Ingeborg Esser") == 0.0
This is due to the incorrect computation of common characters, which causes the
algorithm to exit prematurely.
In contrast, the implementation in Lucene gives ~0.63, which is about right.
JaroWinklerDistance d = new JaroWinklerDistance();
getDistance("Haus Ingeborg", "Ingeborg Esser");
See
https://lucene.apache.org/core/3_0_3/api/contrib-spellchecker/org/apache/lucene/search/spell/JaroWinklerDistance.html
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)