[ https://issues.apache.org/jira/browse/IMPALA-8752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896106#comment-16896106 ]
Norbert Luksa commented on IMPALA-8752:
---------------------------------------

https://gerrit.cloudera.org/#/c/13870/

> Add Jaro-winkler edit distance and similarity built-in function
> ---------------------------------------------------------------
>
>                 Key: IMPALA-8752
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8752
>             Project: IMPALA
>          Issue Type: New Feature
>            Reporter: Norbert Luksa
>            Assignee: Norbert Luksa
>            Priority: Major
>              Labels: built-in-function
>
> References:
>  * [Apache commons - JaroWinklerDistance|https://commons.apache.org/proper/commons-text/apidocs/org/apache/commons/text/similarity/JaroWinklerDistance.html]
>  * [Apache commons - JaroWinklerSimilarity|https://commons.apache.org/proper/commons-text/apidocs/org/apache/commons/text/similarity/JaroWinklerSimilarity.html]
>  * [Oracle - JARO_WINKLER[_SIMILARITY]|https://oracle-base.com/articles/11g/utl_match-string-matching-in-oracle]
>
> Notable differences:
>  * With similarity, the Oracle version returns a normalized result ranging from 0 to 100.
>  * In the Apache version, null values result in exceptions.
>  * Apache rounds the values to two digits.
>
> The scaling factor of the algorithm can be added as an extra/default argument.
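For illustration only, here is a minimal Python sketch of the textbook Jaro-Winkler similarity and distance with the scaling factor exposed as a default argument, as the description suggests. It is not the Impala patch under review. The function names, the default of 0.1, the 0.25 upper bound, the four-character prefix cap, and the choice to return None for null inputs (rather than raise, as noted for the Apache version) are all assumptions made for this sketch. It also omits the boost threshold that some implementations apply, does no rounding, and returns values in the 0 to 1 range rather than Oracle's 0 to 100.

```python
def jaro_similarity(s1, s2):
    """Plain Jaro similarity in [0, 1] for two non-null strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3.0


def jaro_winkler_similarity(s1, s2, scaling_factor=0.1):
    """Jaro-Winkler similarity; returns None if either input is None (assumed)."""
    if s1 is None or s2 is None:
        return None
    if not 0.0 <= scaling_factor <= 0.25:
        # Above 0.25 the result could exceed 1 with a 4-character prefix.
        raise ValueError("scaling_factor must be between 0 and 0.25")
    jaro = jaro_similarity(s1, s2)
    # Length of the common prefix, capped at four characters.
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * scaling_factor * (1.0 - jaro)


def jaro_winkler_distance(s1, s2, scaling_factor=0.1):
    """Distance defined here as 1 - similarity; None propagates."""
    similarity = jaro_winkler_similarity(s1, s2, scaling_factor)
    return None if similarity is None else 1.0 - similarity
```

With the default factor, the classic pair "MARTHA" / "MARHTA" has a Jaro similarity of about 0.944 and a Jaro-Winkler similarity of about 0.961; passing scaling_factor=0.0 reduces the result to plain Jaro.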