[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-07 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Description: 
algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
{code}
--one edit operation, greatest str len = 12
str_sim_levenshtein('Test String1', 'Test String2') = 1 - 1 / 12 = 0.9167
{code}

  was:
algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
{code}
--one edit operation, greatest str len = 12
str_sim_levenshtein('Test String1', 'Test String2') = (12 -1) / 12 = 0.917f
{code}


 create UDF to measure strings similarity using Levenshtein Distance algo
 

 Key: HIVE-9556
 URL: https://issues.apache.org/jira/browse/HIVE-9556
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov

 algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
 {code}
 --one edit operation, greatest str len = 12
 str_sim_levenshtein('Test String1', 'Test String2') = 1 - 1 / 12 = 0.9167
 {code}
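
 The arithmetic in the example above can be checked with a small sketch.
 This is not the code from the attached patch: the helper name
 levenshtein_distance and the choice to return 1.0 for two empty strings
 are assumptions made for illustration. Only the formula
 1 - distance / greatest str len comes from the description.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # Keep the shorter string in the inner loop so the row buffer stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def str_sim_levenshtein(a: str, b: str) -> float:
    """Similarity = 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        # Assumption: two empty strings are treated as identical.
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


if __name__ == "__main__":
    # One edit operation, greatest str len = 12
    print(round(str_sim_levenshtein("Test String1", "Test String2"), 4))
```

 Rounded to four decimal places, the call in the main block gives 0.9167,
 matching the value in the description.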



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-07 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Status: Patch Available  (was: In Progress)

 create UDF to measure strings similarity using Levenshtein Distance algo
 

 Key: HIVE-9556
 URL: https://issues.apache.org/jira/browse/HIVE-9556
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
 Attachments: HIVE-9556.1.patch


 algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
 {code}
 --one edit operation, greatest str len = 12
 str_sim_levenshtein('Test String1', 'Test String2') = 1 - 1 / 12 = 0.9167
 {code}





[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-07 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Attachment: HIVE-9556.1.patch

patch #1

 create UDF to measure strings similarity using Levenshtein Distance algo
 

 Key: HIVE-9556
 URL: https://issues.apache.org/jira/browse/HIVE-9556
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
 Attachments: HIVE-9556.1.patch


 algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
 {code}
 --one edit operation, greatest str len = 12
 str_sim_levenshtein('Test String1', 'Test String2') = 1 - 1 / 12 = 0.9167
 {code}





[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-07 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Status: In Progress  (was: Patch Available)

 create UDF to measure strings similarity using Levenshtein Distance algo
 

 Key: HIVE-9556
 URL: https://issues.apache.org/jira/browse/HIVE-9556
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
 Attachments: HIVE-9556.1.patch


 algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
 {code}
 --one edit operation, greatest str len = 12
 str_sim_levenshtein('Test String1', 'Test String2') = 1 - 1 / 12 = 0.9167
 {code}





[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-07 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Status: Patch Available  (was: In Progress)

 create UDF to measure strings similarity using Levenshtein Distance algo
 

 Key: HIVE-9556
 URL: https://issues.apache.org/jira/browse/HIVE-9556
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
 Attachments: HIVE-9556.1.patch, HIVE-9556.2.patch


 algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
 {code}
 --one edit operation, greatest str len = 12
 str_sim_levenshtein('Test String1', 'Test String2') = 1 - 1 / 12 = 0.9167
 {code}





[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-07 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Attachment: HIVE-9556.2.patch

patch #2 - fixed checkArgsLength method

 create UDF to measure strings similarity using Levenshtein Distance algo
 

 Key: HIVE-9556
 URL: https://issues.apache.org/jira/browse/HIVE-9556
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
 Attachments: HIVE-9556.1.patch, HIVE-9556.2.patch


 algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
 {code}
 --one edit operation, greatest str len = 12
 str_sim_levenshtein('Test String1', 'Test String2') = 1 - 1 / 12 = 0.9167
 {code}





[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Summary: create UDF to measure strings similarity using Levenshtein Distance algo  (was: create UDF to measure string similarity using Levenshtein Distance algo)

 create UDF to measure strings similarity using Levenshtein Distance algo
 

 Key: HIVE-9556
 URL: https://issues.apache.org/jira/browse/HIVE-9556
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov

 algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
 {code}
 --one edit operation, greatest str len = 12
 str_sim_levenshtein(Test String1, Test String2) = (12 -1) / 12 = 0.917f
 {code}





[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Description: 
algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
{code}
--one edit operation, greatest str len = 12
str_sim_levenshtein('Test String1', 'Test String2') = (12 -1) / 12 = 0.917f
{code}

  was:
algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
{code}
--one edit operation, greatest str len = 12
str_sim_levenshtein(Test String1, Test String2) = (12 -1) / 12 = 0.917f
{code}


 create UDF to measure strings similarity using Levenshtein Distance algo
 

 Key: HIVE-9556
 URL: https://issues.apache.org/jira/browse/HIVE-9556
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov

 algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
 {code}
 --one edit operation, greatest str len = 12
 str_sim_levenshtein('Test String1', 'Test String2') = (12 -1) / 12 = 0.917f
 {code}


