[jira] [Commented] (LANG-1011) Create a new class StringDistance as host for the getXXDistance methods in StringUtils
[ https://issues.apache.org/jira/browse/LANG-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342179#comment-14342179 ]

Benedikt Ritter commented on LANG-1011:
---------------------------------------

Hello Jonathan,

actually, this ticket is pretty much outdated. We have created a new component that is focused on text processing algorithms: http://commons.apache.org/sandbox/commons-text/

If you're interested in this area, you should start there :-)

regards,
Benedikt

> Create a new class StringDistance as host for the getXXDistance methods in StringUtils
> ---------------------------------------------------------------------------------------
>
>                 Key: LANG-1011
>                 URL: https://issues.apache.org/jira/browse/LANG-1011
>             Project: Commons Lang
>          Issue Type: New Feature
>          Components: lang.*
>            Reporter: Benedikt Ritter
>            Assignee: Benedikt Ritter
>             Fix For: 3.4
>
>         Attachments: StringDistanceTest4.java, StringDistanceTest4Pre8.java, StringDistanceTest5.java
>
>
> We're getting more and more algorithms that calculate distances between
> strings, so it makes sense to create a new class for this kind of logic.
> Deprecate getLevenshteinDistance and getJaroWinklerDistance and delegate to
> the new class. If the new class is implemented in 3.4, move getFuzzyDistance
> there as well (it has not yet been released).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LANG-1011) Create a new class StringDistance as host for the getXXDistance methods in StringUtils
[ https://issues.apache.org/jira/browse/LANG-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341685#comment-14341685 ]

Jonathan Baker commented on LANG-1011:
--------------------------------------

Please ignore #3. My Java 8 ignorance is showing. I should have taken the time to do a test *before* posting. Sorry!
[jira] [Commented] (LANG-1011) Create a new class StringDistance as host for the getXXDistance methods in StringUtils
[ https://issues.apache.org/jira/browse/LANG-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341562#comment-14341562 ]

Jonathan Baker commented on LANG-1011:
--------------------------------------

1. Is org.apache.commons.lang3.text.StringDistances a good place to move these functions?

2. Should the corresponding changes also be made in the 2.x version? The [release plan](https://issues.apache.org/jira/browse/LANG?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel#pd-desc) says no, but please confirm.

3. Would it make sense (maybe for lang 4, since Java 8 is required) to create a StringDistance interface that extends [BiFunction](http://docs.oracle.com/javase/8/docs/api/java/util/function/BiFunction.html)?

    // For example:
    import java.util.function.BiFunction;

    public interface StringDistance<DISTANCE> extends BiFunction<CharSequence, CharSequence, DISTANCE> {
        @Override
        public DISTANCE apply( CharSequence t, CharSequence u );
    }

    public class LevenshteinDistance implements StringDistance<Integer> {
        // null means that no threshold was given
        private final Integer threshold;

        public LevenshteinDistance() { ... }
        public LevenshteinDistance( final int threshold ) { ... }

        @Override
        public Integer apply( CharSequence t, CharSequence u ) {
            // Would two Levenshtein classes be better than the null check?
            if (threshold == null) {
                return getDistance( t, u );
            } else {
                return getDistance( t, u, threshold );
            }
        }

        public static Integer getDistance( CharSequence t, CharSequence u ) { ... }
        public static Integer getDistance( CharSequence t, CharSequence u, int threshold ) { ... }
    }
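
For readers who want to try out suggestion 3 from the comment above, here is a minimal runnable sketch of a distance interface that extends BiFunction. It is an illustration only, not the code attached to this ticket and not a released Commons API: the names StringDistance, LevenshteinDistance and StringDistanceSketch are hypothetical, the threshold variant is left out, and the distance is computed with the textbook two-row dynamic programming approach rather than the StringUtils implementation.

```java
import java.util.function.BiFunction;

// Hypothetical interface, named after the proposal in the comment above.
// apply(CharSequence, CharSequence) is inherited from BiFunction.
interface StringDistance<D> extends BiFunction<CharSequence, CharSequence, D> {
}

// Hypothetical implementation without the threshold variant.
final class LevenshteinDistance implements StringDistance<Integer> {

    @Override
    public Integer apply(CharSequence left, CharSequence right) {
        return getDistance(left, right);
    }

    // Textbook Levenshtein distance using two rows of the DP table.
    static int getDistance(CharSequence s, CharSequence t) {
        if (s == null || t == null) {
            throw new IllegalArgumentException("Strings must not be null");
        }
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        // Distance from the empty prefix of s to each prefix of t.
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Reuse the arrays: the current row becomes the previous row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }
}

final class StringDistanceSketch {
    public static void main(String[] args) {
        StringDistance<Integer> distance = new LevenshteinDistance();
        // Because the interface extends BiFunction, andThen works unchanged.
        BiFunction<CharSequence, CharSequence, Boolean> isClose =
                distance.andThen(d -> d <= 3);
        System.out.println(distance.apply("kitten", "sitting")); // prints 3
        System.out.println(isClose.apply("kitten", "sitting"));  // prints true
    }
}
```

The main benefit of extending BiFunction is that a distance can be passed to any API that accepts a BiFunction and composed with andThen. The cost is that the result is boxed (Integer rather than int), which matters if the distance is computed in a tight loop.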