[jira] [Commented] (LANG-1011) Create a new class StringDistance as host for the getXXDistance methods in StringUtils

2015-03-01 Thread Benedikt Ritter (JIRA)

[ 
https://issues.apache.org/jira/browse/LANG-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342179#comment-14342179
 ] 

Benedikt Ritter commented on LANG-1011:
---

Hello Jonathan,

Actually, this ticket is pretty much outdated. We have created a new component 
that focuses on text processing algorithms: 
http://commons.apache.org/sandbox/commons-text/

If you're interested in this area, you should start there :-)

regards,
Benedikt

> Create a new class StringDistance as host for the getXXDistance methods in 
> StringUtils
> --
>
> Key: LANG-1011
> URL: https://issues.apache.org/jira/browse/LANG-1011
> Project: Commons Lang
> Issue Type: New Feature
> Components: lang.*
> Reporter: Benedikt Ritter
> Assignee: Benedikt Ritter
> Fix For: 3.4
>
> Attachments: StringDistanceTest4.java, StringDistanceTest4Pre8.java, 
> StringDistanceTest5.java
>
>
> We're getting more and more algorithms that calculate distances between 
> strings, so it makes sense to create a new class for this kind of logic.
> Deprecate getLevenshteinDistance and getJaroWinklerDistance and delegate to 
> the new class. If the new class is implemented in 3.4, move getFuzzyDistance 
> there as well (it has not yet been released).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (LANG-1011) Create a new class StringDistance as host for the getXXDistance methods in StringUtils

2015-02-28 Thread Jonathan Baker (JIRA)

[ 
https://issues.apache.org/jira/browse/LANG-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341685#comment-14341685
 ] 

Jonathan Baker commented on LANG-1011:
--

Please ignore #3.  My Java 8 ignorance is showing.
I should have taken the time to do a test *before* posting.  Sorry!



[jira] [Commented] (LANG-1011) Create a new class StringDistance as host for the getXXDistance methods in StringUtils

2015-02-28 Thread Jonathan Baker (JIRA)

[ 
https://issues.apache.org/jira/browse/LANG-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341562#comment-14341562
 ] 

Jonathan Baker commented on LANG-1011:
--

1. Is org.apache.commons.lang3.text.StringDistances a good place to move these 
functions?

2. Should the corresponding changes also be made in the 2.x version?  The 
[release plan](https://issues.apache.org/jira/browse/LANG?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel#pd-desc) 
says no, but please confirm.

3. Would it make sense (maybe for Lang 4, since Java 8 is required) to create a 
StringDistance interface that extends [BiFunction](http://docs.oracle.com/javase/8/docs/api/java/util/function/BiFunction.html)?

// For example:

public interface StringDistance<DISTANCE> extends BiFunction<CharSequence, CharSequence, DISTANCE> {

    @Override
    DISTANCE apply( CharSequence t, CharSequence u );

}

public class LevenshteinDistance implements StringDistance<Integer> {

    // null means that no threshold was given
    private final Integer threshold;

    public LevenshteinDistance() { ... }

    public LevenshteinDistance( final int threshold ) { ... }

    @Override
    public Integer apply( CharSequence t, CharSequence u ) {
        // Would two Levenshtein classes be better than the null check?
        if (threshold == null) {
            return getDistance( t, u );
        } else {
            return getDistance( t, u, threshold );
        }
    }

    public static Integer getDistance( CharSequence t, CharSequence u ) { ... }

    public static Integer getDistance( CharSequence t, CharSequence u, int threshold ) { ... }

}
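
To make the idea in #3 concrete, here is a minimal, self-contained sketch that compiles on Java 8. It is only an illustration and not the Commons Lang API: the wrapper class StringDistanceSketch, the type parameter name D, and the textbook two-row Levenshtein loop are assumptions made for this example, and the threshold variant is left out entirely.

```java
import java.util.function.BiFunction;

// Hypothetical wrapper class, used only to keep the sketch in one file.
final class StringDistanceSketch {

    /** Assumed shape of the proposed interface: a distance is a BiFunction over two CharSequences. */
    interface StringDistance<D> extends BiFunction<CharSequence, CharSequence, D> {
    }

    /** Textbook dynamic-programming Levenshtein distance without a threshold. */
    static final class LevenshteinDistance implements StringDistance<Integer> {
        @Override
        public Integer apply(CharSequence left, CharSequence right) {
            int n = left.length();
            int m = right.length();
            int[] previous = new int[m + 1];
            int[] current = new int[m + 1];
            // Distance from the empty prefix of left to each prefix of right.
            for (int j = 0; j <= m; j++) {
                previous[j] = j;
            }
            for (int i = 1; i <= n; i++) {
                current[0] = i;
                for (int j = 1; j <= m; j++) {
                    int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                    int insertOrDelete = Math.min(current[j - 1] + 1, previous[j] + 1);
                    current[j] = Math.min(insertOrDelete, previous[j - 1] + cost);
                }
                // Reuse the two rows instead of allocating a full matrix.
                int[] tmp = previous;
                previous = current;
                current = tmp;
            }
            return previous[m];
        }
    }

    public static void main(String[] args) {
        StringDistance<Integer> distance = new LevenshteinDistance();
        System.out.println(distance.apply("kitten", "sitting")); // prints 3

        // Because the interface extends BiFunction, andThen comes for free.
        BiFunction<CharSequence, CharSequence, String> described =
                distance.andThen(d -> "distance=" + d);
        System.out.println(described.apply("flaw", "lawn")); // prints distance=2
    }
}
```

The main benefit of extending BiFunction would be that a distance can be passed anywhere a BiFunction is accepted and composed with andThen, as the last lines of main show.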
