Re: [LANG] Algorithm for fuzzy string matching

2014-05-02 Thread Benedikt Ritter
Yes, that would be the plan, I guess :-)


2014-05-02 15:58 GMT+02:00 Gary Gregory :

> So, keep SU as a kitchen sink and refactor for 4.0? I'm OK with that.
>
> Gary
>
>
> On Fri, May 2, 2014 at 7:03 AM, Benedikt Ritter wrote:
>
> > Hi Gary,
> >
> > we had a discussion about this some time ago, where I proposed to create a
> > new class (let's call it StringMetrics) and move Levenshtein and
> > Jaro-Winkler to it. We decided not to do this in 3.x, since SU already has
> > 180+ methods, which will have to be split up in the next major release.
> >
> > Benedikt
> >
> >
> > 2014-05-02 13:00 GMT+02:00 Gary Gregory :
> >
> > > Do we really want this in SU or should it live in its own class?
> > >
> > > Gary
> > >
> > > Original message
> > > From: Benedikt Ritter <brit...@apache.org>
> > > Date: 05/02/2014 04:15 (GMT-05:00)
> > > To: Commons Developers List
> > > Subject: Re: [LANG] Algorithm for fuzzy string matching
> > >
> > > Since nobody had objections against adding this, I'll apply this
> > > patch.
> > >
> > > Benedikt
> > >
> > >
> > > 2014-04-28 17:47 GMT+02:00 Benedikt Ritter :
> > >
> > > > Hi all,
> > > >
> > > > we have a nice PR for StringUtils on GitHub:
> > > > https://github.com/apache/commons-lang/pull/20
> > > >
> > > > It adds a new string matching algorithm to StringUtils that calculates a
> > > > score for the similarity between two strings. This kind of fuzzy matching
> > > > is known from editors like Sublime Text, TextMate or Atom.
> > > >
> > > > I think this is a very useful feature, but as the contributor points out,
> > > > there is no scientific paper or thesis that provides a reference for the
> > > > implementation. So this is not _the one_ implementation of a fuzzy string
> > > > matching score, the way our implementations of the Levenshtein or
> > > > Jaro-Winkler algorithms are.
> > > >
> > > > So before adding this, I'd like to hear how others feel about this feature.
> > > >
> > > > Regards,
> > > > Benedikt
> > > >
> > > >
> > > > --
> > > > http://people.apache.org/~britter/
> > > > http://www.systemoutprintln.de/
> > > > http://twitter.com/BenediktRitter
> > > > http://github.com/britter
> > > >
>
>
>
> --
> E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
> Java Persistence with Hibernate, Second Edition<
> http://www.manning.com/bauer3/>
> JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
> Spring Batch in Action <http://www.manning.com/templier/>
> Blog: http://garygregory.wordpress.com
> Home: http://garygregory.com/
> Tweet! http://twitter.com/GaryGregory
>



Re: [LANG] Algorithm for fuzzy string matching

2014-05-02 Thread Gary Gregory
So, keep SU as a kitchen sink and refactor for 4.0? I'm OK with that.

Gary


On Fri, May 2, 2014 at 7:03 AM, Benedikt Ritter  wrote:

> Hi Gary,
>
> we had a discussion about this some time ago, where I proposed to create a
> new class (let's call it StringMetrics) and move Levenshtein and
> Jaro-Winkler to it. We decided not to do this in 3.x, since SU already has
> 180+ methods, which will have to be split up in the next major release.
>
> Benedikt
>
>
> 2014-05-02 13:00 GMT+02:00 Gary Gregory :
>
> > Do we really want this in SU or should it live in its own class?
> >
> > Gary
> >
> > Original message
> > From: Benedikt Ritter <brit...@apache.org>
> > Date: 05/02/2014 04:15 (GMT-05:00)
> > To: Commons Developers List
> > Subject: Re: [LANG] Algorithm for fuzzy string matching
> >
> > Since nobody had objections against adding this, I'll apply this
> > patch.
> >
> > Benedikt
> >
> >
> > 2014-04-28 17:47 GMT+02:00 Benedikt Ritter :
> >
> > > Hi all,
> > >
> > > we have a nice PR for StringUtils on GitHub:
> > > https://github.com/apache/commons-lang/pull/20
> > >
> > > It adds a new string matching algorithm to StringUtils that calculates a
> > > score for the similarity between two strings. This kind of fuzzy matching
> > > is known from editors like Sublime Text, TextMate or Atom.
> > >
> > > I think this is a very useful feature, but as the contributor points out,
> > > there is no scientific paper or thesis that provides a reference for the
> > > implementation. So this is not _the one_ implementation of a fuzzy string
> > > matching score, the way our implementations of the Levenshtein or
> > > Jaro-Winkler algorithms are.
> > >
> > > So before adding this, I'd like to hear how others feel about this feature.
> > >
> > > Regards,
> > > Benedikt


Re: [LANG] Algorithm for fuzzy string matching

2014-05-02 Thread Benedikt Ritter
Hi Gary,

we had a discussion about this some time ago, where I proposed to create a
new class (let's call it StringMetrics) and move Levenshtein and
Jaro-Winkler to it. We decided not to do this in 3.x, since SU already has
180+ methods, which will have to be split up in the next major release.

Benedikt


2014-05-02 13:00 GMT+02:00 Gary Gregory :

> Do we really want this in SU or should it live in its own class?
>
> Gary
>
> Original message
> From: Benedikt Ritter <brit...@apache.org>
> Date: 05/02/2014 04:15 (GMT-05:00)
> To: Commons Developers List
> Subject: Re: [LANG] Algorithm for fuzzy string matching
>
> Since nobody had objections against adding this, I'll apply this
> patch.
>
> Benedikt
>
>
> 2014-04-28 17:47 GMT+02:00 Benedikt Ritter :
>
> > Hi all,
> >
> > we have a nice PR for StringUtils on GitHub:
> > https://github.com/apache/commons-lang/pull/20
> >
> > It adds a new string matching algorithm to StringUtils that calculates a
> > score for the similarity between two strings. This kind of fuzzy matching
> > is known from editors like Sublime Text, TextMate or Atom.
> >
> > I think this is a very useful feature, but as the contributor points out,
> > there is no scientific paper or thesis that provides a reference for the
> > implementation. So this is not _the one_ implementation of a fuzzy string
> > matching score, the way our implementations of the Levenshtein or
> > Jaro-Winkler algorithms are.
> >
> > So before adding this, I'd like to hear how others feel about this feature.
> >
> > Regards,
> > Benedikt


Re: [LANG] Algorithm for fuzzy string matching

2014-05-02 Thread Gary Gregory
Do we really want this in SU or should it live in its own class?

Gary

Original message
From: Benedikt Ritter
Date: 05/02/2014 04:15 (GMT-05:00)
To: Commons Developers List
Subject: Re: [LANG] Algorithm for fuzzy string matching

Since nobody had objections against adding this, I'll apply this patch.

Benedikt


2014-04-28 17:47 GMT+02:00 Benedikt Ritter :

> Hi all,
>
> we have a nice PR for StringUtils on GitHub:
> https://github.com/apache/commons-lang/pull/20
>
> It adds a new string matching algorithm to StringUtils that calculates a
> score for the similarity between two strings. This kind of fuzzy matching is
> known from editors like Sublime Text, TextMate or Atom.
>
> I think this is a very useful feature, but as the contributor points out,
> there is no scientific paper or thesis that provides a reference for the
> implementation. So this is not _the one_ implementation of a fuzzy string
> matching score, the way our implementations of the Levenshtein or
> Jaro-Winkler algorithms are.
>
> So before adding this, I'd like to hear how others feel about this feature.
>
> Regards,
> Benedikt


Re: [LANG] Algorithm for fuzzy string matching

2014-05-02 Thread Benedikt Ritter
Since nobody had objections against adding this, I'll apply this patch.

Benedikt


2014-04-28 17:47 GMT+02:00 Benedikt Ritter :

> Hi all,
>
> we have a nice PR for StringUtils on GitHub:
> https://github.com/apache/commons-lang/pull/20
>
> It adds a new string matching algorithm to StringUtils that calculates a
> score for the similarity between two strings. This kind of fuzzy matching is
> known from editors like Sublime Text, TextMate or Atom.
>
> I think this is a very useful feature, but as the contributor points out,
> there is no scientific paper or thesis that provides a reference for the
> implementation. So this is not _the one_ implementation of a fuzzy string
> matching score, the way our implementations of the Levenshtein or
> Jaro-Winkler algorithms are.
>
> So before adding this, I'd like to hear how others feel about this feature.
>
> Regards,
> Benedikt
>
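
For readers unfamiliar with the kind of editor-style fuzzy score discussed in
this thread, here is a minimal sketch of one possible scoring scheme. It is an
illustration only and not the code from the pull request: the class name
FuzzyScoreSketch, the method name score, and the scoring rules (one point per
query character found in order, two bonus points when a match directly follows
the previous match, unmatched query characters skipped) are all assumptions
made for this example.

```java
import java.util.Locale;

/** Hypothetical illustration of an editor-style fuzzy score; not the PR's code. */
final class FuzzyScoreSketch {

    private FuzzyScoreSketch() {
    }

    /**
     * Scores how well {@code query} matches {@code term}, ignoring case.
     * Assumed rules: +1 for each query character found in the term in order,
     * +2 extra when that match directly follows the previous match.
     */
    static int score(String term, String query) {
        String t = term.toLowerCase(Locale.ROOT);
        String q = query.toLowerCase(Locale.ROOT);

        int score = 0;
        int searchFrom = 0;      // next index in the term to search from
        int previousMatch = -2;  // index of the last matched term character; -2 means none yet

        for (int i = 0; i < q.length(); i++) {
            int found = t.indexOf(q.charAt(i), searchFrom);
            if (found < 0) {
                continue;        // character not in the rest of the term: skip it
            }
            score++;
            if (found == previousMatch + 1) {
                score += 2;      // bonus for consecutive matches
            }
            previousMatch = found;
            searchFrom = found + 1;
        }
        return score;
    }
}
```

With these assumed rules, score("Workshop", "wo") returns 4 (two matched
characters plus one adjacency bonus) and score("Workshop", "wsp") returns 3.
Unlike Levenshtein or Jaro-Winkler, the weights here are a free design choice,
which is the concern raised above about there being no reference definition.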


