RE: string-likeness

2013-06-06 Thread hsv
2013/06/03 21:43 +, Rick James Soundex is the 'right' approach, but it needs improvement. So, find an improvement, then do something like this... Hashing involves some kind of normalizing, and in my case I see no means to it; otherwise I would not have considered something so

Re: string-likeness

2013-06-04 Thread hsv
2013/06/03 18:38 +0200, Hartmut Holzgraefe equality checks have a linear cost of O(min(len1,len2)) and can make use of indexes, too, while Levenshtein cost is almost quadratic O(len1*len2) and can't make any good use of indexes ... even using a C UDF would help only so far with this ki
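
To illustrate the cost comparison quoted above, here is a minimal sketch of the textbook dynamic-programming edit distance in Python. It is not the SQL routine used in the thread; the function name `levenshtein` is chosen here for illustration. The nested loops visit every pair of character positions, which is where the O(len1*len2) cost comes from.

```python
def levenshtein(a, b):
    """Return the Levenshtein edit distance between strings a and b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Every candidate pair of rows needs its own run of these loops, so a join on edit distance cannot be answered from an index the way an equality join can.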

Re: string-likeness

2013-06-03 Thread Michael Dykman
I will second Rick's approach and have implemented something very similar for a client when soundex fell short of expectation. It worked very well. On Mon, Jun 3, 2013 at 5:43 PM, Rick James wrote: > Soundex is the 'right' approach, but it needs improvement. So, find an > improvement, then do

RE: string-likeness

2013-06-03 Thread Rick James
Soundex is the 'right' approach, but it needs improvement. So, find an improvement, then do something like this... Store the Soundex value in a column of its own, INDEX that column, and JOIN on that column using "=". Thus, ... * You have spent the effort to convert to Soundex once, not on every
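
The store-once, index, and join-on-"=" idea described above can be sketched as follows. This is an illustration under stated assumptions, not the poster's code: it uses Python with an in-memory SQLite database instead of MySQL, the table names `a` and `b`, the column name `sx`, and the helpers `soundex` and `match_by_soundex` are hypothetical, and the key is the classic four-character Soundex, which may differ from what MySQL's SOUNDEX() returns. An improved phonetic key could be swapped in for `soundex` without changing the join.

```python
import sqlite3

# Classic Soundex digit groups (an assumption; any normalizing key would do).
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the four-character classic Soundex key for name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return ("".join(out) + "000")[:4]


def match_by_soundex(left_names, right_names):
    """Join two name lists on a precomputed, indexed Soundex column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (name TEXT, sx TEXT)")
    conn.execute("CREATE TABLE b (name TEXT, sx TEXT)")
    conn.execute("CREATE INDEX a_sx ON a (sx)")
    conn.execute("CREATE INDEX b_sx ON b (sx)")
    # The key is computed once per row at insert time, not once per comparison.
    conn.executemany("INSERT INTO a VALUES (?, ?)",
                     [(n, soundex(n)) for n in left_names])
    conn.executemany("INSERT INTO b VALUES (?, ?)",
                     [(n, soundex(n)) for n in right_names])
    rows = conn.execute(
        "SELECT a.name, b.name FROM a JOIN b ON a.sx = b.sx "
        "ORDER BY a.name, b.name").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(match_by_soundex(["Robert", "Smith"], ["Rupert", "Smyth", "Jones"]))
```

Because the join condition is a plain equality on an indexed column, the database can look up matches directly, and a costlier test such as an edit distance would only need to run on the pairs that survive the join.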

Re: string-likeness

2013-06-03 Thread Hartmut Holzgraefe
On 03.06.2013 17:29, h...@tbbs.net wrote: > I wish to join two tables on likeness, not equality, of character strings. > Soundex does not work. I am using the Levenshtein edit distance, written in > SQL, a very costly test, and I am in no position to write it in C and link it > to MySQL--and join

Re: string-likeness

2013-06-03 Thread Johan De Meersman
- Original Message - > From: h...@tbbs.net > > I wish to join two tables on likeness, not equality, of character > strings. Soundex does not work. I am using the Levenshtein edit > distance, written in SQL, a very costly test, and I am in no > position to write it in C and link it to MySQL-