I have a dataset that I've put together from a number of client files. To this
point I've been able to easily build a set of ColdFusion tools for using the
data, but there is a de-duping process that I need to do that I just don't know
how to approach.
The data has a series of first and last
I did some poking around one day for stuff like this and came across
an algorithm called Soundex that helps you know if two names are the
same, even though they might have slightly different spelling. I just
did a search, and found that Ben Forta wrote a UDF for doing this.
Not sure if it will
an algorithm called Soundex
Yeah, but Soundex is not a panacea either. All that matters with Soundex
is the first syllable; after that, just about anything will match.
It might detect Martine and Matinez as being the same, but Martin,
Martinovitch and Martinelli as well.
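For anyone who wants to check which names collapse together, here is a minimal Python sketch of the standard American Soundex rules (first letter kept, consonants mapped to digits, vowels dropped, result padded or cut to four characters). It is an illustration only, not Ben Forta's UDF; the names `soundex` and `CODES` are made up for the example, and other Soundex variants may produce different codes.

```python
# Digit table for standard American Soundex (names here are hypothetical).
CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return the 4-character Soundex code for a name, or "" if it has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                 # the first letter is kept as-is
    prev = CODES.get(letters[0], "")    # its digit still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        code = CODES.get(ch, "")        # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code                     # a vowel resets prev, allowing a repeat
    return (result + "000")[:4]         # pad or truncate to four characters


for surname in ("Martin", "Martinovitch", "Martinelli", "Matinez"):
    print(surname, soundex(surname))
```

With these rules Martin, Martinovitch and Martinelli all come out as M635, while Matinez (missing the r) comes out as M352, so whether a dropped consonant still matches depends on the variant used.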
So anyway, some human
-Original Message-
From: Matthew Reinbold
To: CF-Talk
Sent: Sun Jan 14 16:28:16 2007
Subject: Detecting (Almost) Matches for DeDuping?
Thanks for all the quick responses.
SoundEx is interesting, but it only finds names that sound the same, like
Johnson and Jonson. However, if a misspelling causes the two names to be
phonetically different, like Johnson and Jihnson, I don't believe it will find
that match.
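One common way to catch typos regardless of how they sound is an edit-distance comparison. The sketch below is plain Levenshtein distance in Python, swapped in here as an illustration and not something proposed in this thread; the function name `edit_distance` is hypothetical, and any cutoff for "close enough" would be an assumption to tune against real data.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    a, b = a.lower(), b.lower()          # compare case-insensitively
    prev = list(range(len(b) + 1))       # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                       # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


print(edit_distance("Johnson", "Jihnson"))
print(edit_distance("Martin", "Martinelli"))
```

A single-letter typo such as Johnson/Jihnson scores 1, while Martin/Martinelli scores 4, so a small threshold would flag the first pair as a likely duplicate and leave the second for a human to review.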
I agree, if there's
To: CF-Talk
Sent: Sun Jan 14 17:31:51 2007
Subject: Re: Detecting (Almost) Matches for DeDuping?