A couple of years ago I wrote a little utility that did just that, but it was in VBA (Excel). I had two lists of first name / last name pairs that I had to match, and I used a whole bunch of silly algorithms, like so:
 
1. look for exact matches
2. look for the exact same surname, and try to identify a nickname (like Robert in one list and Bob in the other)
3. exact same surname, and same first name except for one typo (using different flavours of Left() and Right())...
 
and so on. Each of these rules had its own colour coding, so when viewing the results you knew right away which rule each match came from. Once you've run it a few times, it becomes obvious that rules 1 and 2 are 100% accurate, and rules 3-5 are still 85+% accurate. I also remember building an index on the fly, which sped the whole thing up considerably. Of course, with a DB, you wouldn't have to do that yourself. I remember that the next feature was going to be a keyboard-smart heuristic, whereby you not only detect typos but also pick out the more likely ones (like an a for an s).
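
For what it's worth, here is a rough Python sketch of that kind of rule cascade. It is not the original VBA: the nickname table, the function names and the prefix/suffix typo check (one possible reading of the "left and right" trick) are all assumptions made up for illustration. The surname index stands in for the index built on the fly.

```python
from collections import defaultdict

# Hypothetical nickname table; a real one would be much larger.
NICKNAMES = {"bob": "robert", "rob": "robert", "bill": "william", "liz": "elizabeth"}


def canonical(first):
    """Map a nickname to its formal name using the illustrative table above."""
    first = first.strip().lower()
    return NICKNAMES.get(first, first)


def one_typo_apart(a, b):
    """True if a and b differ by exactly one substituted, inserted or deleted
    character. A swap of two adjacent letters counts as two typos here."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    shortest = min(len(a), len(b))
    # Length of the shared prefix (the "left" part).
    i = 0
    while i < shortest and a[i] == b[i]:
        i += 1
    # Length of the shared suffix (the "right" part), not overlapping the prefix.
    j = 0
    while j < shortest - i and a[-1 - j] == b[-1 - j]:
        j += 1
    # Whatever is left in the middle must be a single character at most.
    return max(len(a) - i - j, len(b) - i - j) == 1


def match_names(list_a, list_b):
    """Return ((first, last) from A, (first, last) from B, rule) tuples,
    where rule is the number of the strictest rule that fired."""
    # Index list_b by lowercased surname so each lookup avoids a full scan.
    by_surname = defaultdict(list)
    for first, last in list_b:
        by_surname[last.strip().lower()].append((first, last))

    results = []
    for first, last in list_a:
        f = first.strip().lower()
        best = None
        for cand in by_surname.get(last.strip().lower(), []):
            g = cand[0].strip().lower()
            if f == g:
                rule = 1  # exact match
            elif canonical(f) == canonical(g):
                rule = 2  # same surname, nickname match
            elif one_typo_apart(f, g):
                rule = 3  # same surname, one typo in the first name
            else:
                continue
            if best is None or rule < best[1]:
                best = (cand, rule)
        if best is not None:
            results.append(((first, last), best[0], best[1]))
    return results
```

Calling match_names(list_a, list_b) with two lists of (first, last) tuples gives back each matched pair together with its rule number, which is what the colour coding would be driven from. This sketch only compares people with the same surname, so typos in surnames would need a further rule.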
 
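The keyboard-smart idea was never built, so the following is only a guess at how it might look in Python. It assumes a QWERTY layout and, to keep it short, only treats keys next to each other on the same row as neighbours; the names are hypothetical.

```python
# Letter rows of a QWERTY keyboard (an assumed layout).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_neighbours(rows=ROWS):
    """Map each letter to the letters directly left and right of it on its row."""
    neighbours = {}
    for row in rows:
        for i, ch in enumerate(row):
            adjacent = set()
            if i > 0:
                adjacent.add(row[i - 1])
            if i < len(row) - 1:
                adjacent.add(row[i + 1])
            neighbours[ch] = adjacent
    return neighbours


NEIGHBOURS = build_neighbours()


def is_adjacent_key_typo(a, b):
    """True if a and b have the same length and differ in exactly one position,
    where the two letters sit side by side on the same keyboard row."""
    a, b = a.lower(), b.lower()
    if len(a) != len(b):
        return False
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return len(diffs) == 1 and diffs[0][1] in NEIGHBOURS.get(diffs[0][0], set())
```

A one-typo match that also passes is_adjacent_key_typo (an a for an s, say) could then be ranked as more likely than one that does not.
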
I should build this thing again, using ColdFusion, but between work, learning new stuff, and open source projects, I'm afraid I have no time :-/

 
tof
 
On 6/28/06, Tom MacKean <[EMAIL PROTECTED]> wrote:
Hi all,
 
I have two databases each containing names. I need to compare the two tables and identify anyone who is in both lists. There are 4000 names in one list and 1200 in the other.
 
Speed is not an issue - this is not a public site.
 
The only thing I can think of is to loop through one table and on each record, loop through the other table to find matches.
 
I would love to hear any thoughts about different ways of doing this.
 
Thanks,
 
Tom

--
IMPORTANT: This email is intended for the use of the individual addressee(s) named above and may contain information that is confidential, privileged or unsuitable for overly sensitive persons with low self-esteem, no sense of humor or irrational religious beliefs. If you are not the intended recipient, any dissemination, distribution or copying of this email is not authorized (either explicitly or implicitly) and constitutes an irritating social faux pas. No animals were harmed in the transmission of this email, although the mutt next door is living on borrowed time, let me tell you.




