Re: Smart Names implementation
Jake,

You're getting into an area that people have wrestled with for years, and the true solutions are very, very complex. So your implementation choices will depend on just how far into the forest you want to go.

Do a Google search on "fuzzy name matching". You'll find links to academic white papers discussing the types of algorithms used in some implementations. You'll also find a number of vendors selling specific software solutions to do this. Folks like the phone companies began working on this problem probably 30 years ago, for the directory software used by operators and others.

I just looked into this subject for a client. In their case, we wrote some specific matching rules that fit their situation. We considered adding a nicknames table, but in their case it seemed likely that most names in their database would have been entered as a person's full, formal name. We added Soundex to the rules and came up with our own ranking system for the matches.

A word about Soundex -- unless you keep the similarity factor set to return only very close matches, you'll get some incredibly bad attempts at matching. Soundex would be more likely to catch typos, but would miss a lot of nickname variations.

As I say, fuzzy name matching is a very large-scale software challenge that people have thrown millions of dollars at. You just have to determine what's practical for your needs.

--
Thanks,

Tom

Tom McNeer
MediumCool
http://www.mediumcool.com
1735 Johnson Road NE
Atlanta, GA 30306
404.589.0560

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:272008
Subscription: http://www.houseoffusion.com/groups/CF-Talk/subscribe.cfm
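To make the Soundex trade-off described in this message concrete, here is a minimal Python sketch of the classic American Soundex code. It is only an illustration: it is not the code used for the client mentioned above, and it is not ColdFusion's or any database's built-in function. The function name `soundex` and the sample names are assumptions chosen for the example.

```python
# Letter-to-digit table for classic American Soundex.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code for a name ("" if it has no letters)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    result = first.upper()
    prev = _CODES.get(first, "")   # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue               # h and w do not separate equal codes
        code = _CODES.get(ch, "")  # vowels and y have no code
        if code and code != prev:
            result += code
        prev = code                # a vowel resets prev, so a repeat after it counts
    return (result + "000")[:4]    # pad or truncate to one letter plus three digits


print(soundex("Robert"))  # R163
print(soundex("Roebrt"))  # R163  (the typo still matches)
print(soundex("Rupert"))  # R163  (a different name also matches)
print(soundex("Bob"))     # B100  (the nickname does not match Robert)
```

The sample output shows both sides of the point above: the transposed "Roebrt" gets the same code as "Robert", but so does the unrelated "Rupert", while "Bob" gets a completely different code. That is why a nickname table and some ranking of candidate matches would still be needed alongside it.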
RE: Smart Names implementation
-Original Message-
From: Tom McNeer [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 08, 2007 10:09 AM
To: CF-Talk
Subject: Re: Smart Names implementation

> Jake,
>
> As I say, fuzzy name matching is a very large-scale software challenge that people have thrown millions of dollars at. You just have to determine what's practical for your needs.

For what it's worth, we use Trillium. It runs on the mainframe and costs in the range of $250,000.

And no, for all of that, it's not even close to perfect. ;^) (But it also does addresses and company names...)

Jim Davis

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:272098
Re: Smart Names implementation
I guess I was thinking of something a bit more relational (2 tables: [id, name] and [id, sameAsID]), but your example would do the trick :). However, I could see this becoming an additional bottleneck in our system. We often have tables with tens or hundreds of thousands of records, and a number of checks are already run on each record to determine whether a duplicate has been entered. We have seen this process take the bulk of a day with a large enough database.

General overview: To initialize a project, a list of clients (people) is dumped into our database (usually from a spreadsheet). As the project goes on, we may add more entries to the database, delete entries (which actually just sets an inactive flag), and modify entries to correct addresses, phone numbers, etc. We may receive these updates from a number of places (phone, mail, etc.), so there is a good chance that duplicates will be entered. While the system tries to eliminate duplicates on entry, it isn't able to catch all of them.

Now that I think about it, maybe I'm looking at the wrong approach... There's always the potential for typos, and I would like to catch those too (Robert vs Roebrt) -- obviously this would be beyond the capabilities of a smart name search (unless I started entering potential typos into the smart names database - ick).

Has anyone else dealt with this type of scenario?

Thanks!
Jake

> I've done something similar with part numbers. The hard part isn't doing lookups -- one table, three columns (id, name1, name2), two subselects -- the hard part is populating that table.
>
> I'm happy to show you how to do the lookup, but do you have a plan for populating the table?
>
> --Ben Doom
>
> Jake Pilgrim wrote:
>> Has anyone implemented smart names (i.e. a search for Bob would match Bobby or Robert) functionality in ColdFusion? Anyone have any examples I can view?
>>
>> Thanks!
>> Jake Pilgrim

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:271906
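For the typo case raised in this message (Robert vs Roebrt), one common way to score "how far apart" two spellings are is an edit distance that counts a swap of two adjacent letters as a single edit. The sketch below is a hedged Python illustration of that idea (the optimal string alignment variant of Damerau-Levenshtein distance). Nobody in the thread proposed it; the names `osa_distance` and `is_probable_typo` and the threshold of 1 are assumptions made for the example.

```python
def osa_distance(a: str, b: str) -> int:
    """Edits needed to turn a into b: insert, delete, substitute,
    or swap two adjacent characters (optimal string alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j                      # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # adjacent transposition, e.g. "be" <-> "eb"
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def is_probable_typo(a: str, b: str, max_distance: int = 1) -> bool:
    """Hypothetical helper: treat names within max_distance edits as likely typos."""
    return osa_distance(a.lower(), b.lower()) <= max_distance


print(osa_distance("robert", "roebrt"))      # 1
print(is_probable_typo("Robert", "Roebrt"))  # True
print(is_probable_typo("Bob", "Robert"))     # False
```

Note that this measure, like Soundex, says nothing about nicknames: "Bob" and "Robert" are far apart by spelling. Each comparison also costs time proportional to the product of the two name lengths, so with tens or hundreds of thousands of records it would probably have to run only on a pre-filtered set of candidates rather than on every pair.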
Re: Smart Names implementation
On 3/7/07, Jake Pilgrim wrote:
> ... Now that I think about it, maybe I'm looking at the wrong approach... There's always the potential for typos, and I would like to catch those too (Robert vs Roebrt) -- obviously this would be beyond the capabilities of a smart name search (unless I started entering potential typos into the smart names database - ick).

Soundex is the direction I went, and I like it. It'll get the typos too, a good bit of the time.

Yup, sounds like what you're looking for! Hahahahaha. I rock.

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:271955
Re: Smart Names implementation
I've done something similar with part numbers. The hard part isn't doing lookups -- one table, three columns (id, name1, name2), two subselects -- the hard part is populating that table.

I'm happy to show you how to do the lookup, but do you have a plan for populating the table?

--Ben Doom

Jake Pilgrim wrote:
> Has anyone implemented smart names (i.e. a search for Bob would match Bobby or Robert) functionality in ColdFusion? Anyone have any examples I can view?
>
> Thanks!
> Jake Pilgrim

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:271746
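The lookup described in this message (one table with columns id, name1, name2, queried with two subselects) was never posted to the thread, so the following is only a guess at its shape, sketched in Python with an in-memory SQLite database. The table names `people` and `name_alias`, the `first_name`/`last_name` columns, the sample rows, and the function `smart_name_search` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    -- One row per pair of names that should be treated as equivalent.
    CREATE TABLE name_alias (id INTEGER PRIMARY KEY, name1 TEXT, name2 TEXT);

    INSERT INTO people (first_name, last_name) VALUES
        ('Robert', 'Smith'), ('Bobby', 'Jones'), ('Bob', 'Lee'), ('Alice', 'Ng');
    INSERT INTO name_alias (name1, name2) VALUES
        ('Bob', 'Robert'), ('Bob', 'Bobby');
""")

# Match the search term itself, plus any alias found by looking the term up
# in either column of name_alias (the "two subselects").
LOOKUP_SQL = """
    SELECT first_name, last_name
    FROM people
    WHERE lower(first_name) = lower(:q)
       OR lower(first_name) IN (SELECT lower(name2) FROM name_alias
                                WHERE lower(name1) = lower(:q))
       OR lower(first_name) IN (SELECT lower(name1) FROM name_alias
                                WHERE lower(name2) = lower(:q))
    ORDER BY first_name, last_name
"""


def smart_name_search(term: str):
    """Return (first_name, last_name) rows matching the term or one of its aliases."""
    return conn.execute(LOOKUP_SQL, {"q": term}).fetchall()


print(smart_name_search("Bob"))
# [('Bob', 'Lee'), ('Bobby', 'Jones'), ('Robert', 'Smith')]
print(smart_name_search("Robert"))
# [('Bob', 'Lee'), ('Robert', 'Smith')]
```

With this pair-based layout a search for "Robert" finds "Bob" but not "Bobby", because the lookup follows only one hop. Getting full groups would mean storing every pair in the group or switching to a group id per name, which is part of why populating the table is the hard part.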