Re: Smart Names implementation

2007-03-08 Thread Tom McNeer
Jake,

You're getting into an area that has been wrestled with for years, and the
true solutions are very, very complex. So your choices on implementation
will depend on just how far into the forest you want to go.

Do a Google search on fuzzy name matching. You'll find links to academic
white papers discussing types of algorithms that are used in some
implementations. You'll also find a number of vendors selling specific
software solutions to do this.

Folks like the phone companies began working on this problem probably 30
years ago, for the directory software used by operators and others.

I just looked into this subject for a client. In their case, we wrote some
specific matching rules that fit their situation. We considered adding a
nicknames table; but in their case, it seemed likely that most names in
their database would have been entered as a person's full, formal name. We
added soundex to the rules, and came up with our own ranking system for the
matches.
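
As a rough illustration of a rules-plus-ranking setup like the one described
above, here is a minimal Python sketch. It is not the client's actual code: the
weights (100/80/50), the NICKNAMES table, and the function names formal, score,
and rank are all made up for the example, and the Soundex helper is a plain
textbook version of the algorithm.

```python
# Letter-to-digit table for standard American Soundex.
CODES = {ch: str(d)
         for d, group in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1)
         for ch in group}


def soundex(name):
    """Return the 4-character Soundex code, or "" if name has no letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = []
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # vowels reset prev, so repeated codes across a vowel count
    return (letters[0] + "".join(out) + "000")[:4]


# Hypothetical nickname table: nickname -> formal name (assumed data).
NICKNAMES = {"bob": "robert", "bobby": "robert", "rob": "robert"}


def formal(name):
    """Map a name to its formal form if the nickname table knows it."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def score(query, candidate):
    """Return (points, rule) for the strongest rule that fires, else (0, None)."""
    if query.strip().lower() == candidate.strip().lower():
        return 100, "exact"
    if formal(query) == formal(candidate):
        return 80, "nickname"
    if soundex(query) and soundex(query) == soundex(candidate):
        return 50, "soundex"
    return 0, None


def rank(query, candidates):
    """Return (points, rule, candidate) tuples for all hits, best first."""
    scored = [(score(query, c), c) for c in candidates]
    hits = [(pts, rule, c) for (pts, rule), c in scored if pts > 0]
    return sorted(hits, key=lambda hit: -hit[0])


print(rank("Robert", ["Roebrt", "Bob", "Alice", "robert"]))
```

The point of ordering the rules is that a weak phonetic hit never outranks an
exact or nickname hit, and dropping the nickname rule (as in the case above)
only means removing one branch.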

A word about SoundEx -- unless you keep the similarity factor set to return
only very close matches, you'll get some incredibly bad attempts at
matching.

SoundEx would be more likely to catch typos, but would miss a lot of
nickname variations.
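
To make the two points above concrete, here is a small Python sketch of
textbook Soundex plus a similarity score. The similarity function is a
simplified position-by-position count (0 to 4), assumed here as a stand-in for
whatever similarity setting a given database or tool exposes; it is not any
particular product's algorithm.

```python
# Letter-to-digit table for standard American Soundex.
CODES = {ch: str(d)
         for d, group in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1)
         for ch in group}


def soundex(name):
    """Return the 4-character Soundex code, or "" if name has no letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = []
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code
    return (letters[0] + "".join(out) + "000")[:4]


def similarity(a, b):
    """Count the positions (0-4) where the two Soundex codes agree."""
    return sum(x == y for x, y in zip(soundex(a), soundex(b)))


for other in ["Roebrt", "Rupert", "Richard", "Bob"]:
    print(other, soundex(other), similarity("Robert", other))
```

Against "Robert" (R163), the typo "Roebrt" scores 4, but so does "Rupert",
which is a different name with the same code. "Richard" (R263) scores 3, so any
threshold looser than a full match starts pairing Robert with Richard. The
nickname "Bob" (B100) scores only 1 and is missed entirely.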

As I say, fuzzy name matching is a very large-scale software challenge that
people have thrown millions of dollars at. You just have to determine what's
practical for your needs.


-- 
Thanks,

Tom

Tom McNeer
MediumCool
http://www.mediumcool.com
1735 Johnson Road NE
Atlanta, GA 30306
404.589.0560


~|
Create Web Applications With ColdFusion MX7  Flex 2. 
Build powerful, scalable RIAs. Free Trial
http://www.adobe.com/products/coldfusion/flex2/

Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:272008
Subscription: http://www.houseoffusion.com/groups/CF-Talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4


RE: Smart Names implementation

2007-03-08 Thread Jim Davis
 -----Original Message-----
 From: Tom McNeer [mailto:[EMAIL PROTECTED]]
 Sent: Thursday, March 08, 2007 10:09 AM
 To: CF-Talk
 Subject: Re: Smart Names implementation
 
 Jake,
 
 As I say, fuzzy name matching is a very large-scale software challenge
 that people have thrown millions of dollars at. You just have to
 determine what's practical for your needs.

For what it's worth, we use Trillium. It runs on the mainframe and costs
in the range of $250,000.

And no - for all of that - it's not even close to perfect.  ;^)

(But it also does addresses and company names...)

Jim Davis


Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:272098


Re: Smart Names implementation

2007-03-07 Thread Jake Pilgrim
I guess I was thinking of something a bit more relational (2 tables: [id,name] 
[id,sameAsID]), but your example would do the trick :). 
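
A minimal sketch of the two-table layout mentioned above, using Python's
built-in sqlite3 module so it runs anywhere. The table names (names, same_as),
the sample rows, and the equivalents helper are assumptions for illustration;
the sketch also assumes every sameAsID points directly at one canonical row,
with no chains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names   (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE same_as (id INTEGER, sameAsID INTEGER);
    INSERT INTO names   VALUES (1, 'Robert'), (2, 'Bob'), (3, 'Bobby'), (4, 'Alice');
    INSERT INTO same_as VALUES (2, 1), (3, 1);  -- Bob and Bobby are "same as" Robert
""")


def equivalents(name):
    """Return every stored name that shares a canonical row with `name`."""
    rows = conn.execute("""
        WITH canon AS (
            -- A name with no same_as row is its own canonical entry.
            SELECT n.name, COALESCE(s.sameAsID, n.id) AS root
            FROM names n LEFT JOIN same_as s ON s.id = n.id
        )
        SELECT name FROM canon
        WHERE root = (SELECT root FROM canon WHERE name = ? COLLATE NOCASE)
        ORDER BY name
    """, (name,)).fetchall()
    return [row[0] for row in rows]


print(equivalents("Bob"))
print(equivalents("alice"))
```

Searching for "Bob" returns Bob, Bobby, and Robert, because all three resolve
to the same canonical id. A name that is not in the table returns an empty
list.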

However, I could see this becoming an additional bottleneck in our system. We 
often have tables with tens or hundreds of thousands of records, and a number 
of checks are already run on each record to determine whether a duplicate 
entry has been entered. We have seen this process take the bulk of a day with 
a large enough database. 

General overview: To initialize a project, a list of clients (people) is dumped 
into our database (usually from a spreadsheet). As the project goes on, we may 
add more entries to the database, delete entries (which is actually just an 
inactive flag), and modify entries to correct addresses, phone numbers, etc. 
There are a number of places that we may receive these updates from (phone, 
mail, etc) so there is a good chance that duplicates will be entered. While the 
system tries to eliminate duplicates on entry, it isn't able to catch all 
duplicates. 

Now that I think about it, maybe I'm looking at the wrong approach... There's 
always the potential for typos, and I would like to catch those too (Robert 
vs Roebrt). Obviously this would be beyond the capabilities of a smart name 
search (unless I started entering potential typos into the smart names 
database - ick). 
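
One way to catch a transposition like Robert vs Roebrt without listing typos by
hand is an edit distance that counts a swap of two adjacent letters as a single
edit (the "optimal string alignment" variant). This is a minimal Python sketch,
not something proposed in the thread; the is_probable_typo helper and its
threshold of 1 are assumptions that would need tuning against real data.

```python
def osa_distance(a, b):
    """Edit distance where insert, delete, substitute, and adjacent swap cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # Adjacent transposition, e.g. "be" typed as "eb".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def is_probable_typo(a, b, max_edits=1):
    """True if the names differ, ignoring case, by at most max_edits edits."""
    return 0 < osa_distance(a.lower(), b.lower()) <= max_edits


print(osa_distance("Robert", "Roebrt"))
print(is_probable_typo("Robert", "Roebrt"))
print(is_probable_typo("Robert", "Bob"))
```

Robert and Roebrt are one swap apart, so they are flagged. Robert and Bob are
not, which is the reverse of what a nickname table handles, so the two checks
complement each other. The comparison is quadratic in name length, which is
cheap per pair but adds up if every record is compared with every other record.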

Has anyone else dealt with this type of scenario?

Thanks!
Jake

 I've done something similar with part numbers.  The hard part isn't 
 doing lookups -- one table, three columns (id, name1, name2), two 
 subselects -- the hard part is populating that table.
 
 I'm happy to show you how to do the lookup, but do you have a plan for 
 populating the table?
 
 --Ben Doom
 
 Jake Pilgrim wrote:
  Has anyone implemented smart names (e.g. a search for Bob would 
  match Bobby or Robert) functionality in ColdFusion? Anyone have 
  any examples I can view?
  
  Thanks!
  Jake Pilgrim
  
  


Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:271906


Re: Smart Names implementation

2007-03-07 Thread Dinner
On 3/7/07, Jake Pilgrim wrote:

 ...
 Now that I think about it, maybe I'm looking at the wrong approach...
 There's always the potential for typos, and I would like to catch those
 too (Robert vs Roebrt). Obviously this would be beyond the
 capabilities of a smart name search (unless I started entering potential
 typos into the smart names database - ick).


Soundex is the direction I went, and I like it.

It'll get the typos too, a good bit of the time.

Yup, sounds like what you're looking for!

Hahahahaha.  I rock.



Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:271955


Smart Names implementation

2007-03-06 Thread Jake Pilgrim
Has anyone implemented smart names (e.g. a search for Bob would match 
Bobby or Robert) functionality in ColdFusion? Anyone have any examples I 
can view?

Thanks!
Jake Pilgrim


Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:271702


Re: Smart Names implementation

2007-03-06 Thread Ben Doom
I've done something similar with part numbers.  The hard part isn't 
doing lookups -- one table, three columns (id, name1, name2), two 
subselects -- the hard part is populating that table.

I'm happy to show you how to do the lookup, but do you have a plan for 
populating the table?
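
A possible shape for the lookup described above, sketched in Python with the
built-in sqlite3 module: one pair table with three columns (id, name1, name2)
and two subselects, one per direction. The table names name_pairs and clients,
the first_name column, and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE name_pairs (id INTEGER PRIMARY KEY, name1 TEXT, name2 TEXT);
    CREATE TABLE clients    (id INTEGER PRIMARY KEY, first_name TEXT);
    INSERT INTO name_pairs (name1, name2) VALUES ('Bob', 'Robert'), ('Bob', 'Bobby');
    INSERT INTO clients (first_name) VALUES ('Robert'), ('Bobby'), ('Alice'), ('Bob');
""")


def search(first_name):
    """Return client first names equal to, or paired with, first_name."""
    rows = conn.execute("""
        SELECT first_name FROM clients
        WHERE first_name = :q
           -- Subselect 1: the query is on the left side of a pair.
           OR first_name IN (SELECT name2 FROM name_pairs WHERE name1 = :q)
           -- Subselect 2: the query is on the right side of a pair.
           OR first_name IN (SELECT name1 FROM name_pairs WHERE name2 = :q)
        ORDER BY id
    """, {"q": first_name}).fetchall()
    return [row[0] for row in rows]


print(search("Bob"))
print(search("Robert"))
```

Searching for "Bob" finds Robert, Bobby, and Bob. Searching for "Robert" finds
Robert and Bob but not Bobby, because the pairs are not followed transitively.
Every pair that should match has to be in the table, which is the population
problem raised above.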

--Ben Doom

Jake Pilgrim wrote:
 Has anyone implemented smart names (e.g. a search for Bob would match 
 Bobby or Robert) functionality in ColdFusion? Anyone have any examples I 
 can view?
 
 Thanks!
 Jake Pilgrim
 
 


Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:271746