Fuzzy matching... now I really won't be able to sleep tonight. I was pretty obsessive about it for some time. There are a bunch of good papers about using n-gram tables (2-, 3-, or 4-grams) in SQL databases to perform highly optimized comparisons. It takes a little bit of extra setup, but then you can get solid coverage of huge data sets with excellent performance. I love Levenshtein, but n-grams are better at detecting names entered out of order, like "Adam David" instead of "David Adams." LCS and Levenshtein won't flag this as a match, but even a 3-gram comparison will.
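To make the out-of-order point concrete, here is a rough in-memory sketch in Python. It is not the SQL n-gram table setup the papers describe, and the helper names (ngrams, ngram_similarity, levenshtein), the per-word space padding, and the Dice score are all my own assumptions for illustration. It just shows why word order barely matters to a 3-gram comparison while it costs Levenshtein a lot of edits.

```python
def ngrams(text, n=3):
    """Return the set of character n-grams for each word in text.

    Each word is lowercased and padded with spaces (an assumed convention)
    so that word boundaries produce their own n-grams.
    """
    grams = set()
    for word in text.lower().split():
        padded = " " * (n - 1) + word + " "
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    return grams


def ngram_similarity(a, b, n=3):
    """Dice coefficient over the two n-gram sets, from 0.0 to 1.0."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    a, b = "Adam David", "David Adams"
    # Because n-grams are built per word, swapping the words changes nothing.
    print("3-gram similarity:", round(ngram_similarity(a, b), 2))
    # Edit distance sees the swap as many character-level changes.
    print("Levenshtein distance:", levenshtein(a.lower(), b.lower()))
```

In a database you would store the n-grams in their own table keyed to the source record and count shared rows with a join, which is where the performance on big data sets comes from; the scoring idea is the same.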
Let me know if you're interested and I'll dig up papers or references.