On Fri, Jun 26, 2009 at 10:00 PM, Jean-Christophe Deschamps<j...@q-e-d.org> wrote:
> Hi,
>
> I'm currently finishing a C extension offering, among other functions,
> a "TYPOS" scalar operator which is meant to perform just that, and a
> bit more.
>
> Internally, it applies a Unicode fold() function, a Unicode lower()
> function and then computes the Damerau-Levenshtein distance between the
> strings. It returns the number of insertions, omissions, changes and
> transpositions (of adjacent letters only).
>
> If the reference string is 'abcdef', it will return 1 (one typo) for
>   'abdef'     missing c
>   'abcudef'   u inserted
>   'abzdef'    c changed into z
>   'abdcef'    c & d exchanged
>
> It will also accept a trailing '%' in string2 acting as in LIKE.
>
> You can use it this way:
>
>   select * from t where typos(col, 'levencht%') <= 2;
>
> or this way
>
>   select typos(str1, str2)
>
> The code currently makes use of a couple of Win32 functions, which
> should have Un*x equivalents. It runs at really decent speed even if I
> didn't fight for optimization. It will obviously outperform any SQL
> solution by a large factor.
>
> I can't promise a very clean version tomorrow but just mail if you're
> interested in the C source. You could tailor it to your precise needs
> easily.
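
For anyone following the thread without the extension source at hand, here is a minimal sketch of the distance computation described above. It is not Jean-Christophe's code: it assumes the restricted Damerau-Levenshtein variant (optimal string alignment, adjacent transpositions only), compares raw bytes with no Unicode fold() or lower() step, and ignores the trailing '%' handling. The names osa_distance and min3 are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Smallest of three ints. */
static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/*
 * Hypothetical helper, not the TYPOS implementation from the thread.
 * Restricted Damerau-Levenshtein (optimal string alignment) distance
 * between two byte strings: counts insertions, deletions, substitutions
 * and transpositions of adjacent characters.
 * Assumption: bytes are compared as-is (no Unicode folding or lowercasing).
 * Returns -1 if memory allocation fails.
 */
int osa_distance(const char *s, const char *t) {
    size_t n = strlen(s), m = strlen(t);
    size_t cols = m + 1;
    /* d[i * cols + j] = distance between first i bytes of s and first j of t */
    int *d = malloc((n + 1) * cols * sizeof *d);
    if (d == NULL) return -1;

    for (size_t i = 0; i <= n; i++) d[i * cols] = (int)i;  /* delete all of s */
    for (size_t j = 0; j <= m; j++) d[j] = (int)j;         /* insert all of t */

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
            int best = min3(d[(i - 1) * cols + j] + 1,           /* deletion */
                            d[i * cols + (j - 1)] + 1,           /* insertion */
                            d[(i - 1) * cols + (j - 1)] + cost); /* substitution */
            /* Two adjacent characters exchanged count as a single typo. */
            if (i > 1 && j > 1 &&
                s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1]) {
                int swapped = d[(i - 2) * cols + (j - 2)] + 1;
                if (swapped < best) best = swapped;
            }
            d[i * cols + j] = best;
        }
    }

    int result = d[n * cols + m];
    free(d);
    return result;
}
```

With this sketch, osa_distance("abcdef", "abdcef") counts the exchanged c and d as one typo. The full table costs O(n*m) memory, which is fine for short column values; keeping only three rows would reduce that if it ever mattered.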

I can't help test it in the next few days, but I would be happy to test it and report some results.

Cheers
-- 
Alberto Simões
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users