On Fri, Jun 26, 2009 at 10:00 PM, Jean-Christophe
Deschamps <j...@q-e-d.org> wrote:
> Hi,
>
> I'm currently finishing a C extension offering, among other functions,
> a "TYPOS" scalar operator which is meant to perform just that, and a
> bit more.
>
> Internally, it applies a Unicode fold() function, a Unicode lower()
> function and then computes the Damerau-Levenshtein distance between the
> strings.  It returns the number of insertions, omissions, changes and
> transpositions (of adjacent letters only).
>
> If the reference string is 'abcdef', it will return 1 (one typo) for
> 'abdef'     missing c
> 'abcudef'   u inserted
> 'abzdef'    c changed into z
> 'abdcef'    c & d exchanged
>
> It will also accept a trailing '%' in string2 acting as in LIKE.
>
> You can use it this way:
>
>   select * from t where typos(col, 'levencht%') <= 2;
>
> or this way
>
>   select typos(str1, str2)
>
> The code currently makes use of a couple of Win32 functions, which
> should have Un*x equivalents.  It runs at really decent speed even if I
> didn't fight for optimization.  It will obviously outperform any SQL
> solution by a large factor.
>
> I can't promise a very clean version tomorrow but just mail if you're
> interested in the C source. You could tailor it to your precise needs
> easily.
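
For readers following the thread, the distance described in the quoted
message can be sketched as below. This is not the author's source, which
was not posted; it is a minimal illustration of the restricted
Damerau-Levenshtein distance (adjacent transpositions only), assuming
plain byte strings. The Unicode fold()/lower() steps and the trailing '%'
handling are left out, and the name typos_sketch is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Smallest of three values. */
static size_t min3(size_t a, size_t b, size_t c) {
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/*
 * Hypothetical sketch: restricted Damerau-Levenshtein distance between
 * two NUL-terminated byte strings. Counts insertions, omissions, changes
 * and transpositions of adjacent characters, each as one typo.
 * Returns (size_t)-1 if memory allocation fails.
 */
size_t typos_sketch(const char *s, const char *t) {
    size_t n = strlen(s), m = strlen(t);
    size_t cols = m + 1;
    /* d[i * cols + j] = distance between first i bytes of s and first j of t */
    size_t *d = malloc((n + 1) * cols * sizeof *d);
    if (d == NULL) return (size_t)-1;

    for (size_t i = 0; i <= n; i++) d[i * cols] = i;  /* delete all of s[0..i) */
    for (size_t j = 0; j <= m; j++) d[j] = j;         /* insert all of t[0..j) */

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            size_t cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
            size_t best = min3(d[(i - 1) * cols + j] + 1,            /* omission */
                               d[i * cols + (j - 1)] + 1,            /* insertion */
                               d[(i - 1) * cols + (j - 1)] + cost);  /* change */
            /* two adjacent characters exchanged */
            if (i > 1 && j > 1 && s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1]) {
                size_t swapped = d[(i - 2) * cols + (j - 2)] + 1;
                if (swapped < best) best = swapped;
            }
            d[i * cols + j] = best;
        }
    }

    size_t result = d[n * cols + m];
    free(d);
    return result;
}
```

With the reference string 'abcdef', typos_sketch counts one typo for
'abdef', 'abcudef' and 'abdcef'. The full (n+1) x (m+1) table is kept for
readability; a real extension would likely keep only three rows.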

I can't help or test it in the next few days, but I would be happy to
test it after that and report some results.
Cheers

-- 
Alberto Simões
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
