again, this had better been sent to the list as well  }:-(
--- Begin Message ---
Am Mittwoch, den 26.04.2006, 23:28 -0700 schrieb Kon Lovett:

> I re-packaged the leventhshein egg last year, not re-writing the  
> existing impl, just extending w/ new procedures.

Uh, looks interesting.

May I suggest to add one more: (levenshtein< s t limit) => boolean.
This is much cheaper than (< (levenshtein-distance s t) limit) , which
in turn is quite useful when ignoring minor typos.

How does it work with UTF-8?  Is (levenshtein-distance "ä" "n") 1?

> Interested in what  
> "obvious optimisation" was missing from the original.

The original, or what I recall about it, doesn't do common
prefix/postfix stripping.  The Levenshtein algorithm is simply applied
to the whole string.  This is unnecessary costly.

> Is the "deadly incompatible with user level threading" due to  
> possible context switch during a match? If so, then this is not  
> currently a problem.

The old code used to feed possibly huge strings to a C procedure, which
could then work almost forever, preventing context switches.  I can't
tell from a quick glance, but did I get you right, that's gone?  Great.

The other problem I recall, but I don't even recall whether this was in
the egg or some other C implementation, was that the whole matrix was
allocated, while only space for the smallest line (the shorter one
columns or rows) is required.

Best regards

/Jörg

--- End Message ---
_______________________________________________
Chicken-users mailing list
Chicken-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/chicken-users

Reply via email to