Hi Doug,

> Perhaps.  Are folks really better at spelling the
> beginning of words?

Yes they are. There were some comprehensive empirical
studies on this topic. Winkler modification on Jaro
string distance is based on this assumption (boosting
similarity if first n, I think 4, chars match).
Jaro-Winkler is well documented and some folks thinks
that it is much more efficient and precise than plain
Edit distance (of course for normal language, not
numbers or so).
I will try to dig-out some references from my disk on
this topic, if you are interested.

On another note,
I would even suggest using Jaro-Winkler distance as
default for fuzzy query. (one could configure max
prefix required => prefix query to reduce number of
distance calculations). This could speed-up fuzzy
search dramatically.

Hope this was helpful,
Eks




  



        
        
                
___________________________________________________________ALL-NEW Yahoo! Messenger - 
all new features - even more fun!  http://uk.messenger.yahoo.com

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to