Another thought on fuzzy scoring: shouldn't all these queries that automatically expand terms favour common words over rare ones? The default scoring behaviour at the moment favours rare words. As a user, aren't I more likely to be looking for the most common expansions?
If I'm not sure how to spell a word I might search for:

  accomodation~ or accom*

The fuzzy scoring algorithms will currently favour all of the misspellings of "accommodation" in the ranking of results because they are rarer. Ideally, within the expansions of a term the score contribution should be based on df (as opposed to the usual idf), BUT within the overall query the usual idf scheme should still apply.

To clarify: if I search for

  the cheapest accomodation~ in london

I want to see the most common spellings of "accommodation" before all other variants of this word, BUT I then want these variants scored against the OTHER words ("in", "the" etc.) on the usual basis of rarity.

This suggests one sort order nested within another, different sort order, which does not look easy to do. Any bright ideas?

Cheers
Mark
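
A minimal sketch of the two-level scoring described above, not a working implementation in any search library. It assumes we already know the document frequency of each expanded variant; the helper names (`idf`, `expansion_weights`), the sample frequencies and the log-based idf formula are all illustrative assumptions. The whole expansion is treated as one pseudo-term whose idf is computed from the summed df (an approximation, since a document containing two variants is counted twice), and inside the expansion each variant gets a share of that weight proportional to its df.

```python
import math


def idf(doc_freq, num_docs):
    # A common log-based idf (assumed here): rarer terms get a larger weight.
    return 1.0 + math.log(num_docs / (doc_freq + 1.0))


def expansion_weights(variant_dfs, num_docs):
    """Weight variants of one expanded term: df-based inside, idf-based outside."""
    # Outer level: the expansion competes with other query words as a single
    # pseudo-term, using the usual idf on its (approximate) combined df.
    group_df = min(sum(variant_dfs.values()), num_docs)
    group_idf = idf(group_df, num_docs)
    # Inner level: commoner variants take a larger share of the group weight.
    max_df = max(variant_dfs.values())
    return {term: group_idf * (df / max_df) for term, df in variant_dfs.items()}


# Hypothetical document frequencies for the expansions of accomodation~
variants = {"accommodation": 900, "accomodation": 40, "acommodation": 5}
weights = expansion_weights(variants, num_docs=100_000)
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked)
```

With these made-up numbers the correct spelling ranks first within the expansion, while the expansion as a whole still carries an idf-style weight relative to words such as "the" and "in".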