Should we change the scoring behaviour of FuzzyQuery? The current approach of rewriting Foo~ into a large BooleanQuery means that scores for matching documents are heavily diluted.
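For concreteness, the comparison is along these lines (the class name, index path and field name are just placeholders for whatever your test setup uses):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;

    public class FuzzyScoreCompare {
        public static void main(String[] args) throws Exception {
            IndexSearcher searcher = new IndexSearcher("/path/to/index");

            // Plain term query: the top document containing Foo
            Query exact = QueryParser.parse("Foo", "contents", new StandardAnalyzer());
            Hits exactHits = searcher.search(exact);
            System.out.println("Foo  top score: " + exactHits.score(0));

            // Fuzzy query: rewritten internally to a large BooleanQuery
            // over all similar terms, so the same document scores far lower
            Query fuzzy = QueryParser.parse("Foo~", "contents", new StandardAnalyzer());
            Hits fuzzyHits = searcher.search(fuzzy);
            System.out.println("Foo~ top score: " + fuzzyHits.score(0));

            searcher.close();
        }
    }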
In my tests a search for Foo returns documents containing Foo with a score of 1, while a search for Foo~ returns the same documents with a top score of just 0.01. I know Lucene scoring isn't guaranteed to return values in the range 0 to 1, but I think we should make some attempt to avoid scoring inconsistencies like the one above.

To this end, I have tried changing FuzzyQuery to internally use the class below, which ignores the coordination factor (the rewritten query contains many terms, any one document matches only a few of them, and coord() scales every score down accordingly):

    import org.apache.lucene.search.*;

    class FuzzyBooleanQuery extends BooleanQuery {
        public Similarity getSimilarity(Searcher searcher) {
            // Identical to the default similarity except that the
            // coordination factor is a constant 1, so matching only a
            // few of the expanded terms doesn't crush the score.
            return new DefaultSimilarity() {
                public float coord(int overlap, int maxOverlap) {
                    return 1;
                }
            };
        }
    }

This seems to produce more realistic scores and looks to preserve the same sort order.

Any views?

Cheers
Mark