If someone can demonstrate that an alternate formulation produces superior results for most applications, then we should of course change the default implementation. But just noting that there's a factor which is equal to idf^2 in each element of the sum does not do this.
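For anyone following along, here is a rough sketch of where that idf^2 factor comes from. It assumes a plain vector-space dot product in which both the query vector and the document vector are tf*idf weighted. The function names and the particular idf formula are illustrative assumptions, not the actual default implementation, and normalization and boosts are left out.

```python
import math

def idf(num_docs, doc_freq):
    # One common smoothed idf form, chosen here only for illustration.
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def score(query_tf, doc_tf, doc_freq, num_docs):
    """Dot product of a tf*idf query vector and a tf*idf document vector."""
    total = 0.0
    for term, q_tf in query_tf.items():
        if term not in doc_tf:
            continue  # term contributes nothing if absent from the document
        w = idf(num_docs, doc_freq[term])
        query_weight = q_tf * w        # idf enters once via the query vector
        doc_weight = doc_tf[term] * w  # and once via the document vector
        total += query_weight * doc_weight  # so each summand carries idf^2
    return total
```

Each element of the sum is q_tf * d_tf * idf^2 simply because idf weights both vectors. That observation alone says nothing about whether some other weighting would rank documents better.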
I don't think there is a magic formula, but I found these papers interesting.
http://www.emeraldinsight.com/rpsv/cgi-bin/emft.pl
Title: Understanding inverse document frequency: on theoretical arguments for IDF Author: Stephen Robertson Pages: 503-520
Title: IDF term weighting and IR research lessons Author: Karen Spärck Jones Pages: 521-523