If someone can demonstrate that an alternate formulation produces superior results for most applications, then we should of course change the default implementation. But just noting that there's a factor which is equal to idf^2 in each element of the sum does not do this.
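
For context, here is a minimal sketch of where an idf^2 factor like the one mentioned above can come from. It assumes a tf-idf scheme in which both the query-term weight and the document-term weight carry idf. The function names, the smoothed idf form, and the sample numbers are all illustrative assumptions, not taken from this thread or from any particular implementation.

```python
import math


def idf(doc_freq, num_docs):
    # One common smoothed idf form (an assumption for illustration only).
    return 1.0 + math.log(num_docs / (doc_freq + 1.0))


def score(query_terms, doc_tf, doc_freqs, num_docs):
    """Dot product of a query vector and a document vector.

    Hypothetical helper: each query weight is idf and each document
    weight is tf * idf, so every element of the sum is tf * idf^2.
    """
    total = 0.0
    for term in query_terms:
        w = idf(doc_freqs[term], num_docs)
        query_weight = w                          # idf on the query side
        doc_weight = doc_tf.get(term, 0.0) * w    # tf * idf on the document side
        total += query_weight * doc_weight        # tf * idf^2 per term
    return total
```

Dropping idf from either side would leave a single idf factor per term. Whether that ranks better is an empirical question, which is the point made above.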

I don't think there is a magic formula, but I found these papers interesting.
http://www.emeraldinsight.com/rpsv/cgi-bin/emft.pl


Title: Understanding inverse document frequency: on theoretical arguments for IDF
Author: Stephen Robertson
Pages: 503-520

Title: IDF term weighting and IR research lessons
Author: Karen Spärck Jones
Pages: 521-523


