Karl Koch wrote:
> Is there any other paper that actually shows the benefit of doing
> this particular normalisation with coord_q_d? I am not suggesting
> here that it is not useful, I am just looking for evidence how the
> idea developed.

I think it's a mischaracterization to call coordination a "normalization". In my mind, "normalization" is something applied equally to all documents' scores. The coordination component of a document's score varies from document to document, and so doesn't meet this criterion.

I repeat the citation of the book cited by the paper I cited :) :

>> Salton, G. & McGill, M. Introduction to Modern Information
>> Retrieval. McGraw-Hill, New York, 1983.

In addition to the above book, here are two other books that I've seen cited as describing "coordination-level matching" (a.k.a. "overlap ranking"):

Salton, G. (1968). Automatic information organization and retrieval. New York: McGraw-Hill.

Lancaster, F.W. (1979). Information retrieval systems: Characteristics, testing and evaluation (2nd ed.). New York: Wiley.

I don't know the answer to your larger question: why use a coordination component in a similarity measure when other components (tf*idf) seem to serve the same function?

What you seem to be looking for is a study that directly compares a system using a coordination component in its similarity measure with the *same* system, varying the measure only in that coordination is elided. Unfortunately, I know of no such study.

Good luck,
Steve
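
P.S. To make the normalization-versus-coordination distinction concrete, here is a minimal sketch. It assumes coord_q_d is simply the fraction of distinct query terms that occur in the document; the function names, the per-term weights, and the toy documents are all made up for illustration and stand in for whatever tf*idf weighting a real system would use.

```python
def coord(query_terms, doc_terms):
    """Fraction of distinct query terms that occur in the document (assumed definition)."""
    query = set(query_terms)
    if not query:
        return 0.0
    return len(query & set(doc_terms)) / len(query)


def score(query_terms, doc_terms, weights):
    """Sum of hypothetical per-term weights over matched terms, scaled by coord."""
    matched = set(query_terms) & set(doc_terms)
    base = sum(weights.get(term, 0.0) for term in matched)
    return coord(query_terms, doc_terms) * base


# Toy data: the weights are placeholders for tf*idf-style term weights.
query = ["lucene", "scoring", "coord"]
weights = {"lucene": 2.0, "scoring": 1.5, "coord": 3.0}
doc_a = ["lucene", "scoring", "coord", "similarity"]  # matches all three query terms
doc_b = ["coord", "other"]                            # matches one query term

coord_a = coord(query, doc_a)
coord_b = coord(query, doc_b)
score_a = score(query, doc_a, weights)
score_b = score(query, doc_b, weights)
```

Here coord_a and coord_b differ because the two documents match different numbers of query terms, so the factor changes the relative ranking rather than rescaling every score by the same constant. That is the sense in which I would not call it a normalization.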