> However ... I still think that if you really want
> a length norm that takes into account the average
> length of the docs, you want one that rewards docs
> for being near the average ...
... like SweetSpotSimilarity (SSS)
> it doesn't seem to make a lot of sense to me to say
> that a doc whose length is N% longer than the
> average length is significantly worse than docs whose
> length is N% shorter than the average length.
I don't understand why a doc should be punished
just for having a length different from the average length
(i.e. no matter whether it is longer or shorter).
The (evolving) way I understand it:
(a) Very long docs are likely to contain everything,
so let's punish them to compensate for this;
(b) This is what the original doc-length-norm
actually does;
(c) But then very short docs might be
rewarded too much;
(d) Now we might get stupid (or erroneous)
few-word docs as top results;
(e) To solve this, pivoted doc-length-norm punishes docs
that are too long (longer than the average) but only slightly
rewards docs that are shorter than the average.
It makes sense to me (IR'ishly if I may say so).
The SSS way does not make sense to me that way.
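
To make the comparison concrete, here is a rough sketch of the three
length norms as I understand them. It is only an illustration, not
Lucene code: the function names, the slope of 0.25, and the sweet-spot
range of 80..120 terms are values I made up for the example. The
pivoted form is the usual 1 / ((1 - slope) + slope * len / avg), and
the sweet-spot form follows the formula given in the
SweetSpotSimilarity javadoc as far as I recall it.

```python
import math


def classic_norm(length):
    """Original doc-length-norm: 1/sqrt(number of terms)."""
    return 1.0 / math.sqrt(length)


def pivoted_norm(length, avg_length, slope=0.25):
    """Pivoted norm: equals 1.0 at the average length.

    Long docs are pushed towards 0, while short docs can never get
    more than 1/(1 - slope), so their reward is bounded.
    The slope value here is an assumption chosen for illustration.
    """
    return 1.0 / ((1.0 - slope) + slope * (length / avg_length))


def sweet_spot_norm(length, lo, hi, steepness=0.5):
    """Sweet-spot style norm: 1.0 inside [lo, hi], decaying
    symmetrically on both sides of that plateau.
    lo and hi are hypothetical plateau bounds.
    """
    distance = abs(length - lo) + abs(length - hi) - (hi - lo)
    return 1.0 / math.sqrt(steepness * distance + 1.0)


if __name__ == "__main__":
    avg = 100
    for n in (5, 25, 60, 100, 140, 400, 1000):
        print(
            n,
            round(classic_norm(n) / classic_norm(avg), 3),  # relative to avg
            round(pivoted_norm(n, avg), 3),
            round(sweet_spot_norm(n, 80, 120), 3),
        )
```

With these made-up parameters, the classic norm keeps growing as docs
get shorter, the pivoted norm is capped for short docs but keeps
shrinking for long ones, and the sweet-spot norm treats a doc 40 terms
under the average the same as one 40 terms over it. That symmetry is
exactly the part I find hard to justify.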