Hi Apurva,
if you consider the operations done by the augmented frequency and the
cosine normalization independently from everything else, they are
somewhat similar. The normalization by max is a p-norm with p → ∞. So
apart from the 0.5 offset, both can be seen as document length
normalizati
Hello,
Could anybody tell me the difference between using augmented frequency
(which is used for weighting term frequencies to eliminate the bias towards
larger documents) and cosine normalization (l2 norm which scikit-learn uses
for TfidfTransformer)?
Augmented frequency is given by the following
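
To make the comparison in this thread concrete, here is a minimal sketch in plain Python. It assumes the common textbook form of augmented frequency, 0.5 + 0.5 * f / max(f), which is my assumption and not necessarily the exact formula meant above. The function names and the toy counts are made up for illustration. The l2 function only mimics the row normalization that TfidfTransformer applies with norm='l2'; it is not scikit-learn code.

```python
import math

def augmented_frequency(counts):
    # Assumed textbook form: 0.5 + 0.5 * f / max(f).
    # Dividing by max(f) is a division by the infinity norm of the counts.
    # Assumes at least one non-zero count.
    peak = max(counts)
    return [0.5 + 0.5 * c / peak for c in counts]

def l2_normalize(values):
    # Divide by the Euclidean (l2) norm, so the vector has unit length.
    # Assumes at least one non-zero value.
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]

# Two hypothetical documents with the same term proportions;
# the second one is simply ten times longer.
short_doc = [3, 1, 2]
long_doc = [30, 10, 20]

aug_short = augmented_frequency(short_doc)
aug_long = augmented_frequency(long_doc)
l2_short = l2_normalize(short_doc)
l2_long = l2_normalize(long_doc)

print("augmented:", aug_short, aug_long)
print("l2:       ", l2_short, l2_long)
```

Under these assumptions both schemes map the short and the long document to the same vector, which is the sense in which both remove the effect of document length. They differ in which norm is used for the division (max versus l2) and in the 0.5 offset.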