On Fri, Jun 4, 2010 at 1:34 AM, Sean Owen <[email protected]> wrote:

> > I would guess so, but that would only make sense if they subtracted it ahead
> > of time. In general, I don't see the point for that. I would rather cosine
> > normalize each user row.
>
> Yeah sounds good. I wouldn't add this step to start
My quick guess is that normalizing decreases the condition number of the matrix, which makes the numerics more stable. You then get a better estimate of the singular vectors that you really care about, because they aren't shadowed so excessively by the ones associated with the largest singular values. The condition number is, among other ways, defined as the ratio of the largest to the smallest singular value. Looking at the outer product form of SVD, you can easily see that if the first few singular values totally dominate the others, finding the residual represented by the others would be difficult. IDF weighting should have similar effects.
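
To make that guess concrete, here is a rough NumPy sketch. The matrix, the function names, and the idea of one very heavy user are all made up for illustration; nothing here comes from real data or from the code under discussion. It just compares the ratio of largest to smallest singular value before and after cosine normalizing each user row.

```python
import numpy as np


def condition_number(matrix):
    # 2-norm condition number: largest singular value over the smallest.
    # In the outer product form A = sum_i s_i * u_i * v_i^T, this is s_1 / s_k.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return singular_values[0] / singular_values[-1]


def cosine_normalize_rows(matrix):
    # Scale every row to unit Euclidean length; all-zero rows are left alone.
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return matrix / norms


# Hypothetical user-by-item matrix (3 users, 4 items). The first user is
# far more active than the other two, so that row dominates the spectrum.
ratings = np.array([
    [200.0, 100.0, 0.0, 0.0],
    [0.0, 1.0, 2.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
])

normalized = cosine_normalize_rows(ratings)

cond_before = condition_number(ratings)
cond_after = condition_number(normalized)

print("condition number before normalizing:", cond_before)
print("condition number after normalizing: ", cond_after)
```

In this toy case the heavy user's row swamps the raw matrix, so the small singular values are tiny relative to the first one. After normalization every row has length 1 and the singular values sit much closer together. Row normalization is not guaranteed to lower the condition number for every matrix; this only illustrates the effect I am guessing at.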
