Jake, The Lanczos algorithm is designed for eigen-decomposition, but like any such algorithm, getting singular vectors out of it is immediate: the right singular vectors of a matrix A are the eigenvectors of A^t * A, and the left singular vectors are the eigenvectors of A * A^t. Lanczos works by taking a starting seed vector *v* (with cardinality equal to the number of columns of the matrix A) and repeatedly multiplying A by the result: *v'* = A.times(*v*). In the case where A is not symmetric (in particular, whenever it is not square), you instead want to repeatedly multiply A^t * A by *v*: *v'* = (A^t * A).times(*v*), or equivalently, in Mahout, A.timesSquared(*v*). timesSquared is merely an optimization: by changing the order of summation in (A^t * A).times(*v*), you can do the same computation in one pass over the rows of A instead of two.
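To make the reordering of the summation concrete, here is a minimal sketch in plain Python. It is not Mahout code: the function names and the list-of-rows representation are assumptions made for illustration. It relies only on the identity (A^t * A) v = sum over rows a_i of (a_i . v) * a_i, so each row is read once.

```python
# Illustrative sketch only; names are hypothetical, not Mahout's API.
# A matrix is a list of rows, and a vector is a list of floats.

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def times_squared_two_pass(rows, v):
    """(A^t * A) v computed as A^t * (A v): two passes over the rows."""
    av = [dot(row, v) for row in rows]  # first pass: A v
    # second pass: A^t * (A v), one output entry per column of A
    return [sum(rows[i][j] * av[i] for i in range(len(rows)))
            for j in range(len(v))]

def times_squared_one_pass(rows, v):
    """(A^t * A) v computed as sum_i (a_i . v) * a_i: one pass over the rows."""
    result = [0.0] * len(v)
    for row in rows:
        scale = dot(row, v)           # scalar a_i . v
        for j, a in enumerate(row):   # accumulate scale * a_i
            result[j] += scale * a
    return result

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 rows, 2 columns
v = [1.0, -1.0]                           # cardinality = number of columns

print(times_squared_two_pass(A, v))  # [-9.0, -12.0]
print(times_squared_one_pass(A, v))  # [-9.0, -12.0]
```

Both versions return a vector with as many entries as A has columns, which is why the seed vector has that cardinality. The one-pass form only ever needs a single row in memory at a time.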
This sounds like Krylov iteration rather than the Lanczos algorithm. Is this just an issue with the description? I was under the impression that the Lanczos and Arnoldi methods have much better numerical stability than raw Krylov iteration. Is there some improvement to be had here? -- Ted Dunning, CTO DeepDyve