Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22784#discussion_r228714200

    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala ---
    @@ -384,18 +384,28 @@ class RowMatrix @Since("1.0.0") (
         val n = numCols().toInt
         require(k > 0 && k <= n, s"k = $k out of range (0, n = $n]")

    -    val Cov = computeCovariance().asBreeze.asInstanceOf[BDM[Double]]
    +    if (n > 65535) {
    +      val svd = computeSVD(k)
    +      val s = svd.s.toArray.map(eigValue => eigValue * eigValue / (n - 1))
    --- End diff --

    My linear algebra is probably rusty, so check me here, but we need the eigendecomposition of the covariance matrix 1/(n-1) * mat' * mat. I get how the singular values here give these eigenvalues. Don't the singular vectors V need to be divided by n-1 too? I'm probably wrong about it, just checking, esp. as we don't actually have tests for any of the output here!
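
    One way to check the question about V is to write the algebra out. This is only a sketch under assumptions that are not in the diff: X stands for the row matrix after each column's mean has been subtracted, m for its number of rows, r for its rank, and X = U S V' for its thin SVD. All of these symbols are introduced here for illustration.

```latex
\begin{aligned}
X &= U S V^{\top}, \qquad U^{\top} U = I,\quad V^{\top} V = I,\quad
     S = \operatorname{diag}(\sigma_1, \dots, \sigma_r) \\
C &= \frac{1}{m-1}\, X^{\top} X
   = \frac{1}{m-1}\, V S\, U^{\top} U\, S V^{\top}
   = V \left( \frac{S^{2}}{m-1} \right) V^{\top} \\
C\, v_i &= \frac{\sigma_i^{2}}{m-1}\, v_i , \qquad i = 1, \dots, r
\end{aligned}
```

    If that holds, the scale factor lands only on the eigenvalues. The columns of V are already unit-norm eigenvectors of C, so dividing them by a constant would change their length but not their direction. Two caveats seem worth checking against the patch: the identity relies on the columns being mean-centered before the SVD, and the divisor in the derivation is the number of rows minus one, whereas `n` in the diff is `numCols()`.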