subject:"Increase maximum amount of columns for covariance matrix for principal components"

Re: Increase maximum amount of columns for covariance matrix for principal components

2015-05-19 Thread Xiangrui Meng

We use a dense array to store the covariance matrix on the driver node. So its length is limited by the integer range, which is 65536 * 65536 (actually half).

-Xiangrui

On Wed, May 13, 2015 at 1:57 AM, Sebastian Alfers sebastian.alf...@googlemail.com wrote:

> Hello, in order to compute a huge
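The limit described in the reply above can be checked with a little arithmetic. The following is only a sketch under two assumptions that are not spelled out in the thread: that "actually half" refers to keeping just the upper triangle of the symmetric matrix, n * (n + 1) / 2 entries, in one dense array, and that the array length is capped by the largest signed 32-bit integer. The helper names `packed_length` and `max_columns` are made up for this illustration and are not part of Spark.

```python
# Assumption: array length is capped by the largest signed 32-bit integer.
INT_MAX = 2**31 - 1


def packed_length(n: int) -> int:
    """Entries needed for the upper triangle (diagonal included) of an n x n matrix."""
    return n * (n + 1) // 2


def max_columns(limit: int = INT_MAX) -> int:
    """Largest n whose packed upper triangle still fits within `limit` entries."""
    lo, hi = 0, 1
    # Grow hi until the packed triangle no longer fits.
    while packed_length(hi) <= limit:
        hi *= 2
    # Binary search: packed_length(lo) fits, packed_length(hi) does not.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if packed_length(mid) <= limit:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    n = max_columns()
    print(n)                      # 65535
    print(packed_length(n))       # 2147450880
    print(packed_length(n) * 8)   # bytes if each entry is an 8-byte double
```

Under these assumptions the largest column count is 65535, one short of 65536, and a full-size array of doubles would already need roughly 16 GiB on the driver, so memory is a practical constraint well before the index limit is raised.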

Increase maximum amount of columns for covariance matrix for principal components

2015-05-13 Thread Sebastian Alfers

Hello, when working with a huge dataset, the number of columns for which the covariance matrix can be calculated is limited:

https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala#L129

What is the reason behind this limitation and can it