I've run into this before. EigenValueDecomposition creates a Java array with
roughly 2*k*n elements. Java arrays are indexed with a native 32-bit integer,
so 2*k*n cannot exceed Integer.MAX_VALUE.

The array is created here:
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala#L84
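
For reference, here is a simplified sketch (not the exact Spark source) of the
kind of size check involved: the ARPACK-based solver keeps about min(2*k, n)
Lanczos basis vectors of length n, so its dense work arrays hold on the order
of 2*k*n doubles, and each array's length must fit in an Int.

object EigsSizeCheckSketch {
  def checkSize(k: Int, n: Int): Unit = {
    // Number of Lanczos basis vectors kept by the solver (assumption: 2*k capped at n)
    val ncv = math.min(2 * k, n)
    // Work arrays are roughly n*ncv and ncv*(ncv+8) doubles; both lengths must fit in an Int
    require(n * ncv.toLong <= Int.MaxValue && ncv * (ncv.toLong + 8) <= Int.MaxValue,
      s"k = $k and/or n = $n are too large to compute an eigendecomposition")
  }

  def main(args: Array[String]): Unit = {
    // With the sizes from your job, n*ncv is about 1.9e10 elements, so the
    // require fails with the IllegalArgumentException you saw.
    checkSize(k = 49865, n = 191077)
  }
}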

If you remove the require statement that enforces 2*k*n < Integer.MAX_VALUE, it
fails with java.lang.NegativeArraySizeException instead. More on this issue here:
https://issues.apache.org/jira/browse/SPARK-5656
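
To see the failure mode that check prevents, here's a minimal, self-contained
illustration (plain Scala, not Spark code): a Long element count above
Int.MaxValue can wrap to a negative Int when truncated, and allocating an array
with that size throws NegativeArraySizeException.

object NegativeArraySizeDemo {
  def main(args: Array[String]): Unit = {
    val requested: Long = Int.MaxValue.toLong + 1
    val truncated = requested.toInt   // wraps around to -2147483648
    println(s"requested = $requested, truncated to Int = $truncated")
    try {
      new Array[Double](truncated)    // negative length -> throws
    } catch {
      case e: NegativeArraySizeException => println(s"caught: $e")
    }
  }
}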

On Tue, Sep 19, 2017 at 9:49 AM, Alexander Ovcharenko <shurik....@gmail.com>
wrote:

> Hello guys,
>
> While trying to compute SVD using the computeSVD() function, I am getting the
> following warning with the follow-up exception:
> 17/09/14 12:29:02 WARN RowMatrix: computing svd with k=49865 and n=191077,
> please check necessity
> IllegalArgumentException: u'requirement failed: k = 49865 and/or n =
> 191077 are too large to compute an eigendecomposition'
>
> When I try to compute first 3000 singular values, I'm getting several
> following warnings every second:
> 17/09/14 13:43:38 WARN TaskSetManager: Stage 4802 contains a task of very
> large size (135 KB). The maximum recommended task size is 100 KB.
>
> The matrix size is 49865 x 191077 and all the singular values are needed.
>
> Is there a way to lift that limit and be able to compute whatever number
> of singular values?
>
> Thank you.
>
>
>
