[ https://issues.apache.org/jira/browse/SPARK-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Bittmann updated SPARK-5656:
---------------------------------
    Description: 
Large values of n or k in EigenValueDecomposition.symmetricEigs fail with 
a NegativeArraySizeException. Specifically, this occurs when 2*n*k > 
Integer.MAX_VALUE. These values are currently unchecked, so the requested array 
size can exceed Integer.MAX_VALUE and overflow to a negative Int. I have written 
the 'require' below so that this condition fails gracefully. I will submit a 
pull request. 

require(ncv * n.toLong < Integer.MAX_VALUE,
  "Product of 2*k*n must be smaller than " +
  s"Integer.MAX_VALUE. Found required eigenvalues k = $k and matrix dimension n = $n")
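
To illustrate why the check widens one operand with .toLong, here is a minimal sketch of the arithmetic. The object and method names (WorkspaceCheck, intProduct, longProduct, fitsInArray) are hypothetical and are not part of Spark, and ncv is simply passed in rather than derived from k. With plain Int operands the product wraps around, so a comparison against Integer.MAX_VALUE would pass even for sizes that are far too large.

```scala
// Sketch only: these names are hypothetical and do not exist in Spark.
object WorkspaceCheck {
  // Int * Int is evaluated in 32-bit arithmetic and wraps around silently.
  def intProduct(ncv: Int, n: Int): Int = ncv * n

  // Widening one operand to Long makes the whole product a 64-bit value.
  def longProduct(ncv: Int, n: Int): Long = ncv * n.toLong

  // Same comparison as the 'require' condition proposed in this issue.
  def fitsInArray(ncv: Int, n: Int): Boolean =
    longProduct(ncv, n) < Integer.MAX_VALUE
}
```

For example, with ncv = 50000 and n = 50000 the true product is 2500000000, but intProduct wraps to -1794967296, which still compares as smaller than Integer.MAX_VALUE. fitsInArray returns false for the same inputs.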


Here is the exception that occurs from computeSVD with large k and/or n: 

Exception in thread "main" java.lang.NegativeArraySizeException
        at org.apache.spark.mllib.linalg.EigenValueDecomposition$.symmetricEigs(EigenValueDecomposition.scala:85)
        at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:258)
        at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:190)
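
The difference between the two failure modes can be sketched outside Spark as follows. This is an assumption-laden illustration, not the code in EigenValueDecomposition: FailureModes, allocateUnchecked and allocateChecked are hypothetical names, and the array allocation stands in for whatever workspace symmetricEigs actually creates.

```scala
import scala.util.Try

// Sketch only: hypothetical helpers, not Spark code.
object FailureModes {
  // Unchecked path: the Int product wraps to a negative size, so the
  // JVM throws NegativeArraySizeException at allocation time.
  def allocateUnchecked(ncv: Int, n: Int): Try[Array[Double]] =
    Try(new Array[Double](ncv * n))

  // Checked path: 'require' rejects the sizes before any allocation and
  // throws IllegalArgumentException carrying a descriptive message.
  def allocateChecked(ncv: Int, n: Int): Try[Array[Double]] = Try {
    require(ncv * n.toLong < Integer.MAX_VALUE,
      s"Product ncv * n must be smaller than Integer.MAX_VALUE. Found ncv = $ncv and n = $n")
    new Array[Double](ncv * n)
  }
}
```

With ncv = 50000 and n = 50000, allocateUnchecked fails with a NegativeArraySizeException, while allocateChecked fails with an IllegalArgumentException whose message names the offending values.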

  was:
Large values of n or k in EigenValueDecomposition.symmetricEigs will fail with 
a NegativeArraySizeException. Specifically, this occurs when 2*n*k > 
Integer.MAX_VALUE. These values are currently unchecked and allow for the array 
to be initialized to a value greater than Integer.MAX_VALUE. I have written the 
below 'require' to fail this condition gracefully. I will submit a pull 
request. 

require(ncv * n < Integer.MAX_VALUE,
  "Product of 2*k*n must be smaller than " +
  s"Integer.MAX_VALUE. Found required eigenvalues k = $k and matrix dimension n = $n")


Here is the exception that occurs from computeSVD with large k and/or n: 

Exception in thread "main" java.lang.NegativeArraySizeException
        at org.apache.spark.mllib.linalg.EigenValueDecomposition$.symmetricEigs(EigenValueDecomposition.scala:85)
        at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:258)
        at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:190)


> NegativeArraySizeException in EigenValueDecomposition.symmetricEigs for large 
> n and/or large k
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-5656
>                 URL: https://issues.apache.org/jira/browse/SPARK-5656
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>            Reporter: Mark Bittmann
>            Priority: Minor
>
> Large values of n or k in EigenValueDecomposition.symmetricEigs fail 
> with a NegativeArraySizeException. Specifically, this occurs when 2*n*k > 
> Integer.MAX_VALUE. These values are currently unchecked, so the requested 
> array size can exceed Integer.MAX_VALUE and overflow to a negative Int. I 
> have written the 'require' below so that this condition fails gracefully. I 
> will submit a pull request. 
>
> require(ncv * n.toLong < Integer.MAX_VALUE,
>   "Product of 2*k*n must be smaller than " +
>   s"Integer.MAX_VALUE. Found required eigenvalues k = $k and matrix dimension n = $n")
>
> Here is the exception that occurs from computeSVD with large k and/or n: 
>
> Exception in thread "main" java.lang.NegativeArraySizeException
>       at org.apache.spark.mllib.linalg.EigenValueDecomposition$.symmetricEigs(EigenValueDecomposition.scala:85)
>       at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:258)
>       at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:190)


