[
https://issues.apache.org/jira/browse/SPARK-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Bittmann updated SPARK-5656:
-
Description:
Large values of n or k in EigenValueDecomposition.symmetricEigs fail with
a NegativeArraySizeException. Specifically, this occurs when 2*n*k >
Integer.MAX_VALUE. These values are currently unchecked, so the requested array
size can exceed Integer.MAX_VALUE and overflow to a negative number. I have
written the 'require' below to fail gracefully in this case. I will submit a
pull request.
require(ncv * n.toLong < Integer.MAX_VALUE, "Product of 2*k*n must be smaller than " +
  s"Integer.MAX_VALUE. Found required eigenvalues k = $k and matrix dimension n = $n")
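The widening to Long in the check above matters because an Int product wraps around before it is compared. A minimal sketch of the difference follows; the object and method names (`SizeCheck`, `unsafeFits`, `safeFits`) and the sample values are hypothetical and only illustrate the arithmetic, not the actual Spark code.

```scala
object SizeCheck {
  // Int arithmetic: the product wraps on overflow, so a negative
  // (overflowed) result still compares as "smaller than MAX_VALUE".
  def unsafeFits(ncv: Int, n: Int): Boolean =
    ncv * n < Integer.MAX_VALUE

  // Widening one operand to Long forces a 64-bit multiplication,
  // so the true product is compared against the limit.
  def safeFits(ncv: Int, n: Int): Boolean =
    ncv * n.toLong < Integer.MAX_VALUE

  def main(args: Array[String]): Unit = {
    val ncv = 50000 // illustrative values, not taken from the issue
    val n = 50000
    println(s"Int product:  ${ncv * n}")
    println(s"Long product: ${ncv * n.toLong}")
    println(s"unsafeFits = ${unsafeFits(ncv, n)}, safeFits = ${safeFits(ncv, n)}")
  }
}
```

With these sample values the Int version accepts the sizes while the Long version rejects them, which is the case the 'require' is meant to catch.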
Here is the exception that occurs from computeSVD with large k and/or n:
Exception in thread "main" java.lang.NegativeArraySizeException
  at org.apache.spark.mllib.linalg.EigenValueDecomposition$.symmetricEigs(EigenValueDecomposition.scala:85)
  at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:258)
  at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:190)
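The same exception can be reproduced outside Spark. The sketch below assumes the array length is computed as a plain Int product, as the description suggests; `NegativeSizeDemo` and `tryAllocate` are hypothetical names used only for illustration.

```scala
object NegativeSizeDemo {
  // Tries to allocate an array of the given length and reports the outcome:
  // the exception's simple class name on failure, "ok (<length>)" otherwise.
  def tryAllocate(size: Int): String =
    try {
      val arr = new Array[Double](size)
      s"ok (${arr.length})"
    } catch {
      case e: NegativeArraySizeException => e.getClass.getSimpleName
    }

  def main(args: Array[String]): Unit = {
    val n = 50000   // illustrative values, not taken from the issue
    val ncv = 50000
    val size = n * ncv // Int multiplication wraps to a negative value here
    println(s"size = $size")
    println(tryAllocate(size))
  }
}
```

The JVM rejects the negative length at allocation time, which matches the top frame of the stack trace above.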
was:
Large values of n or k in EigenValueDecomposition.symmetricEigs will fail with
a NegativeArraySizeException. Specifically, this occurs when 2*n*k >
Integer.MAX_VALUE. These values are currently unchecked and allow for the array
to be initialized to a value greater than Integer.MAX_VALUE. I have written the
below 'require' to fail this condition gracefully. I will submit a pull
request.
require(ncv * n < Integer.MAX_VALUE, "Product of 2*k*n must be smaller than " +
  s"Integer.MAX_VALUE. Found required eigenvalues k = $k and matrix dimension n = $n")
Here is the exception that occurs from computeSVD with large k and/or n:
Exception in thread "main" java.lang.NegativeArraySizeException
  at org.apache.spark.mllib.linalg.EigenValueDecomposition$.symmetricEigs(EigenValueDecomposition.scala:85)
  at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:258)
  at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:190)
> NegativeArraySizeException in EigenValueDecomposition.symmetricEigs for large
> n and/or large k
> --
>
> Key: SPARK-5656
> URL: https://issues.apache.org/jira/browse/SPARK-5656
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Reporter: Mark Bittmann
> Priority: Minor
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)