Liang-Chi Hsieh created SPARK-2355:
--------------------------------------

             Summary: Check for the number of clusters to avoid ArrayIndexOutOfBoundsException
                 Key: SPARK-2355
                 URL: https://issues.apache.org/jira/browse/SPARK-2355
             Project: Spark
          Issue Type: Bug
          Components: MLlib
    Affects Versions: 1.0.0
            Reporter: Liang-Chi Hsieh


When the number of clusters passed to 
org.apache.spark.mllib.clustering.KMeans in parallel initialization mode 
(k-means||) is greater than the number of data points, KMeans throws an 
ArrayIndexOutOfBoundsException.

The KMeans class should check that the number of clusters is not greater than 
the number of data points.
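
A minimal sketch of such a check follows. It is only an illustration under 
stated assumptions, not the actual fix: ClusterCountCheck and requireValidK 
are hypothetical names that do not exist in Spark, and the sketch assumes the 
caller already knows the number of data points.

```scala
// Hypothetical helper for illustration; not part of Spark's KMeans API.
object ClusterCountCheck {

  /** Fails fast with a descriptive message instead of letting the
    * initialization step hit an ArrayIndexOutOfBoundsException later.
    *
    * @param k         requested number of clusters
    * @param numPoints number of data points (assumed to be known by the caller)
    */
  def requireValidK(k: Int, numPoints: Long): Unit = {
    require(k > 0, s"Number of clusters must be positive, got k = $k")
    require(
      k <= numPoints,
      s"Number of clusters (k = $k) must not be greater than " +
        s"the number of data points ($numPoints)")
  }
}
```

A caller could invoke ClusterCountCheck.requireValidK(k, data.count()) before 
running KMeans. Note that count() on an RDD triggers a job, so whether the 
extra pass over the data is acceptable is a design choice.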

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.spark.mllib.clustering.LocalKMeans$$anonfun$kMeansPlusPlus$1.apply$mcVI$sp(LocalKMeans.scala:62)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
        at org.apache.spark.mllib.clustering.LocalKMeans$.kMeansPlusPlus(LocalKMeans.scala:49)
        at org.apache.spark.mllib.clustering.KMeans$$anonfun$20.apply(KMeans.scala:297)
        at org.apache.spark.mllib.clustering.KMeans$$anonfun$20.apply(KMeans.scala:294)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.immutable.Range.foreach(Range.scala:141)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.AbstractTraversable.map(Traversable.scala:105)
        at org.apache.spark.mllib.clustering.KMeans.initKMeansParallel(KMeans.scala:294)
        at org.apache.spark.mllib.clustering.KMeans.runBreeze(KMeans.scala:143)
        at org.apache.spark.mllib.clustering.KMeans.run(KMeans.scala:126)
        at org.apache.spark.examples.mllib.DenseKMeans$.run(DenseKMeans.scala:102)
        at org.apache.spark.examples.mllib.DenseKMeans$$anonfun$main$1.apply(DenseKMeans.scala:72)
        at org.apache.spark.examples.mllib.DenseKMeans$$anonfun$main$1.apply(DenseKMeans.scala:71)
        at scala.Option.map(Option.scala:145)
        at org.apache.spark.examples.mllib.DenseKMeans$.main(DenseKMeans.scala:71)
        at org.apache.spark.examples.mllib.DenseKMeans.main(DenseKMeans.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)




--
This message was sent by Atlassian JIRA
(v6.2#6252)