Hi,

I am getting the exception below when using Spark KMeans. Any suggestions
from the experts would be really helpful.

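import org.apache.spark.ml.clustering.KMeans

// reductionCount is used as k, the number of clusters
val kMeans = new KMeans().setK(reductionCount).setMaxIter(30)
val kMeansModel = kMeans.fit(df)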

The error occurs when calling kMeans.fit:


Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.spark.mllib.linalg.SparseVector.toArray(Vectors.scala:760)
        at org.apache.spark.mllib.clustering.VectorWithNorm.toDense(KMeans.scala:614)
        at org.apache.spark.mllib.clustering.KMeans$$anonfun$initKMeansParallel$3.apply(KMeans.scala:382)
        at org.apache.spark.mllib.clustering.KMeans$$anonfun$initKMeansParallel$3.apply(KMeans.scala:382)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
        at org.apache.spark.mllib.clustering.KMeans.initKMeansParallel(KMeans.scala:382)
        at org.apache.spark.mllib.clustering.KMeans.runAlgorithm(KMeans.scala:256)
        at org.apache.spark.mllib.clustering.KMeans.run(KMeans.scala:227)
        at org.apache.spark.ml.clustering.KMeans.fit(KMeans.scala:319)
        at com.datamantra.spark.DataBalancing$.createBalancedDataframe(DataBalancing.scala:25)
        at com.datamantra.spark.jobs.IftaMLTraining$.trainML$1(IftaMLTraining.scala:182)
        at com.datamantra.spark.jobs.IftaMLTraining$.main(IftaMLTraining.scala:94)
        at com.datamantra.spark.jobs.IftaMLTraining.main(IftaMLTraining.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Thanks,
Asmath
