Hi,

How many clients and how many products do you have?

Cheers
Gen

jaykatukuri wrote:
> Hi all,
>
> I am running into an out-of-memory error while running ALS using
> MLlib on a reasonably small data set consisting of around 6 million
> ratings. The stack trace is below:
>
> java.lang.OutOfMemoryError: Java heap space
>     at org.jblas.DoubleMatrix.<init>(DoubleMatrix.java:323)
>     at org.jblas.DoubleMatrix.zeros(DoubleMatrix.java:471)
>     at org.jblas.DoubleMatrix.zeros(DoubleMatrix.java:476)
>     at org.apache.spark.mllib.recommendation.ALS$$anonfun$21.apply(ALS.scala:465)
>     at org.apache.spark.mllib.recommendation.ALS$$anonfun$21.apply(ALS.scala:465)
>     at scala.Array$.fill(Array.scala:267)
>     at org.apache.spark.mllib.recommendation.ALS.org$apache$spark$mllib$recommendation$ALS$$updateBlock(ALS.scala:465)
>     at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateFeatures$2.apply(ALS.scala:445)
>     at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateFeatures$2.apply(ALS.scala:444)
>     at org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:31)
>     at org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:31)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:156)
>     at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:154)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:154)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>     at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>     at org.apache.spark.rdd.FlatMappedValuesRDD.compute(FlatMappedValuesRDD.scala:31)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>     at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>     at org.apache.spark.scheduler.Task.run(Task.scala:51)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>
> I am using 2GB for executor memory. I tried with 100 executors.
>
> Can someone please point me in the right direction?
>
> Thanks,
> Jay

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/MLLib-ALS-java-lang-OutOfMemoryError-Java-heap-space-tp20584p20714.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
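As context for why the user/product counts matter here: ALS materializes dense factor matrices (one row of `rank` doubles per user and per product), so heap use is driven by entity counts times rank, independent of the rating count. The sketch below is a back-of-envelope estimate only, with hypothetical dimensions (the thread never states them); it is not taken from the Spark code.

```python
# Rough heap estimate for ALS dense factor arrays.
# Assumed, hypothetical numbers -- the thread does not give user/product counts.

DOUBLE_BYTES = 8  # a Java double is 8 bytes

def als_factor_bytes(num_entities: int, rank: int) -> int:
    """Bytes needed for a dense factor matrix of num_entities x rank doubles."""
    return num_entities * rank * DOUBLE_BYTES

# Hypothetical: 10M users, 1M products, rank 50.
users, products, rank = 10_000_000, 1_000_000, 50

user_gb = als_factor_bytes(users, rank) / 1e9
product_gb = als_factor_bytes(products, rank) / 1e9
print(f"user factors:    {user_gb:.1f} GB")
print(f"product factors: {product_gb:.1f} GB")
```

With numbers like these, the user-side factors alone exceed a 2 GB executor heap even before shuffle buffers and the per-user normal-equation matrices that `updateBlock` allocates, which is why the entity counts (and the chosen rank) are the first thing to check.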