Hi Jay,

Please try increasing the executor memory (if more than 2GB is available) and reducing numBlocks in ALS. The current implementation stores all subproblems in memory, so the memory requirement is significant when k is large. You can also try reducing k and see whether the problem persists. I made a PR that improves the ALS implementation by generating subproblems one by one; you can try that as well:
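For concreteness, here is a rough sketch of how those knobs map onto the RDD-based MLlib API. The input path, the "user,item,rating" format, and the concrete rank/blocks values are placeholders, not recommendations; tune them for your data:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    object ALSTuningSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("ALSTuningSketch"))

        // Hypothetical input: "user,item,rating" lines; adjust to your data.
        val ratings = sc.textFile("hdfs:///path/to/ratings.csv").map { line =>
          val Array(user, item, rating) = line.split(',')
          Rating(user.toInt, item.toInt, rating.toDouble)
        }

        val model = new ALS()
          .setRank(20)       // a smaller k shrinks each k x k subproblem
          .setIterations(10)
          .setLambda(0.01)
          .setBlocks(20)     // reduce numBlocks, per the advice above
          .run(ratings)

        println(s"Trained model with rank ${model.rank}")
        sc.stop()
      }
    }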
https://github.com/apache/spark/pull/3720

Best,
Xiangrui

On Wed, Dec 17, 2014 at 6:57 PM, buring <qyqb...@gmail.com> wrote:
> I am not sure whether this can help you. I have 57 million ratings, about
> 4 million users, and 4k items. I used 7-14 total executor cores and 13g of
> executor memory; the cluster has 4 nodes, each with 4 cores and at most 16g
> of memory.
> I found that the following settings may help avoid this problem:
>     conf.set("spark.shuffle.memoryFraction", "0.65") // default is 0.2
>     conf.set("spark.storage.memoryFraction", "0.3")  // default is 0.6
> I have to keep the rank value under 40, otherwise this problem occurs.
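Putting the quoted settings together with a larger executor memory, a minimal SparkConf sketch would look like the following (the values simply mirror the message above and are cluster-dependent, not recommendations):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch only: values mirror the quoted message and depend on your cluster.
    val conf = new SparkConf()
      .setAppName("ALSMemoryTuning")
      .set("spark.executor.memory", "13g")          // raise if more memory is available
      .set("spark.shuffle.memoryFraction", "0.65")  // default is 0.2
      .set("spark.storage.memoryFraction", "0.3")   // default is 0.6
    val sc = new SparkContext(conf)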