Sorry about this. This was fixed by increasing the allocated memory, as
mentioned in the `tweaking the benchmark` section; an example command is
below. Thanks
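
In case it helps anyone else hitting the same OutOfMemoryError, the fix was
roughly along these lines (a sketch assuming the in-memory-analytics image
forwards any extra arguments to spark-submit; the memory sizes are
illustrative, so adjust them to your machine and dataset):

sudo docker run --rm --volumes-from data cloudsuite/in-memory-analytics \
  /data/ml-latest /data/myratings.csv \
  --driver-memory 4g --executor-memory 4g

If the benchmark is running Spark in local mode (as in the single-container
invocation above), --driver-memory is the setting that matters most, since
the executors live inside the driver JVM.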

On Mon, Feb 29, 2016 at 1:55 PM, Mohammad Ahmad <[email protected]>
wrote:

> Hi guys,
>
> I am just getting started with the cloudsuite benchmark. I am having some
> trouble running the in-memory-analytics benchmark. I get a
> 'java.lang.OutOfMemoryError: Java heap space' exception. Any suggestions on
> how I can change things to make this work? How much heap space is
> sufficient? Also, will changes be required in the container or on the host?
> Thanks!
>
> sudo docker run --rm --volumes-from data cloudsuite/in-memory-analytics
> /data/ml-latest /data/myratings.csv
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 16/02/29 19:37:58 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 16/02/29 19:37:59 INFO Slf4jLogger: Slf4jLogger started
> 16/02/29 19:37:59 INFO Remoting: Starting remoting
> 16/02/29 19:37:59 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://[email protected]:50305]
> 16/02/29 19:37:59 WARN MetricsSystem: Using default name DAGScheduler for
> source because spark.app.id is not set.
> 16/02/29 19:38:00 INFO FileInputFormat: Total input paths to process : 1
> 16/02/29 19:38:00 INFO deprecation: mapred.tip.id is deprecated. Instead,
> use mapreduce.task.id
> 16/02/29 19:38:00 INFO deprecation: mapred.task.id is deprecated.
> Instead, use mapreduce.task.attempt.id
> 16/02/29 19:38:00 INFO deprecation: mapred.task.is.map is deprecated.
> Instead, use mapreduce.task.ismap
> 16/02/29 19:38:00 INFO deprecation: mapred.task.partition is deprecated.
> Instead, use mapreduce.task.partition
> 16/02/29 19:38:00 INFO deprecation: mapred.job.id is deprecated. Instead,
> use mapreduce.job.id
> 16/02/29 19:38:01 INFO FileInputFormat: Total input paths to process : 1
> Got 22884377 ratings from 247753 users on 33670 movies.
> [Stage 9:=============================>                             (2 +
> 2) / 4]16/02/29 19:38:25 WARN MemoryStore: Not enough space to cache
> rdd_23_2 in memory! (computed 99.2 MB so far)
> [Stage 11:>                                                         (0 +
> 4) / 4]16/02/29 19:38:33 WARN MemoryStore: Not enough space to cache
> rdd_29_2 in memory! (computed 43.2 MB so far)
> 16/02/29 19:38:33 WARN MemoryStore: Not enough space to cache rdd_29_3 in
> memory! (computed 43.2 MB so far)
> [Stage 11:==============>                                           (1 +
> 3) / 4]16/02/29 19:38:33 WARN MemoryStore: Not enough space to cache
> rdd_29_1 in memory! (computed 43.2 MB so far)
> [Stage 12:>                                                       (0 + 16)
> / 19]16/02/29 19:38:36 WARN MemoryStore: Not enough space to cache rdd_31_9
> in memory! (computed 6.2 MB so far)
> 16/02/29 19:38:36 WARN MemoryStore: Not enough space to cache rdd_31_12 in
> memory! (computed 6.2 MB so far)
> 16/02/29 19:38:37 WARN MemoryStore: Not enough space to cache rdd_31_10 in
> memory! (computed 6.2 MB so far)
> 16/02/29 19:38:37 WARN MemoryStore: Not enough space to cache rdd_31_6 in
> memory! (computed 6.2 MB so far)
> Training: 13731669, validation: 4574414, test: 4578305
> [Stage 14:===========================================>              (3 +
> 1) / 4]16/02/29 19:38:42 WARN MemoryStore: Not enough space to cache
> rdd_23_2 in memory! (computed 99.2 MB so far)
> [Stage 15:>                                                         (0 +
> 4) / 4]16/02/29 19:38:45 WARN MemoryStore: Not enough space to cache
> rdd_35_3 in memory! (computed 19.8 MB so far)
> 16/02/29 19:38:45 WARN CacheManager: Persisting partition rdd_35_3 to disk
> instead.
> [Stage 16:>                                                       (0 + 16)
> / 16]16/02/29 19:38:56 ERROR Executor: Exception in task 6.0 in stage 16.0
> (TID 179)
> java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofInt.mkArray(ArrayBuilder.scala:320)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.resize(ArrayBuilder.scala:326)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.ensureSize(ArrayBuilder.scala:338)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.$plus$eq(ArrayBuilder.scala:343)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.$plus$eq(ArrayBuilder.scala:313)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.add(ALS.scala:851)
> at
> org.apache.spark.ml.recommendation.ALS$$anonfun$15$$anonfun$apply$11.apply(ALS.scala:1066)
> at
> org.apache.spark.ml.recommendation.ALS$$anonfun$15$$anonfun$apply$11.apply(ALS.scala:1065)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at
> org.apache.spark.util.collection.CompactBuffer$$anon$1.foreach(CompactBuffer.scala:115)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at
> org.apache.spark.util.collection.CompactBuffer.foreach(CompactBuffer.scala:30)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1065)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 16/02/29 19:39:01 ERROR Executor: Exception in task 3.0 in stage 16.0 (TID
> 176)
> java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofInt.mkArray(ArrayBuilder.scala:320)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.result(ArrayBuilder.scala:365)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.result(ArrayBuilder.scala:313)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.build(ALS.scala:859)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1068)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 16/02/29 19:39:00 ERROR Executor: Exception in task 1.0 in stage 16.0 (TID
> 174)
> java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.mkArray(ArrayBuilder.scala:448)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:493)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:441)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.build(ALS.scala:859)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1068)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 16/02/29 19:39:00 ERROR Executor: Exception in task 7.0 in stage 16.0 (TID
> 180)
> java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.mkArray(ArrayBuilder.scala:448)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:493)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:441)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.build(ALS.scala:859)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1068)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 16/02/29 19:39:01 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[Executor task launch worker-3,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.mkArray(ArrayBuilder.scala:448)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:493)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:441)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.build(ALS.scala:859)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1068)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 16/02/29 19:39:01 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[Executor task launch worker-11,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.mkArray(ArrayBuilder.scala:448)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:493)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:441)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.build(ALS.scala:859)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1068)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 16/02/29 19:39:01 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[Executor task launch worker-4,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofInt.mkArray(ArrayBuilder.scala:320)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.resize(ArrayBuilder.scala:326)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.ensureSize(ArrayBuilder.scala:338)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.$plus$eq(ArrayBuilder.scala:343)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.$plus$eq(ArrayBuilder.scala:313)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.add(ALS.scala:851)
> at
> org.apache.spark.ml.recommendation.ALS$$anonfun$15$$anonfun$apply$11.apply(ALS.scala:1066)
> at
> org.apache.spark.ml.recommendation.ALS$$anonfun$15$$anonfun$apply$11.apply(ALS.scala:1065)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at
> org.apache.spark.util.collection.CompactBuffer$$anon$1.foreach(CompactBuffer.scala:115)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at
> org.apache.spark.util.collection.CompactBuffer.foreach(CompactBuffer.scala:30)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1065)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 16/02/29 19:39:01 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[Executor task launch worker-1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofInt.mkArray(ArrayBuilder.scala:320)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.result(ArrayBuilder.scala:365)
> at
> scala.collection.mutable.ArrayBuilder$ofInt.result(ArrayBuilder.scala:313)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.build(ALS.scala:859)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1068)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 16/02/29 19:39:02 WARN TaskSetManager: Lost task 7.0 in stage 16.0 (TID
> 180, localhost): java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.mkArray(ArrayBuilder.scala:448)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:493)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:441)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.build(ALS.scala:859)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1068)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> 16/02/29 19:39:03 ERROR TaskSetManager: Task 7 in stage 16.0 failed 1
> times; aborting job
> Exception in thread "main" org.apache.spark.SparkException: Job aborted
> due to stage failure: Task 7 in stage 16.0 failed 1 times, most recent
> failure: Lost task 7.0 in stage 16.0 (TID 180, localhost):
> java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.mkArray(ArrayBuilder.scala:448)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:493)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:441)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.build(ALS.scala:859)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1068)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> Driver stacktrace:
> at org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
> at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> at scala.Option.foreach(Option.scala:236)
> at
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1822)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1835)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1848)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1919)
> at org.apache.spark.rdd.RDD.count(RDD.scala:1121)
> at org.apache.spark.ml.recommendation.ALS$.train(ALS.scala:550)
> at org.apache.spark.mllib.recommendation.ALS.run(ALS.scala:239)
> at org.apache.spark.mllib.recommendation.ALS$.train(ALS.scala:328)
> at org.apache.spark.mllib.recommendation.ALS$.train(ALS.scala:346)
> at
> MovieLensALS$$anonfun$main$1$$anonfun$apply$mcVI$sp$1$$anonfun$apply$mcVD$sp$1.apply$mcVI$sp(MovieLensALS.scala:101)
> at
> MovieLensALS$$anonfun$main$1$$anonfun$apply$mcVI$sp$1$$anonfun$apply$mcVD$sp$1.apply(MovieLensALS.scala:100)
> at
> MovieLensALS$$anonfun$main$1$$anonfun$apply$mcVI$sp$1$$anonfun$apply$mcVD$sp$1.apply(MovieLensALS.scala:100)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at
> MovieLensALS$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.apply$mcVD$sp(MovieLensALS.scala:100)
> at
> MovieLensALS$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.apply(MovieLensALS.scala:100)
> at
> MovieLensALS$$anonfun$main$1$$anonfun$apply$mcVI$sp$1.apply(MovieLensALS.scala:100)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at MovieLensALS$$anonfun$main$1.apply$mcVI$sp(MovieLensALS.scala:100)
> at MovieLensALS$$anonfun$main$1.apply(MovieLensALS.scala:100)
> at MovieLensALS$$anonfun$main$1.apply(MovieLensALS.scala:100)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at MovieLensALS$.main(MovieLensALS.scala:100)
> at MovieLensALS.main(MovieLensALS.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.OutOfMemoryError: Java heap space
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.mkArray(ArrayBuilder.scala:448)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:493)
> at
> scala.collection.mutable.ArrayBuilder$ofFloat.result(ArrayBuilder.scala:441)
> at
> org.apache.spark.ml.recommendation.ALS$UncompressedInBlockBuilder.build(ALS.scala:859)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1068)
> at org.apache.spark.ml.recommendation.ALS$$anonfun$15.apply(ALS.scala:1062)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$41$$anonfun$apply$42.apply(PairRDDFunctions.scala:700)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 16/02/29 19:39:03 WARN QueuedThreadPool: 10 threads could not be stopped
> 16/02/29 19:39:04 WARN QueuedThreadPool: 7 threads could not be stopped
>
