I have YARN configured with yarn.nodemanager.vmem-check-enabled=false and
yarn.nodemanager.pmem-check-enabled=false to avoid YARN killing the containers.
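For reference, those are NodeManager properties in yarn-site.xml, i.e.
something like:

    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>
    <property>
      <name>yarn.nodemanager.pmem-check-enabled</name>
      <value>false</value>
    </property>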
The stack trace is below.
thanks, Antony.
15/01/27 17:02:53 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
15/01/27 17:02:53 ERROR executor.Executor: Exception in task 21.0 in stage 12.0 (TID 1312)
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.Integer.valueOf(Integer.java:642)
    at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:70)
    at scala.collection.mutable.ArrayOps$ofInt.apply(ArrayOps.scala:156)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofInt.foreach(ArrayOps.scala:156)
    at scala.collection.SeqLike$class.distinct(SeqLike.scala:493)
    at scala.collection.mutable.ArrayOps$ofInt.distinct(ArrayOps.scala:156)
    at org.apache.spark.mllib.recommendation.ALS.org$apache$spark$mllib$recommendation$ALS$$makeOutLinkBlock(ALS.scala:404)
    at org.apache.spark.mllib.recommendation.ALS$$anonfun$15.apply(ALS.scala:459)
    at org.apache.spark.mllib.recommendation.ALS$$anonfun$15.apply(ALS.scala:456)
    at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:614)
    at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:614)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:61)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:228)
    at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:130)
    at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:127)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
    at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:127)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.FlatMappedValuesRDD.compute(FlatMappedValuesRDD.scala:31)
15/01/27 17:02:53 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-8,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.Integer.valueOf(Integer.java:642)
    at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:70)
    at scala.collection.mutable.ArrayOps$ofInt.apply(ArrayOps.scala:156)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofInt.foreach(ArrayOps.scala:156)
    at scala.collection.SeqLike$class.distinct(SeqLike.scala:493)
    at scala.collection.mutable.ArrayOps$ofInt.distinct(ArrayOps.scala:156)
    at org.apache.spark.mllib.recommendation.ALS.org$apache$spark$mllib$recommendation$ALS$$makeOutLinkBlock(ALS.scala:404)
    at org.apache.spark.mllib.recommendation.ALS$$anonfun$15.apply(ALS.scala:459)
    at org.apache.spark.mllib.recommendation.ALS$$anonfun$15.apply(ALS.scala:456)
    at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:614)
    at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:614)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:61)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:228)
    at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:130)
    at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:127)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
    at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:127)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.FlatMappedValuesRDD.compute(FlatMappedValuesRDD.scala:31)
On Wednesday, 28 January 2015, 0:01, Guru Medasani <[email protected]>
wrote:
Can you attach the logs where this is failing?
From: Sven Krasser <[email protected]>
Date: Tuesday, January 27, 2015 at 4:50 PM
To: Guru Medasani <[email protected]>
Cc: Sandy Ryza <[email protected]>, Antony Mayi <[email protected]>,
"[email protected]" <[email protected]>
Subject: Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Since it's an executor running out of memory, this doesn't look to me like a
container being killed by YARN. As a starting point, can you repartition your
job into smaller tasks?
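For example, something like this (a sketch only; "ratings", the partition
count, and the ALS parameters are placeholders for your own values, and the
last trainImplicit argument sets the number of blocks ALS itself uses):

    import org.apache.spark.mllib.recommendation.ALS

    // Split the input into more, smaller partitions before training, so each
    // task holds less data in memory at once. 200 is illustrative; scale it
    // to your cluster.
    val repartitioned = ratings.repartition(200)

    // rank=10, iterations=10, lambda=0.01, blocks=200, alpha=40.0 are
    // placeholders, not recommendations.
    val model = ALS.trainImplicit(repartitioned, 10, 10, 0.01, 200, 40.0)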
-Sven
On Tue, Jan 27, 2015 at 2:34 PM, Guru Medasani <[email protected]> wrote:
Hi Antony,
What is yarn.nodemanager.resource.memory-mb set to, i.e. the total amount of
memory in MB that can be allocated to containers on your NodeManagers?
Can you check that configuration in the yarn-site.xml used by the NodeManager
process?
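It should look something like this (24576 here is just an illustrative value):

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>24576</value>
    </property>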
-Guru Medasani
From: Sandy Ryza <[email protected]>
Date: Tuesday, January 27, 2015 at 3:33 PM
To: Antony Mayi <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Hi Antony,
If you look in the YARN NodeManager logs, do you see that it's killing the
executors? Or are they crashing for a different reason?
-Sandy
On Tue, Jan 27, 2015 at 12:43 PM, Antony Mayi <[email protected]>
wrote:
Hi,
I am using spark.yarn.executor.memoryOverhead=8192, yet the executors crash
with this error.
Does that mean I genuinely don't have enough RAM, or is this a matter of
config tuning?
Other config options used: spark.storage.memoryFraction=0.3 and
SPARK_EXECUTOR_MEMORY=14G.
I am running Spark 1.2.0 as yarn-client on a cluster of 10 nodes (the workload
is ALS trainImplicit on a ~15GB dataset).
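Expressed as Spark configuration keys, those settings map to roughly this
(a sketch for clarity; SPARK_EXECUTOR_MEMORY corresponds to
spark.executor.memory):

    import org.apache.spark.SparkConf

    // Sketch of the settings described above.
    val conf = new SparkConf()
      .set("spark.executor.memory", "14g")               // SPARK_EXECUTOR_MEMORY=14G
      .set("spark.yarn.executor.memoryOverhead", "8192") // MB of headroom for the YARN container, on top of the heap
      .set("spark.storage.memoryFraction", "0.3")        // fraction of heap reserved for cached blocks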
thanks for any ideas, Antony.
--
http://sites.google.com/site/krasser/?utm_source=sig