I am sorry for the formatting error; the value for *yarn.scheduler.maximum-allocation-mb* is *28G*.
On Thu, Jan 15, 2015 at 11:31 AM, Nitin kak <nitinkak...@gmail.com> wrote:

Thanks for sticking with this thread.

I am guessing that the memory my app requests and the memory YARN requests on my behalf should be the same, and are determined by the value of *--executor-memory*, which I had set to *20G*. Or can the two values be different?

I checked the YARN configuration (below), so I think that fits well within the memory overhead limits.

    Container Memory Maximum (yarn.scheduler.maximum-allocation-mb), default 64 GiB:
    "The largest amount of physical memory, in MiB, that can be requested for a container."

On Thu, Jan 15, 2015 at 10:28 AM, Sean Owen <so...@cloudera.com> wrote:

Those settings aren't relevant, I think. You're concerned with what your app requests, and what Spark requests of YARN on your behalf. (Of course, you can't request more than what your cluster allows for a YARN container, for example, but that doesn't seem to be what is happening here.)

You do not want to omit --executor-memory if you need large executor memory heaps, since then you just request the default, and that is evidently not enough memory for your app.

Look at http://spark.apache.org/docs/latest/running-on-yarn.html and spark.yarn.executor.memoryOverhead. By default it's 7% of your 20G, or about 1.4G. You might set this higher, to 2G, to give more overhead.

See the --conf property=value syntax documented in http://spark.apache.org/docs/latest/submitting-applications.html
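For example, a submit command with an explicit overhead could look something like this (illustrative only; 2048 is the 2G suggested above expressed in megabytes, the unit this property takes, and the other flags are just carried over from your job):

    spark-submit --class ConnectedComponentsTest --master yarn-cluster \
      --num-executors 7 --executor-cores 1 --executor-memory 20g \
      --conf spark.yarn.executor.memoryOverhead=2048 \
      target/scala-2.10/connectedcomponentstest_2.10-1.0.jar

This only changes how much room YARN sets aside above each executor's JVM heap; the heap itself is still whatever --executor-memory requests.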
On Thu, Jan 15, 2015 at 3:47 AM, Nitin kak <nitinkak...@gmail.com> wrote:

Thanks Sean.

I guess Cloudera Manager has the parameters executor_total_max_heapsize and worker_max_heapsize, which point to the parameters you mentioned above.

How much should that cushion between the JVM heap size and the YARN memory limit be?

I tried setting the JVM memory to 20g and YARN to 24g, but it gave the same error as above.

Then, I removed the --executor-memory clause:

    spark-submit --class ConnectedComponentsTest --master yarn-cluster \
      --num-executors 7 --executor-cores 1 \
      target/scala-2.10/connectedcomponentstest_2.10-1.0.jar

That is now giving a "GC overhead limit exceeded" OutOfMemoryError:

    15/01/14 21:20:33 WARN channel.DefaultChannelPipeline: An exception was thrown by a user handler while handling an exception event ([id: 0x362d65d4, /10.1.1.33:35463 => /10.1.1.73:43389] EXCEPTION: java.lang.OutOfMemoryError: GC overhead limit exceeded)
    java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.lang.Object.clone(Native Method)
        at akka.util.CompactByteString$.apply(ByteString.scala:410)
        at akka.util.ByteString$.apply(ByteString.scala:22)
        at akka.remote.transport.netty.TcpHandlers$class.onMessage(TcpSupport.scala:45)
        at akka.remote.transport.netty.TcpServerHandler.onMessage(TcpSupport.scala:57)
        at akka.remote.transport.netty.NettyServerHelpers$class.messageReceived(NettyHelpers.scala:43)
        at akka.remote.transport.netty.ServerHandler.messageReceived(NettyTransport.scala:179)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    15/01/14 21:20:33 ERROR util.Utils: Uncaught exception in thread SparkListenerBus
    java.lang.OutOfMemoryError: GC overhead limit exceeded
        at scala.collection.mutable.ListBuffer.$plus$eq(ListBuffer.scala:168)
        at scala.collection.mutable.ListBuffer.$plus$eq(ListBuffer.scala:45)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.AbstractTraversable.map(Traversable.scala:105)
        at org.json4s.JsonDSL$class.seq2jvalue(JsonDSL.scala:68)
        at org.json4s.JsonDSL$.seq2jvalue(JsonDSL.scala:61)
        at org.apache.spark.util.JsonProtocol$$anonfun$jobStartToJson$3.apply(JsonProtocol.scala:127)
        at org.apache.spark.util.JsonProtocol$$anonfun$jobStartToJson$3.apply(JsonProtocol.scala:127)
        at org.json4s.JsonDSL$class.pair2jvalue(JsonDSL.scala:79)
        at org.json4s.JsonDSL$.pair2jvalue(JsonDSL.scala:61)
        at org.apache.spark.util.JsonProtocol$.jobStartToJson(JsonProtocol.scala:127)
        at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:59)
        at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:92)
        at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:118)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$3.apply(SparkListenerBus.scala:50)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$3.apply(SparkListenerBus.scala:50)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:83)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:81)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.SparkListenerBus$class.foreachListener(SparkListenerBus.scala:81)
        at org.apache.spark.scheduler.SparkListenerBus$class.postToAll(SparkListenerBus.scala:50)
        at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:32)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:56)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
    Exception in thread "SparkListenerBus" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at scala.collection.mutable.ListBuffer.$plus$eq(ListBuffer.scala:168)
        at scala.collection.mutable.ListBuffer.$plus$eq(ListBuffer.scala:45)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.AbstractTraversable.map(Traversable.scala:105)
        at org.json4s.JsonDSL$class.seq2jvalue(JsonDSL.scala:68)
        at org.json4s.JsonDSL$.seq2jvalue(JsonDSL.scala:61)
        at org.apache.spark.util.JsonProtocol$$anonfun$jobStartToJson$3.apply(JsonProtocol.scala:127)
        at org.apache.spark.util.JsonProtocol$$anonfun$jobStartToJson$3.apply(JsonProtocol.scala:127)
        at org.json4s.JsonDSL$class.pair2jvalue(JsonDSL.scala:79)
        at org.json4s.JsonDSL$.pair2jvalue(JsonDSL.scala:61)
        at org.apache.spark.util.JsonProtocol$.jobStartToJson(JsonProtocol.scala:127)
        at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:59)
        at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:92)
        at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:118)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$3.apply(SparkListenerBus.scala:50)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$3.apply(SparkListenerBus.scala:50)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:83)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:81)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.SparkListenerBus$class.foreachListener(SparkListenerBus.scala:81)
        at org.apache.spark.scheduler.SparkListenerBus$class.postToAll(SparkListenerBus.scala:50)
        at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:32)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:56)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)

On Wed, Jan 14, 2015 at 4:44 PM, Sean Owen <so...@cloudera.com> wrote:

That's not quite what that error means. Spark is not out of memory. It means that Spark is using more memory than it asked YARN for. That, in turn, is because the default amount of cushion established between the YARN-allowed container size and the JVM heap size is too small. See spark.yarn.executor.memoryOverhead in http://spark.apache.org/docs/latest/running-on-yarn.html

On Wed, Jan 14, 2015 at 9:18 PM, nitinkak001 <nitinkak...@gmail.com> wrote:

I am trying to run the connected components algorithm in Spark. The graph has roughly 28M edges and 3.2M vertices. Here is the code I am using:

    val inputFile = "/user/hive/warehouse/spark_poc.db/window_compare_output_text/000000_0"
    val conf = new SparkConf().setAppName("ConnectedComponentsTest")
    val sc = new SparkContext(conf)
    val graph = GraphLoader.edgeListFile(sc, inputFile, true, 7,
      StorageLevel.MEMORY_AND_DISK, StorageLevel.MEMORY_AND_DISK);
    graph.cache();
    val cc = graph.connectedComponents();
    graph.edges.saveAsTextFile("/user/kakn/output");

and here is the command:

    spark-submit --class ConnectedComponentsTest --master yarn-cluster \
      --num-executors 7 --driver-memory 6g --executor-memory 8g --executor-cores 1 \
      target/scala-2.10/connectedcomponentstest_2.10-1.0.jar

It runs for about an hour and then fails with the error below.
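To make that concrete with the numbers from the failure quoted below (illustrative arithmetic only; it assumes the default overhead is 7% of the heap with a floor of 384 MB, as described on the running-on-yarn page):

    AM/driver heap (--driver-memory)            6144 MB  (the -Xmx6144m in the process tree)
    default spark.yarn.driver.memoryOverhead    ~430 MB  (7% of 6144 MB)
    container granted by YARN, after rounding   ~6.5 GB

The killed container is the ApplicationMaster/driver container, at "6.5 GB of 6.5 GB physical memory used": the heap itself is fine, but heap plus off-heap usage (threads, NIO buffers, and so on) crosses the container ceiling, so YARN kills it. Raising the overhead, or lowering the heap, restores the cushion.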
*Isn't Spark supposed to spill to disk if the RDDs don't fit in memory?*

    Application application_1418082773407_8587 failed 2 times due to AM Container for appattempt_1418082773407_8587_000002 exited with exitCode: -104 due to: Container [pid=19790,containerID=container_1418082773407_8587_02_000001] is running beyond physical memory limits. Current usage: 6.5 GB of 6.5 GB physical memory used; 8.9 GB of 13.6 GB virtual memory used. Killing container.
    Dump of the process-tree for container_1418082773407_8587_02_000001 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 19790 19788 19790 19790 (bash) 0 0 110809088 336 /bin/bash -c /usr/java/jdk1.7.0_67-cloudera/bin/java -server -Xmx6144m -Djava.io.tmpdir=/mnt/DATA1/yarn/nm/usercache/kakn/appcache/application_1418082773407_8587/container_1418082773407_8587_02_000001/tmp '-Dspark.executor.memory=8g' '-Dspark.eventLog.enabled=true' '-Dspark.yarn.secondary.jars=' '-Dspark.app.name=ConnectedComponentsTest' '-Dspark.eventLog.dir=hdfs://<server-name-replaced>:8020/user/spark/applicationHistory' '-Dspark.master=yarn-cluster' org.apache.spark.deploy.yarn.ApplicationMaster --class 'ConnectedComponentsTest' --jar 'file:/home/kakn01/Spark/SparkSource/target/scala-2.10/connectedcomponentstest_2.10-1.0.jar' --executor-memory 8192 --executor-cores 1 --num-executors 7 1> /var/log/hadoop-yarn/container/application_1418082773407_8587/container_1418082773407_8587_02_000001/stdout 2> /var/log/hadoop-yarn/container/application_1418082773407_8587/container_1418082773407_8587_02_000001/stderr
    |- 19794 19790 19790 19790 (java) 205066 9152 9477726208 1707599 /usr/java/jdk1.7.0_67-cloudera/bin/java -server -Xmx6144m -Djava.io.tmpdir=/mnt/DATA1/yarn/nm/usercache/kakn/appcache/application_1418082773407_8587/container_1418082773407_8587_02_000001/tmp -Dspark.executor.memory=8g -Dspark.eventLog.enabled=true -Dspark.yarn.secondary.jars= -Dspark.app.name=ConnectedComponentsTest -Dspark.eventLog.dir=hdfs://<server-name-replaced>:8020/user/spark/applicationHistory -Dspark.master=yarn-cluster org.apache.spark.deploy.yarn.ApplicationMaster --class ConnectedComponentsTest --jar file:/home/kakn01/Spark/SparkSource/target/scala-2.10/connectedcomponentstest_2.10-1.0.jar --executor-memory 8192 --executor-cores 1 --num-executors 7
    Container killed on request. Exit code is 143
    Container exited with a non-zero exit code 143
    .Failing this attempt.. Failing the application.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-beyond-memory-limits-in-ConnectedComponents-tp21139.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.