Hi Sandy,

I appreciate your clear explanation. Let me try restating it; that's the best way to confirm I understand.
spark.executor.memory + spark.yarn.executor.memoryOverhead = the memory that
YARN will request for each executor container (the JVM heap plus non-heap
overhead)

spark.executor.memory = the memory I can actually use in my JVM application
    = part of it (spark.storage.memoryFraction) is reserved for caching
    + part of it (spark.shuffle.memoryFraction) is reserved for shuffling
    + the remainder is for bookkeeping & UDFs

If I am correct above, then one implication is:

(spark.executor.memory + spark.yarn.executor.memoryOverhead) * number of
executors per machine should be configured to be smaller than a single
machine's physical memory (see the worked example at the end of this message).

Right? Again, thanks!

Kelvin

On Fri, Feb 20, 2015 at 11:50 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> Hi Kelvin,
>
> spark.executor.memory controls the size of the executor heaps.
>
> spark.yarn.executor.memoryOverhead is the amount of memory to request from
> YARN beyond the heap size. This accounts for the fact that JVMs use some
> non-heap memory.
>
> The Spark heap is divided into spark.storage.memoryFraction (default 0.6)
> and spark.shuffle.memoryFraction (default 0.2), and the rest is for basic
> Spark bookkeeping and anything the user does inside UDFs.
>
> -Sandy
>
> On Fri, Feb 20, 2015 at 11:44 AM, Kelvin Chu <2dot7kel...@gmail.com> wrote:
>
>> Hi Sandy,
>>
>> I am also doing memory tuning on YARN. Just want to confirm, is it
>> correct to say:
>>
>> spark.executor.memory - spark.yarn.executor.memoryOverhead = the memory
>> I can actually use in my jvm application
>>
>> If it is not, what is the correct relationship? Any other variables or
>> config parameters in play? Thanks.
>>
>> Kelvin
>>
>> On Fri, Feb 20, 2015 at 9:45 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>
>>> If that's the error you're hitting, the fix is to boost
>>> spark.yarn.executor.memoryOverhead, which will put some extra room in
>>> between the executor heap sizes and the amount of memory requested for
>>> them from YARN.
>>>
>>> -Sandy
>>>
>>> On Fri, Feb 20, 2015 at 9:40 AM, lbierman <leebier...@gmail.com> wrote:
>>>
>>>> A bit more context on this issue, from the container logs on the
>>>> executor.
>>>>
>>>> Given my cluster specs above, what would be appropriate parameters to
>>>> pass into:
>>>> --num-executors --num-cores --executor-memory
>>>>
>>>> I had tried it with --executor-memory 2500MB
>>>>
>>>> 2015-02-20 06:50:09,056 WARN
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
>>>> Container [pid=23320,containerID=container_1423083596644_0238_01_004160]
>>>> is running beyond physical memory limits. Current usage: 2.8 GB of 2.7 GB
>>>> physical memory used; 4.4 GB of 5.8 GB virtual memory used. Killing
>>>> container.
>>>> Dump of the process-tree for container_1423083596644_0238_01_004160 :
>>>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>>>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>>>> |- 23320 23318 23320 23320 (bash) 0 0 108650496 305 /bin/bash -c
>>>> /usr/java/latest/bin/java -server -XX:OnOutOfMemoryError='kill %p'
>>>> -Xms2400m -Xmx2400m
>>>> -Djava.io.tmpdir=/dfs/yarn/nm/usercache/root/appcache/application_1423083596644_0238/container_1423083596644_0238_01_004160/tmp
>>>> -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1423083596644_0238/container_1423083596644_0238_01_004160
>>>> org.apache.spark.executor.CoarseGrainedExecutorBackend
>>>> akka.tcp://sparkDriver@ip-10-168-86-13.ec2.internal:42535/user/CoarseGrainedScheduler
>>>> 8 ip-10-99-162-56.ec2.internal 1 application_1423083596644_0238
>>>> 1> /var/log/hadoop-yarn/container/application_1423083596644_0238/container_1423083596644_0238_01_004160/stdout
>>>> 2> /var/log/hadoop-yarn/container/application_1423083596644_0238/container_1423083596644_0238_01_004160/stderr
>>>> |- 23323 23320 23320 23320 (java) 922271 12263 4612222976 724218
>>>> /usr/java/latest/bin/java -server -XX:OnOutOfMemoryError=kill %p
>>>> -Xms2400m -Xmx2400m
>>>> -Djava.io.tmpdir=/dfs/yarn/nm/usercache/root/appcache/application_1423083596644_0238/container_1423083596644_0238_01_004160/tmp
>>>> -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1423083596644_0238/container_1423083596644_0238_01_004160
>>>> org.apache.spark.executor.CoarseGrainedExecutorBackend
>>>> akka.tcp://sparkDriver@ip-10-168-86-13.ec2.internal:42535/user/Coarse
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Performance-on-Yarn-tp21729p21739.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>> For additional commands, e-mail: user-h...@spark.apache.org
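
P.S. To make my reading of this concrete, here is a rough per-machine sanity
check (a sketch in Python). The specific numbers, the max(384 MB, ~7% of heap)
overhead default, and the machine size are my assumptions for illustration,
not taken from this thread, so please correct me if the relationships are
wrong:

    # Sketch: per-machine memory budget for Spark-on-YARN executors.
    # All concrete numbers are hypothetical; only the relationships come
    # from the explanation above.
    executor_memory_mb = 2400        # spark.executor.memory (the JVM heap, -Xmx)
    # Assumed spark.yarn.executor.memoryOverhead default: max(384 MB, ~7% of heap)
    memory_overhead_mb = max(384, int(0.07 * executor_memory_mb))
    executors_per_machine = 4        # hypothetical
    physical_memory_mb = 16 * 1024   # hypothetical NodeManager machine

    # What YARN is asked to allocate per executor container, and per machine.
    container_mb = executor_memory_mb + memory_overhead_mb
    total_mb = container_mb * executors_per_machine

    # Inside each executor heap (the defaults Sandy mentioned):
    storage_mb = 0.6 * executor_memory_mb                    # spark.storage.memoryFraction
    shuffle_mb = 0.2 * executor_memory_mb                    # spark.shuffle.memoryFraction
    other_mb = executor_memory_mb - storage_mb - shuffle_mb  # bookkeeping + UDFs

    print(container_mb, total_mb, storage_mb, shuffle_mb, other_mb)
    # The per-machine total should stay below physical memory (and below
    # yarn.nodemanager.resource.memory-mb), leaving room for the OS and daemons.
    assert total_mb < physical_memory_mb

If I have the overhead default right, this also seems to line up with the log
above: a 2400m heap plus the 384 MB minimum overhead is roughly the 2.7 GB
container limit that was exceeded.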