I repartitioned the input RDD from 4,800 to 24,000 partitions. After that
the stage (24,000 tasks) was done in 22 min on 100 boxes.
Shuffle read/write: 905 GB / 710 GB.

Task metrics (Dur / GC / Read / Write):
Min: 7s / 1s / 38MB / 30MB
Med: 22s / 9s / 38MB / 30MB
Max: 1.8min / 1.6min / 38MB / 30MB
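For context, the repartition itself is a one-liner; here is a simplified
sketch (the app name and input path are placeholders, not our real job):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("heavy-job"))
    val inputRdd = sc.textFile("s3://placeholder-bucket/input")

    // 4,800 -> 24,000 partitions: with 905 GB shuffled, each task now
    // reads ~38 MB instead of ~190 MB, which keeps per-task scratch
    // space small.
    val repartitioned = inputRdd.repartition(24000)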
On Mon, Sep 21, 2015 at 5:55 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> The warning you're seeing in Spark is no issue. The scratch space lives
> inside the heap, so it'll never result in YARN killing the container by
> itself. The issue is that Spark is using some off-heap space on top of
> that.
>
> You'll need to bump the spark.yarn.executor.memoryOverhead property to
> give the executors some additional headroom above the heap space.
>
> -Sandy
>
> On Mon, Sep 21, 2015 at 5:43 PM, Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
>> I think you need to increase the executor memory size through the
>> command-line argument "--executor-memory", or the configuration
>> "spark.executor.memory".
>>
>> Also check yarn.scheduler.maximum-allocation-mb on the YARN side if
>> necessary.
>>
>> Thanks
>> Saisai
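Putting Sandy's and Saisai's suggestions together, the knobs would look
something like this (illustrative values, not what we actually run; the
same settings can also be passed as spark-submit arguments):

    import org.apache.spark.SparkConf

    // Sketch with made-up sizes: shrink the heap and grow the off-heap
    // headroom so heap + overhead still fits the YARN container limit.
    val conf = new SparkConf()
      .set("spark.executor.memory", "44g")               // JVM heap; same as --executor-memory
      .set("spark.yarn.executor.memoryOverhead", "6144") // off-heap headroom, in MB

On the YARN side, yarn.scheduler.maximum-allocation-mb has to be at least
heap plus overhead, or the container request won't be granted.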
>> On Mon, Sep 21, 2015 at 5:13 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>>
>>> I noticed that some executors have an issue with scratch space.
>>> I see the following in the YARN app container stderr around the time
>>> when YARN killed the executor because it was using too much memory.
>>>
>>> -- App container stderr --
>>> 15/09/21 21:43:22 WARN storage.MemoryStore: Not enough space to cache
>>> rdd_6_346 in memory! (computed 3.0 GB so far)
>>> 15/09/21 21:43:22 INFO storage.MemoryStore: Memory use = 477.6 KB
>>> (blocks) + 24.4 GB (scratch space shared across 8 thread(s)) = 24.4 GB.
>>> Storage limit = 25.2 GB.
>>> 15/09/21 21:43:22 WARN spark.CacheManager: Persisting partition
>>> rdd_6_346 to disk instead.
>>> 15/09/21 21:43:22 WARN storage.MemoryStore: Not enough space to cache
>>> rdd_6_49 in memory! (computed 3.1 GB so far)
>>> 15/09/21 21:43:22 INFO storage.MemoryStore: Memory use = 477.6 KB
>>> (blocks) + 24.4 GB (scratch space shared across 8 thread(s)) = 24.4 GB.
>>> Storage limit = 25.2 GB.
>>> 15/09/21 21:43:22 WARN spark.CacheManager: Persisting partition
>>> rdd_6_49 to disk instead.
>>>
>>> -- YARN NodeManager log --
>>> 2015-09-21 21:44:05,716 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
>>> (Container Monitor): Container
>>> [pid=5114,containerID=container_1442869100946_0001_01_000056]
>>> is running beyond physical memory limits. Current usage: 52.2 GB of
>>> 52 GB physical memory used; 53.0 GB of 260 GB virtual memory used.
>>> Killing container.
>>> Dump of the process-tree for container_1442869100946_0001_01_000056 :
>>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>>> |- 5117 5114 5114 5114 (java) 1322810 27563 56916316160 13682772
>>> /usr/lib/jvm/java-openjdk/bin/java -server -XX:OnOutOfMemoryError=kill %p
>>> -Xms47924m -Xmx47924m -verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC
>>> -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70
>>> -XX:+CMSClassUnloadingEnabled
>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1442869100946_0001/container_1442869100946_0001_01_000056/tmp
>>> -Dspark.akka.failure-detector.threshold=3000.0
>>> -Dspark.akka.heartbeat.interval=10000s -Dspark.akka.threads=4
>>> -Dspark.history.ui.port=18080 -Dspark.akka.heartbeat.pauses=60000s
>>> -Dspark.akka.timeout=1000s -Dspark.akka.frameSize=50
>>> -Dspark.driver.port=52690
>>> -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1442869100946_0001/container_1442869100946_0001_01_000056
>>> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
>>> akka.tcp://sparkDriver@10.0.24.153:52690/user/CoarseGrainedScheduler
>>> --executor-id 55 --hostname ip-10-0-28-96.ec2.internal --cores 8 --app-id
>>> application_1442869100946_0001 --user-class-path
>>> file:/mnt/yarn/usercache/hadoop/appcache/application_1442869100946_0001/container_1442869100946_0001_01_000056/__app__.jar
>>> |- 5114 5112 5114 5114 (bash) 0 0 9658368 291 /bin/bash -c
>>> /usr/lib/jvm/java-openjdk/bin/java -server -XX:OnOutOfMemoryError='kill %p'
>>> -Xms47924m -Xmx47924m '-verbose:gc' '-XX:+PrintGCDetails'
>>> '-XX:+PrintGCDateStamps' '-XX:+UseConcMarkSweepGC'
>>> '-XX:CMSInitiatingOccupancyFraction=70' '-XX:MaxHeapFreeRatio=70'
>>> '-XX:+CMSClassUnloadingEnabled'
>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1442869100946_0001/container_1442869100946_0001_01_000056/tmp
>>> '-Dspark.akka.failure-detector.threshold=3000.0'
>>> '-Dspark.akka.heartbeat.interval=10000s' '-Dspark.akka.threads=4'
>>> '-Dspark.history.ui.port=18080' '-Dspark.akka.heartbeat.pauses=60000s'
>>> '-Dspark.akka.timeout=1000s' '-Dspark.akka.frameSize=50'
>>> '-Dspark.driver.port=52690'
>>> -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1442869100946_0001/container_1442869100946_0001_01_000056
>>> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
>>> akka.tcp://sparkDriver@10.0.24.153:52690/user/CoarseGrainedScheduler
>>> --executor-id 55 --hostname ip-10-0-28-96.ec2.internal --cores 8 --app-id
>>> application_1442869100946_0001 --user-class-path
>>> file:/mnt/yarn/usercache/hadoop/appcache/application_1442869100946_0001/container_1442869100946_0001_01_000056/__app__.jar
>>> 1> /var/log/hadoop-yarn/containers/application_1442869100946_0001/container_1442869100946_0001_01_000056/stdout
>>> 2> /var/log/hadoop-yarn/containers/application_1442869100946_0001/container_1442869100946_0001_01_000056/stderr
>>>
>>> Is it possible to find out what the scratch space is used for?
>>>
>>> Which Spark setting should I try to adjust to solve the issue?
>>>
>>> On Thu, Sep 10, 2015 at 2:52 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>
>>>> YARN will never kill processes for being unresponsive.
>>>>
>>>> It may kill processes for occupying more memory than it allows. To get
>>>> around this, you can either bump spark.yarn.executor.memoryOverhead or
>>>> turn off the memory checks entirely with
>>>> yarn.nodemanager.pmem-check-enabled.
>>>>
>>>> -Sandy
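For the record, the 52 GB limit in the NodeManager log is exactly
spark.executor.memory (47924M) + spark.yarn.executor.memoryOverhead (5324M)
= 53248 MB = 52 GB from the settings further down the thread, so even a
little off-heap usage pushes the container over. If we took Sandy's second
option instead, I believe this is the yarn-site.xml property he means
(with the caveat that YARN then stops enforcing the physical memory limit
entirely):

    <property>
      <name>yarn.nodemanager.pmem-check-enabled</name>
      <value>false</value>
    </property>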
>>>> On Tue, Sep 8, 2015 at 10:48 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>>>>
>>>>> The problem we have now is skewed data (2,360 tasks done in 5 min,
>>>>> 3 tasks in 40 min, and 1 task in 2 hours).
>>>>>
>>>>> Some people on the team worry that the executor running the longest
>>>>> task could be killed by YARN (because the executor might be
>>>>> unresponsive due to GC, or might occupy more memory than YARN allows).
>>>>>
>>>>> On Tue, Sep 8, 2015 at 3:02 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>>>
>>>>>> Those settings seem reasonable to me.
>>>>>>
>>>>>> Are you observing performance that's worse than you would expect?
>>>>>>
>>>>>> -Sandy
>>>>>>
>>>>>> On Mon, Sep 7, 2015 at 11:22 AM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Sandy
>>>>>>>
>>>>>>> Thank you for your reply.
>>>>>>> Currently we use r3.2xlarge boxes (vCPU: 8, Mem: 61 GiB) with the
>>>>>>> EMR setting for Spark "maximizeResourceAllocation": "true".
>>>>>>>
>>>>>>> It is automatically converted to the Spark settings
>>>>>>> spark.executor.memory 47924M
>>>>>>> spark.yarn.executor.memoryOverhead 5324
>>>>>>>
>>>>>>> We also set spark.default.parallelism = slave_count * 16.
>>>>>>>
>>>>>>> Does that look good to you? (We run a single heavy job on the
>>>>>>> cluster.)
>>>>>>>
>>>>>>> Alex
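For concreteness, on our cluster maximizeResourceAllocation ends up
writing roughly the following into spark-defaults.conf (the parallelism
line is our own addition; 1600 assumes 100 slaves):

    spark.executor.memory               47924M
    spark.yarn.executor.memoryOverhead  5324
    # spark.default.parallelism = slave_count * 16 (here: 100 * 16)
    spark.default.parallelism           1600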
>>>>>>> On Mon, Sep 7, 2015 at 11:03 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>>>>>
>>>>>>>> Hi Alex,
>>>>>>>>
>>>>>>>> If they're both configured correctly, there's no reason that Spark
>>>>>>>> Standalone should provide a performance or memory improvement over
>>>>>>>> Spark on YARN.
>>>>>>>>
>>>>>>>> -Sandy
>>>>>>>>
>>>>>>>> On Fri, Sep 4, 2015 at 1:24 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Everyone
>>>>>>>>>
>>>>>>>>> We are trying the latest AWS emr-4.0.0 and Spark, and my question
>>>>>>>>> is about YARN vs Standalone mode.
>>>>>>>>> Our use case is:
>>>>>>>>> - start a 100-150 node cluster every week,
>>>>>>>>> - run one heavy Spark job (5-6 hours),
>>>>>>>>> - save data to S3,
>>>>>>>>> - stop the cluster.
>>>>>>>>>
>>>>>>>>> Officially, AWS emr-4.0.0 comes with Spark on YARN.
>>>>>>>>> It's probably possible to hack EMR with a bootstrap script which
>>>>>>>>> stops YARN and starts a master and slaves on each computer (to run
>>>>>>>>> Spark in standalone mode).
>>>>>>>>>
>>>>>>>>> My questions are:
>>>>>>>>> - Does Spark Standalone provide a significant performance / memory
>>>>>>>>> improvement in comparison to YARN mode?
>>>>>>>>> - Is it worth hacking the official EMR Spark on YARN to switch
>>>>>>>>> Spark to Standalone mode?
>>>>>>>>>
>>>>>>>>> I already created a comparison table and want you to check whether
>>>>>>>>> my understanding is correct.
>>>>>>>>>
>>>>>>>>> Let's say an r3.2xlarge computer has 52GB of RAM available for
>>>>>>>>> Spark executor JVMs.
>>>>>>>>>
>>>>>>>>> Standalone-to-YARN comparison:
>>>>>>>>>
>>>>>>>>>                                                     STDLN | YARN
>>>>>>>>> can executor allocate up to 52GB RAM              -  yes  | yes
>>>>>>>>> will executor be unresponsive after using all
>>>>>>>>>   52GB RAM because of GC                          -  yes  | yes
>>>>>>>>> additional JVMs on slave besides Spark executor   - workr | node mngr
>>>>>>>>> are additional JVMs lightweight                   -  yes  | yes
>>>>>>>>>
>>>>>>>>> Thank you
>>>>>>>>>
>>>>>>>>> Alex