I repartitioned the input RDD from 4,800 to 24,000 partitions. After that
the stage (24,000 tasks) was done in 22 min on 100 boxes.
Shuffle read/write: 905 GB / 710 GB.

Task metrics (Dur / GC / Read / Write):
Min: 7s / 1s / 38MB / 30MB
Med: 22s / 9s / 38MB / 30MB
Max: 1.8min / 1.6min / 38MB / 30MB
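For context, the repartition itself is a one-liner; here is a simplified
sketch (the app name and input path are placeholders, not our real job):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("heavy-job"))
    val inputRdd = sc.textFile("s3://placeholder-bucket/input")

    // 4,800 -> 24,000 partitions: with 905 GB shuffled, each task now
    // reads ~38 MB instead of ~190 MB, which keeps per-task scratch
    // space small.
    val repartitioned = inputRdd.repartition(24000)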
On Mon, Sep 21, 2015 at 5:55 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> The warning you're seeing in Spark is no issue. The scratch space lives
> inside the heap, so it'll never result in YARN killing the container by
> itself. The issue is that Spark is using some off-heap space on top of
> that.
>
> You'll need to bump the spark.yarn.executor.memoryOverhead property to
> give the executors some additional headroom above the heap space.
>
> -Sandy
>
> On Mon, Sep 21, 2015 at 5:43 PM, Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
>> I think you need to increase the executor memory size through the
>> command-line argument "--executor-memory", or the configuration
>> "spark.executor.memory".
>>
>> Also check yarn.scheduler.maximum-allocation-mb on the YARN side if
>> necessary.
>>
>> Thanks
>> Saisai
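Putting Sandy's and Saisai's suggestions together, the knobs would look
something like this (illustrative values, not what we actually run; the
same settings can also be passed as spark-submit arguments):

    import org.apache.spark.SparkConf

    // Sketch with made-up sizes: shrink the heap and grow the off-heap
    // headroom so heap + overhead still fits the YARN container limit.
    val conf = new SparkConf()
      .set("spark.executor.memory", "44g")               // JVM heap; same as --executor-memory
      .set("spark.yarn.executor.memoryOverhead", "6144") // off-heap headroom, in MB

On the YARN side, yarn.scheduler.maximum-allocation-mb has to be at least
heap plus overhead, or the container request won't be granted.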
>> On Mon, Sep 21, 2015 at 5:13 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>>
>>> I noticed that some executors have an issue with scratch space.
>>> I see the following in the YARN app container stderr around the time
>>> when YARN killed the executor because it was using too much memory.
>>>
>>> -- App container stderr --
>>> 15/09/21 21:43:22 WARN storage.MemoryStore: Not enough space to cache
>>> rdd_6_346 in memory! (computed 3.0 GB so far)
>>> 15/09/21 21:43:22 INFO storage.MemoryStore: Memory use = 477.6 KB
>>> (blocks) + 24.4 GB (scratch space shared across 8 thread(s)) = 24.4 GB.
>>> Storage limit = 25.2 GB.
>>> 15/09/21 21:43:22 WARN spark.CacheManager: Persisting partition
>>> rdd_6_346 to disk instead.
>>> 15/09/21 21:43:22 WARN storage.MemoryStore: Not enough space to cache
>>> rdd_6_49 in memory! (computed 3.1 GB so far)
>>> 15/09/21 21:43:22 INFO storage.MemoryStore: Memory use = 477.6 KB
>>> (blocks) + 24.4 GB (scratch space shared across 8 thread(s)) = 24.4 GB.
>>> Storage limit = 25.2 GB.
>>> 15/09/21 21:43:22 WARN spark.CacheManager: Persisting partition
>>> rdd_6_49 to disk instead.
>>>
>>> -- YARN NodeManager log --
>>> 2015-09-21 21:44:05,716 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
>>> (Container Monitor): Container
>>> [pid=5114,containerID=container_1442869100946_0001_01_000056]
>>> is running beyond physical memory limits. Current usage: 52.2 GB of
>>> 52 GB physical memory used; 53.0 GB of 260 GB virtual memory used.
>>> Killing container.
>>> Dump of the process-tree for container_1442869100946_0001_01_000056 :
>>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>>> |- 5117 5114 5114 5114 (java) 1322810 27563 56916316160 13682772
>>> /usr/lib/jvm/java-openjdk/bin/java -server -XX:OnOutOfMemoryError=kill %p
>>> -Xms47924m -Xmx47924m -verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC
>>> -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70
>>> -XX:+CMSClassUnloadingEnabled
>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1442869100946_0001/container_1442869100946_0001_01_000056/tmp
>>> -Dspark.akka.failure-detector.threshold=3000.0
>>> -Dspark.akka.heartbeat.interval=10000s -Dspark.akka.threads=4
>>> -Dspark.history.ui.port=18080 -Dspark.akka.heartbeat.pauses=60000s
>>> -Dspark.akka.timeout=1000s -Dspark.akka.frameSize=50
>>> -Dspark.driver.port=52690
>>> -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1442869100946_0001/container_1442869100946_0001_01_000056
>>> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
>>> akka.tcp://sparkDriver@10.0.24.153:52690/user/CoarseGrainedScheduler
>>> --executor-id 55 --hostname ip-10-0-28-96.ec2.internal --cores 8 --app-id
>>> application_1442869100946_0001 --user-class-path
>>> file:/mnt/yarn/usercache/hadoop/appcache/application_1442869100946_0001/container_1442869100946_0001_01_000056/__app__.jar
>>> |- 5114 5112 5114 5114 (bash) 0 0 9658368 291 /bin/bash -c
>>> /usr/lib/jvm/java-openjdk/bin/java -server -XX:OnOutOfMemoryError='kill %p'
>>> -Xms47924m -Xmx47924m '-verbose:gc' '-XX:+PrintGCDetails'
>>> '-XX:+PrintGCDateStamps' '-XX:+UseConcMarkSweepGC'
>>> '-XX:CMSInitiatingOccupancyFraction=70' '-XX:MaxHeapFreeRatio=70'
>>> '-XX:+CMSClassUnloadingEnabled'
>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1442869100946_0001/container_1442869100946_0001_01_000056/tmp
>>> '-Dspark.akka.failure-detector.threshold=3000.0'
>>> '-Dspark.akka.heartbeat.interval=10000s' '-Dspark.akka.threads=4'
>>> '-Dspark.history.ui.port=18080' '-Dspark.akka.heartbeat.pauses=60000s'
>>> '-Dspark.akka.timeout=1000s' '-Dspark.akka.frameSize=50'
>>> '-Dspark.driver.port=52690'
>>> -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1442869100946_0001/container_1442869100946_0001_01_000056
>>> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
>>> akka.tcp://sparkDriver@10.0.24.153:52690/user/CoarseGrainedScheduler
>>> --executor-id 55 --hostname ip-10-0-28-96.ec2.internal --cores 8 --app-id
>>> application_1442869100946_0001 --user-class-path
>>> file:/mnt/yarn/usercache/hadoop/appcache/application_1442869100946_0001/container_1442869100946_0001_01_000056/__app__.jar
>>> 1> /var/log/hadoop-yarn/containers/application_1442869100946_0001/container_1442869100946_0001_01_000056/stdout
>>> 2> /var/log/hadoop-yarn/containers/application_1442869100946_0001/container_1442869100946_0001_01_000056/stderr
>>>
>>> Is it possible to find out what the scratch space is used for?
>>>
>>> Which Spark setting should I try to adjust to solve the issue?
>>>
>>> On Thu, Sep 10, 2015 at 2:52 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>
>>>> YARN will never kill processes for being unresponsive.
>>>>
>>>> It may kill processes for occupying more memory than it allows. To get
>>>> around this, you can either bump spark.yarn.executor.memoryOverhead or
>>>> turn off the memory checks entirely with
>>>> yarn.nodemanager.pmem-check-enabled.
>>>>
>>>> -Sandy
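For the record, the 52 GB limit in the NodeManager log is exactly
spark.executor.memory (47924M) + spark.yarn.executor.memoryOverhead (5324M)
= 53248 MB = 52 GB from the settings further down the thread, so even a
little off-heap usage pushes the container over. If we took Sandy's second
option instead, I believe this is the yarn-site.xml property he means
(with the caveat that YARN then stops enforcing the physical memory limit
entirely):

    <property>
      <name>yarn.nodemanager.pmem-check-enabled</name>
      <value>false</value>
    </property>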
>>>> On Tue, Sep 8, 2015 at 10:48 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>>>>
>>>>> The problem we have now is skewed data (2,360 tasks done in 5 min,
>>>>> 3 tasks in 40 min, and 1 task in 2 hours).
>>>>>
>>>>> Some people on the team worry that the executor running the longest
>>>>> task could be killed by YARN (because the executor might be
>>>>> unresponsive due to GC, or might occupy more memory than YARN allows).
>>>>>
>>>>> On Tue, Sep 8, 2015 at 3:02 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>>>
>>>>>> Those settings seem reasonable to me.
>>>>>>
>>>>>> Are you observing performance that's worse than you would expect?
>>>>>>
>>>>>> -Sandy
>>>>>>
>>>>>> On Mon, Sep 7, 2015 at 11:22 AM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Sandy
>>>>>>>
>>>>>>> Thank you for your reply.
>>>>>>> Currently we use r3.2xlarge boxes (vCPU: 8, Mem: 61 GiB) with the
>>>>>>> EMR setting for Spark "maximizeResourceAllocation": "true".
>>>>>>>
>>>>>>> It is automatically converted to the Spark settings
>>>>>>> spark.executor.memory 47924M
>>>>>>> spark.yarn.executor.memoryOverhead 5324
>>>>>>>
>>>>>>> We also set spark.default.parallelism = slave_count * 16.
>>>>>>>
>>>>>>> Does that look good to you? (We run a single heavy job on the
>>>>>>> cluster.)
>>>>>>>
>>>>>>> Alex
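For concreteness, on our cluster maximizeResourceAllocation ends up
writing roughly the following into spark-defaults.conf (the parallelism
line is our own addition; 1600 assumes 100 slaves):

    spark.executor.memory               47924M
    spark.yarn.executor.memoryOverhead  5324
    # spark.default.parallelism = slave_count * 16 (here: 100 * 16)
    spark.default.parallelism           1600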
>>>>>>> On Mon, Sep 7, 2015 at 11:03 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>>>>>
>>>>>>>> Hi Alex,
>>>>>>>>
>>>>>>>> If they're both configured correctly, there's no reason that Spark
>>>>>>>> Standalone should provide a performance or memory improvement over
>>>>>>>> Spark on YARN.
>>>>>>>>
>>>>>>>> -Sandy
>>>>>>>>
>>>>>>>> On Fri, Sep 4, 2015 at 1:24 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Everyone
>>>>>>>>>
>>>>>>>>> We are trying the latest AWS emr-4.0.0 and Spark, and my question
>>>>>>>>> is about YARN vs Standalone mode.
>>>>>>>>> Our use case is:
>>>>>>>>> - start a 100-150 node cluster every week,
>>>>>>>>> - run one heavy Spark job (5-6 hours),
>>>>>>>>> - save data to S3,
>>>>>>>>> - stop the cluster.
>>>>>>>>>
>>>>>>>>> Officially, AWS emr-4.0.0 comes with Spark on YARN.
>>>>>>>>> It's probably possible to hack EMR with a bootstrap script which
>>>>>>>>> stops YARN and starts a master and slaves on each computer (to run
>>>>>>>>> Spark in standalone mode).
>>>>>>>>>
>>>>>>>>> My questions are:
>>>>>>>>> - Does Spark Standalone provide a significant performance / memory
>>>>>>>>> improvement in comparison to YARN mode?
>>>>>>>>> - Is it worth hacking the official EMR Spark on YARN to switch
>>>>>>>>> Spark to Standalone mode?
>>>>>>>>>
>>>>>>>>> I already created a comparison table and want you to check whether
>>>>>>>>> my understanding is correct.
>>>>>>>>>
>>>>>>>>> Let's say an r3.2xlarge computer has 52GB of RAM available for
>>>>>>>>> Spark executor JVMs.
>>>>>>>>>
>>>>>>>>> Standalone-to-YARN comparison:
>>>>>>>>>
>>>>>>>>>                                                     STDLN | YARN
>>>>>>>>> can executor allocate up to 52GB RAM              -  yes  | yes
>>>>>>>>> will executor be unresponsive after using all
>>>>>>>>>   52GB RAM because of GC                          -  yes  | yes
>>>>>>>>> additional JVMs on slave besides Spark executor   - workr | node mngr
>>>>>>>>> are additional JVMs lightweight                   -  yes  | yes
>>>>>>>>>
>>>>>>>>> Thank you
>>>>>>>>>
>>>>>>>>> Alex