[
https://issues.apache.org/jira/browse/YARN-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Syed Shameerur Rahman reassigned YARN-7327:
-------------------------------------------
Assignee: Syed Shameerur Rahman
> CapacityScheduler: Allocate containers asynchronously by default
> ----------------------------------------------------------------
>
> Key: YARN-7327
> URL: https://issues.apache.org/jira/browse/YARN-7327
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Craig Ingram
> Assignee: Syed Shameerur Rahman
> Priority: Trivial
> Attachments: async-scheduling-results.md, schedule-async.png,
> spark-on-yarn-schedule-async.ipynb, yarn-async-scheduling.png
>
>
> I was recently researching the startup time of Spark on YARN and observed
> slow, synchronous allocation of containers/executors. I am testing on a
> 4-node bare-metal cluster with 48 cores and 128 GB of memory per node. YARN
> was only allocating about 3 containers per second. Moreover, when starting 3
> Spark applications at the same time, each requesting 44 containers, the first
> application would get all 44 requested containers before the next
> application started receiving any, and so on.
>
> From looking at the code, it appears this is by design. There is an
> undocumented configuration variable that will enable asynchronous allocation
> of containers. I'm sure I'm missing something, but why is this not the
> default? Is there a bug or race condition in this code path? I've done some
> testing with it and it's been working and is significantly faster.
>
> Here's the config:
> `yarn.scheduler.capacity.schedule-asynchronously.enable`
>
> Any help understanding this would be appreciated.
>
> Thanks,
> Craig
>
> If you're curious about the performance difference with this setting, here
> are the results:
>
> The following tool was used for the benchmarks:
> https://github.com/SparkTC/spark-bench
> h2. async scheduler research
> The goal of this test is to determine if running Spark on YARN with async
> scheduling of containers reduces the amount of time required for an
> application to receive all of its requested resources. This setting should
> also reduce the overall runtime of short-lived applications/stages or
> notebook paragraphs. This setting could prove crucial to achieving optimal
> performance when sharing resources on a cluster with dynalloc enabled.
> h3. Test Setup
> The following setting in /etc/hadoop/conf/capacity-scheduler.xml must be
> changed (directly or through Ambari) between runs:
> `yarn.scheduler.capacity.schedule-asynchronously.enable=true|false`
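Switching the setting between runs could be scripted. The following is a minimal Python sketch, assuming capacity-scheduler.xml uses the usual Hadoop `<configuration>/<property>/<name>/<value>` layout. The helper `set_async_scheduling` and the `SAMPLE` document are hypothetical and not part of the original test setup. Only the property name comes from the issue text. ElementTree drops XML comments when parsing, and how the ResourceManager picks up the change is not covered here.

```python
import xml.etree.ElementTree as ET

# Property name taken from the issue text above.
PROPERTY = "yarn.scheduler.capacity.schedule-asynchronously.enable"


def set_async_scheduling(xml_text, enabled):
    """Return the configuration XML with PROPERTY set to 'true' or 'false'.

    Hypothetical helper: updates the property if it exists, adds it otherwise.
    """
    root = ET.fromstring(xml_text)
    value = "true" if enabled else "false"
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == PROPERTY:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property not present yet: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = PROPERTY
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Stand-in for the contents of capacity-scheduler.xml (illustrative only).
SAMPLE = """<configuration>
  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
  </property>
</configuration>"""

enabled_xml = set_async_scheduling(SAMPLE, True)
disabled_xml = set_async_scheduling(enabled_xml, False)
print(disabled_xml)
```

In practice the returned string would be written back to the configuration file before each benchmark run. Working on strings keeps the sketch free of file-system assumptions.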
> The conf files request executor counts of:
> * 2
> * 20
> * 50
> * 100
> The apps are submitted to the default queue on each cluster, which caps at
> 48 cores on dynalloc and 72 cores on baremetal. The default queue was
> expanded for the last two tests on baremetal so that it could potentially
> take advantage of all 144 cores.
> h3. Test Environments
> h4. dynalloc
> 4 VMs in Fyre (1 master, 3 workers)
> 8 CPUs/16 GB per node
> model name : QEMU Virtual CPU version 2.5+
> h4. baremetal
> 4 baremetal instances in Fyre (1 master, 3 workers)
> 48 CPUs/128GB per node
> model name : Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
> h3. Using spark-bench with timedsleep workload (sync)
> h4. dynalloc
> || requested containers || avg || stdev ||
> |2 | 23.814900 | 1.110725|
> |20 | 29.770250 | 0.830528|
> |50 | 44.486600 | 0.593516|
> |100 | 44.337700 | 0.490139|
> h4. baremetal - 2 queues splitting cluster 72 cores each
> || requested containers || avg || stdev ||
> |2 | 14.827000 | 0.292290|
> |20 | 19.613150 | 0.155421|
> |50 | 30.768400 | 0.083400|
> |100 | 40.931850 | 0.092160|
> h4. baremetal - 1 queue to rule them all - 144 cores
> || requested containers || avg || stdev ||
> |2 | 14.833050 | 0.334061|
> |20 | 19.575000 | 0.212836|
> |50 | 30.765350 | 0.111035|
> |100 | 41.763300 | 0.182700|
> h3. Using spark-bench with timedsleep workload (async)
> h4. dynalloc
> || requested containers || avg || stdev ||
> |2 | 22.575150 | 0.574296|
> |20 | 26.904150 | 1.244602|
> |50 | 44.721800 | 0.655388|
> |100 | 44.570000 | 0.514540|
> h5. 2nd run
> || requested containers || avg || stdev ||
> |2 | 22.441200 | 0.715875|
> |20 | 26.683400 | 0.583762|
> |50 | 44.227250 | 0.512568|
> |100 | 44.238750 | 0.329712|
> h4. baremetal - 2 queues splitting cluster 72 cores each
> || requested containers || avg || stdev ||
> |2 | 12.902350 | 0.125505|
> |20 | 13.830600 | 0.169598|
> |50 | 16.738050 | 0.265091|
> |100 | 40.654500 | 0.111417|
> h4. baremetal - 1 queue to rule them all - 144 cores
> || requested containers || avg || stdev ||
> |2 | 12.987150 | 0.118169|
> |20 | 13.837150 | 0.145871|
> |50 | 16.816300 | 0.253437|
> |100 | 23.113450 | 0.320744|
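To make the sync and async tables easier to compare, the relative change in the reported averages can be computed directly. The sketch below copies the `avg` columns from the tables above. The dynalloc comparison uses the first async run. The source does not state the unit of `avg`, so only percentages are derived. The helper name `percent_reduction` is illustrative.

```python
# Averages copied from the tables above (unit not stated in the source).
CONTAINERS = [2, 20, 50, 100]

# environment -> (sync averages, async averages), in the order of CONTAINERS
RESULTS = {
    "dynalloc (async run 1)": (
        [23.814900, 29.770250, 44.486600, 44.337700],
        [22.575150, 26.904150, 44.721800, 44.570000],
    ),
    "baremetal, 2 queues": (
        [14.827000, 19.613150, 30.768400, 40.931850],
        [12.902350, 13.830600, 16.738050, 40.654500],
    ),
    "baremetal, 1 queue": (
        [14.833050, 19.575000, 30.765350, 41.763300],
        [12.987150, 13.837150, 16.816300, 23.113450],
    ),
}


def percent_reduction(sync_avg, async_avg):
    """Percentage by which the async average is lower than the sync average.

    A negative value means the async run was slower.
    """
    return (sync_avg - async_avg) / sync_avg * 100.0


REDUCTIONS = {
    env: [percent_reduction(s, a) for s, a in zip(sync_avgs, async_avgs)]
    for env, (sync_avgs, async_avgs) in RESULTS.items()
}

for env, values in REDUCTIONS.items():
    for n, r in zip(CONTAINERS, values):
        print(f"{env:24s} {n:>4d} containers: {r:6.1f}% lower avg with async")
```

This compares averages only and ignores the reported standard deviations, so small differences (for example on dynalloc at 50 and 100 containers) should not be over-interpreted.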
--
This message was sent by Atlassian Jira
(v8.20.10#820010)