[jira] [Assigned] (YARN-7327) CapacityScheduler: Allocate containers asynchronously by default

Syed Shameerur Rahman (Jira) Sun, 03 Nov 2024 06:01:04 -0800


     [ 
https://issues.apache.org/jira/browse/YARN-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Syed Shameerur Rahman reassigned YARN-7327:
-------------------------------------------

    Assignee: Syed Shameerur Rahman

> CapacityScheduler: Allocate containers asynchronously by default
> ----------------------------------------------------------------
>
>                 Key: YARN-7327
>                 URL: https://issues.apache.org/jira/browse/YARN-7327
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Craig Ingram
>            Assignee: Syed Shameerur Rahman
>            Priority: Trivial
>         Attachments: async-scheduling-results.md, schedule-async.png, 
> spark-on-yarn-schedule-async.ipynb, yarn-async-scheduling.png
>
>
> I was recently researching the startup time of Spark on YARN and 
> observed slow, synchronous allocation of containers/executors. I am testing 
> on a 4-node bare-metal cluster with 48 cores and 128 GB of memory per node. YARN was 
> only allocating about 3 containers per second. Moreover, when starting 3 Spark 
> applications at the same time, each requesting 44 containers, the first 
> application would get all 44 requested containers before the next 
> application started getting containers, and so on.
>  
> From looking at the code, it appears this is by design. There is an 
> undocumented configuration variable that will enable asynchronous allocation 
> of containers. I'm sure I'm missing something, but why is this not the 
> default? Is there a bug or race condition in this code path? I've done some 
> testing with it and it's been working and is significantly faster.
>  
> Here's the config:
> `yarn.scheduler.capacity.schedule-asynchronously.enable`
>  
> Any help understanding this would be appreciated.
>  
> Thanks,
> Craig
>  
> If you're curious about the performance difference with this setting, here 
> are the results:
>  
> The following tool was used for the benchmarks:
> https://github.com/SparkTC/spark-bench
> h2. async scheduler research
> The goal of this test is to determine whether running Spark on YARN with async 
> scheduling of containers reduces the time required for an 
> application to receive all of its requested resources. This setting should 
> also reduce the overall runtime of short-lived applications/stages or 
> notebook paragraphs. It could prove crucial to achieving optimal 
> performance when sharing resources on a cluster with dynamic allocation (dynalloc) enabled.
> h3. Test Setup
> Between runs, update the following setting in /etc/hadoop/conf/capacity-scheduler.xml 
> (or through Ambari):  
> `yarn.scheduler.capacity.schedule-asynchronously.enable=true|false`
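
Flipping the property between runs could be scripted. The sketch below is only an illustration, not part of the original test harness: it assumes the usual Hadoop `<configuration>/<property>/<name>/<value>` layout, the helper name `set_property` is hypothetical, and the sample document is made up for the demonstration.

```python
import xml.etree.ElementTree as ET

# Property name taken from the issue description.
ASYNC_KEY = "yarn.scheduler.capacity.schedule-asynchronously.enable"


def set_property(xml_text: str, name: str, value: str) -> str:
    """Return xml_text with the property `name` set to `value`.

    Hypothetical helper: updates the existing <property> entry if one
    is present, otherwise appends a new one to <configuration>.
    """
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No matching property found: add a new entry.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample standing in for capacity-scheduler.xml.
sample = (
    "<configuration>"
    "<property>"
    "<name>yarn.scheduler.capacity.maximum-applications</name>"
    "<value>10000</value>"
    "</property>"
    "</configuration>"
)

enabled = set_property(sample, ASYNC_KEY, "true")
disabled = set_property(enabled, ASYNC_KEY, "false")
print(disabled)
```

Note that a round trip through `xml.etree.ElementTree` drops XML comments and the XML declaration, so a real config file with a license header would need more careful handling than this sketch provides.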
> The conf files request executor counts of:  
> * 2
> * 20
> * 50
> * 100
> The apps are submitted to the default queue on each cluster, which is capped 
> at 48 cores on dynalloc and 72 cores on baremetal. The default queue was 
> expanded for the last two tests on baremetal so that it could potentially take 
> advantage of all 144 cores.
> h3. Test Environments
> h4. dynalloc
> 4 VMs in Fyre (1 master, 3 workers)
> 8 CPUs/16 GB per node
> model name    : QEMU Virtual CPU version 2.5+  
> h4. baremetal
> 4 baremetal instances in Fyre (1 master, 3 workers)
> 48 CPUs/128GB per node
> model name    : Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz  
> h3. Using spark-bench with timedsleep workload (sync)
> h4. dynalloc
> || requested containers || avg || stdev ||
> |2 | 23.814900 | 1.110725|
> |20 | 29.770250 | 0.830528|
> |50 | 44.486600 | 0.593516|
> |100 | 44.337700 | 0.490139|
> h4. baremetal - 2 queues splitting cluster 72 cores each
> || requested containers || avg || stdev ||
> |2 | 14.827000 | 0.292290|
> |20 | 19.613150 | 0.155421|
> |50 | 30.768400 | 0.083400|
> |100 | 40.931850 | 0.092160|
> h4. baremetal - 1 queue to rule them all - 144 cores
> || requested containers || avg || stdev ||
> |2 | 14.833050 | 0.334061|
> |20 | 19.575000 | 0.212836|
> |50 | 30.765350 | 0.111035|
> |100 | 41.763300 | 0.182700|
> h3. Using spark-bench with timedsleep workload (async)
> h4. dynalloc
> || requested containers || avg || stdev ||
> |2 | 22.575150 | 0.574296|
> |20 | 26.904150 | 1.244602|
> |50 | 44.721800 | 0.655388|
> |100 | 44.570000 | 0.514540|
> h5. 2nd run  
> || requested containers || avg || stdev ||
> |2 | 22.441200 | 0.715875|
> |20 | 26.683400 | 0.583762|
> |50 | 44.227250 | 0.512568|
> |100 | 44.238750 | 0.329712|
> h4. baremetal - 2 queues splitting cluster 72 cores each
> || requested containers || avg || stdev ||
> |2 | 12.902350 | 0.125505|
> |20 | 13.830600 | 0.169598|
> |50 | 16.738050 | 0.265091|
> |100 | 40.654500 | 0.111417|
> h4. baremetal - 1 queue to rule them all - 144 cores
> || requested containers || avg || stdev ||
> |2 | 12.987150 | 0.118169|
> |20 | 13.837150 | 0.145871|
> |50 | 16.816300 | 0.253437|
> |100 | 23.113450 | 0.320744|
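
To make the sync/async comparison easier to read, the ratio of the two averages can be computed per row. The sketch below copies the `avg` values from the two "baremetal - 1 queue to rule them all - 144 cores" tables in the description. The units are not stated in the source (presumably seconds), and the names `SYNC_AVG`, `ASYNC_AVG`, and `speedups` are illustrative only.

```python
# Average times copied from the baremetal 144-core tables in the description.
# Units are not stated in the source; they are presumably seconds.
SYNC_AVG = {2: 14.833050, 20: 19.575000, 50: 30.765350, 100: 41.763300}
ASYNC_AVG = {2: 12.987150, 20: 13.837150, 50: 16.816300, 100: 23.113450}


def speedups(sync_avg, async_avg):
    """Ratio of sync to async average per requested container count."""
    return {n: sync_avg[n] / async_avg[n] for n in sync_avg}


SPEEDUP = speedups(SYNC_AVG, ASYNC_AVG)

for n, ratio in SPEEDUP.items():
    saved = SYNC_AVG[n] - ASYNC_AVG[n]
    print(f"{n:>3} containers: {ratio:.2f}x, {saved:.2f} lower average with async")
```

The same calculation applied to the other tables shows a much smaller gap in some rows, for example 40.931850 versus 40.654500 for 100 containers with two 72-core queues, which may reflect the queue cap rather than scheduling speed.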



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

