[ 
https://issues.apache.org/jira/browse/YARN-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025285#comment-17025285
 ] 

Thomas Graves commented on YARN-8200:
-------------------------------------

After experimenting with this a bit more, I removed the maximum allocation 
configurations after seeing that the 2.10 release documentation doesn't 
include them. Specifically, I removed this setting:

<property>
  <name>yarn.resource-types.yarn.io/gpu.maximum-allocation</name>
  <value>4</value>
</property>

It now appears that YARN doesn't allocate me a container unless it can 
fulfill all of the GPUs I requested. In this case my NodeManager has 4 GPUs, 
so if I request 5 the request just hangs, waiting to be fulfilled. This 
behavior is much better than giving me a container with fewer GPUs than I 
requested.
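
For illustration only, the all-or-nothing behavior described above can be 
modeled with a small toy sketch. This is not YARN code: the function name 
try_allocate, the node name "nm1", and the single-node layout are all 
hypothetical, and the real scheduler logic is considerably more involved.

```python
from typing import Dict, Optional


def try_allocate(requested_gpus: int, free_gpus: Dict[str, int]) -> Optional[str]:
    """Toy model: return a node that can satisfy the WHOLE request, else None.

    None stands for "the request stays pending"; a container with fewer
    GPUs than requested is never handed out.
    """
    for node, free in free_gpus.items():
        if free >= requested_gpus:
            return node
    return None


# Hypothetical cluster with one NodeManager that has 4 GPUs.
cluster = {"nm1": 4}
print(try_allocate(4, cluster))  # a node is found
print(try_allocate(5, cluster))  # no node fits, request stays pending
```

In this model a request for 5 GPUs never matches a node, which corresponds to 
the hang observed above rather than a partial allocation.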

 

> Backport resource types/GPU features to branch-3.0/branch-2
> -----------------------------------------------------------
>
>                 Key: YARN-8200
>                 URL: https://issues.apache.org/jira/browse/YARN-8200
>             Project: Hadoop YARN
>          Issue Type: Task
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>            Priority: Major
>              Labels: release-blocker
>             Fix For: 2.10.0
>
>         Attachments: YARN-8200-branch-2.001.patch, 
> YARN-8200-branch-2.002.patch, YARN-8200-branch-2.003.patch, 
> YARN-8200-branch-3.0.001.patch, 
> counter.scheduler.operation.allocate.csv.defaultResources, 
> counter.scheduler.operation.allocate.csv.gpuResources, synth_sls.json
>
>
> Currently we have a need for GPU scheduling on our YARN clusters to support 
> deep learning workloads. However, our main production clusters are running 
> older versions of branch-2 (2.7 in our case). To avoid supporting too many 
> very different Hadoop versions across multiple clusters, we would like to 
> backport the resource types/resource profiles feature to branch-2, as well as 
> the GPU-specific support.
>  
> We have done a trial backport of YARN-3926 and some miscellaneous patches in 
> YARN-7069 based on issues we uncovered, and the backport was fairly smooth. 
> We also did a trial backport of most of YARN-6223 (sans docker support).
>  
> Regarding the backports, perhaps we can do the development in a feature 
> branch and then merge to branch-2 when ready.



