[ 
https://issues.apache.org/jira/browse/YARN-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969514#comment-15969514
 ] 

Konstantinos Karanasos commented on YARN-6344:
----------------------------------------------

Hi [~Huangkx6810]. I just uploaded a patch that applies to branch-2.8. I 
built it and ran some tests locally, and it looks good.

You cannot apply this new patch directly to 2.7, though. The main difference 
between 2.7 and 2.8 is that in 2.7 the changes I made to 
{{RegularContainerAllocator}} have to be applied directly to {{LeafQueue}} 
instead. Take a look at the patch to see what I mean, and if you need help 
with 2.7, let me know.

> Add parameter for rack locality delay in CapacityScheduler
> ----------------------------------------------------------
>
>                 Key: YARN-6344
>                 URL: https://issues.apache.org/jira/browse/YARN-6344
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Konstantinos Karanasos
>            Assignee: Konstantinos Karanasos
>             Fix For: 2.9.0, 3.0.0-alpha3
>
>         Attachments: YARN-6344.001.patch, YARN-6344.002.patch, 
> YARN-6344.003.patch, YARN-6344.004.patch, YARN-6344-branch-2.8.patch
>
>
> When relaxing locality from node to rack, the {{node-locality-delay}} 
> parameter is used: when the scheduling opportunities for a scheduler key 
> exceed the value of this parameter, we relax locality and try to assign the 
> container to a node in the corresponding rack.
> On the other hand, when relaxing locality to off-switch (i.e., assigning the 
> container anywhere in the cluster), we use a {{localityWaitFactor}}, 
> computed as the number of outstanding requests for a specific scheduler key 
> divided by the size of the cluster.
> For applications that request containers in big batches (e.g., 
> traditional MR jobs) on relatively small clusters, the 
> {{localityWaitFactor}} does not affect relaxing locality much.
> However, for applications that request containers in small batches, 
> this wait factor takes a very small value, which leads to assigning 
> off-switch containers too soon. The situation is even more pronounced in big 
> clusters.
> For example, if an application requests only one container per request, the 
> locality will be relaxed after a single missed scheduling opportunity.
> The purpose of this JIRA is to rethink the way we are relaxing locality for 
> off-switch assignments.
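
As a rough illustration of the off-switch behaviour described in the issue, 
the Java sketch below models the wait factor and the resulting relaxation 
decision. It is not the CapacityScheduler code: the class and method names are 
hypothetical, and both the cap of the factor at 1.0 and the comparison against 
outstandingRequests * localityWaitFactor are assumptions made for the sake of 
the example.

```java
// Simplified model of the off-switch check described in the issue text.
// Not the actual CapacityScheduler implementation; all names are hypothetical.
final class OffSwitchRelaxSketch {

    // Wait factor: outstanding requests for a scheduler key divided by the
    // cluster size. Capping at 1.0 is an assumption of this sketch.
    static double localityWaitFactor(int outstandingRequests, int clusterNodes) {
        if (clusterNodes <= 0) {
            throw new IllegalArgumentException("clusterNodes must be positive");
        }
        return Math.min((double) outstandingRequests / clusterNodes, 1.0);
    }

    // Assumed rule: relax to off-switch once the missed scheduling
    // opportunities exceed outstandingRequests * localityWaitFactor.
    static boolean shouldRelaxToOffSwitch(int outstandingRequests,
                                          int clusterNodes,
                                          long missedOpportunities) {
        double threshold = outstandingRequests
                * localityWaitFactor(outstandingRequests, clusterNodes);
        return missedOpportunities > threshold;
    }

    public static void main(String[] args) {
        // One container per request on a 100-node cluster, one missed opportunity.
        System.out.println(shouldRelaxToOffSwitch(1, 100, 1));
        // A batch of 1000 requests on the same cluster, one missed opportunity.
        System.out.println(shouldRelaxToOffSwitch(1000, 100, 1));
    }
}
```

Under these assumptions, a single outstanding request on a 100-node cluster 
gives a threshold of 0.01, so locality is relaxed after one missed 
opportunity, while a batch of 1000 requests keeps waiting much longer.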



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

