[ 
https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15650458#comment-15650458
 ] 

zhangyubiao commented on YARN-5846:
-----------------------------------

:)

> Improve the fairscheduler attemptScheduler 
> -------------------------------------------
>
>                 Key: YARN-5846
>                 URL: https://issues.apache.org/jira/browse/YARN-5846
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 2.7.1
>         Environment: CentOS-7.1
>            Reporter: zhengchenyu
>            Priority: Critical
>              Labels: fairscheduler
>             Fix For: 2.7.1
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When we assign a container, we must consider two factors:
>     (1) Sort the queues and applications, and select the proper request.
>     (2) Then check that this request's host is this node (data locality);
> otherwise skip this iteration.
> This algorithm treats the ordering of queues and applications as the primary factor.
> When YARN considers data locality, for example with
> yarn.scheduler.fair.locality.threshold.node=1 and
> yarn.scheduler.fair.locality.threshold.rack=1 (or when
> yarn.scheduler.fair.locality-delay-rack-ms and
> yarn.scheduler.fair.locality-delay-node-ms are very large), and lots of
> applications are running, the process of assigning containers becomes very slow.
> I think data locality is more important than the order of the queues and
> applications.
> I would like a new algorithm like this:
>       (1) When the ResourceManager accepts a new request, it notifies the
> RMNodeImpl and records the association between the RMNode and the request.
>       (2) When assigning containers for a node, we assign them directly from
> the RMNodeImpl's association between RMNode and request.
>       (3) Then we consider the priority of the queue and application: within
> each RMNodeImpl object, we sort the associated requests.
>       (4) I think the sorting in the current algorithm is expensive,
> especially when lots of applications are running and sorting is invoked many
> times. So I think we should sort the queues and applications in a daemon
> thread, because a small error in the queue order is acceptable.
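Steps (1) to (3) of the proposal quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from YARN or from any patch on this issue: NodeRequestIndex, PendingRequest, record, pollFor and the integer rank are all hypothetical names, and the rank is assumed to be computed elsewhere (for example by the background sorting thread of step (4)).

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical per-node index of pending requests; not a YARN class. */
class NodeRequestIndex {

    /** Hypothetical stand-in for a resource request; a lower rank is served first. */
    static final class PendingRequest {
        final String appId;
        final int rank; // assumed to come from the queue/application ordering

        PendingRequest(String appId, int rank) {
            this.appId = appId;
            this.rank = rank;
        }
    }

    private static final Comparator<PendingRequest> BY_RANK =
            Comparator.comparingInt(r -> r.rank);

    // host name -> requests that asked for that host, ordered by rank
    private final Map<String, PriorityQueue<PendingRequest>> byNode = new HashMap<>();

    /** Step (1): record the association when a request naming this host arrives. */
    synchronized void record(String host, PendingRequest request) {
        byNode.computeIfAbsent(host, h -> new PriorityQueue<>(BY_RANK)).add(request);
    }

    /**
     * Steps (2) and (3): when a node is ready for assignment, take the
     * best-ranked request recorded for it. Returns null if there is none.
     */
    synchronized PendingRequest pollFor(String host) {
        PriorityQueue<PendingRequest> queue = byNode.get(host);
        return queue == null ? null : queue.poll();
    }
}
```

With this layout a node only looks at requests that named it, instead of walking every sorted queue and application first. Note that the rank is fixed when the request is recorded; if a background thread recomputes the ordering, the affected entries would have to be removed and re-added.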



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

