[ https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15646638#comment-15646638 ]
zhengchenyu edited comment on YARN-5846 at 11/8/16 6:41 AM:
------------------------------------------------------------

Yeah! I think my suggestion may amount to a new scheduler. YARN-5139 is indeed a good idea, and I will follow that issue. Thank you for your recommendation!

As to this problem, I think a daemon thread could update the shares and maintain the ordering of the queues and applications. On each node, the requests would be ordered by this sequence. But I don't know which model is best. For example:
(1) Each node has one request RB tree. A daemon thread updates the ordering of the queues and applications, which in turn updates the ordering in the tree (this idea derives from the fair scheduler of the Linux kernel: the node is compared to the CPU, and the request is compared to the task). The leftmost node of the tree would then be the next request to assign.
(2) A global daemon thread updates every queue and application and calculates their shares. The share of each request on a node is multiplied by its priority, and all the requests are then sorted. We assign containers in this order.

Notes:
Why do I consider locality as primary? Because our cluster has a lot of hot data, and many nodes connect to it. Adding replicas blindly is unrealistic, and network bandwidth is our bottleneck. So I want to increase locality as far as possible. But high locality affects the rate at which containers are assigned.

> Improve the fairscheduler attemptScheduler
> -------------------------------------------
>
> Key: YARN-5846
> URL: https://issues.apache.org/jira/browse/YARN-5846
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Affects Versions: 2.7.1
> Environment: CentOS-7.1
> Reporter: zhengchenyu
> Priority: Minor
> Labels: fairscheduler
> Fix For: 2.7.1
>
> Original Estimate: 1m
> Remaining Estimate: 1m
>
> When we assign a container, we must consider two factors:
> (1) sort the queues and applications, and select the proper request;
> (2) then make sure this request's host is exactly this node (data locality),
> or else skip this loop!
> This algorithm regards the sorting of queues and applications as the primary factor.
> When YARN considers data locality, for example with
> yarn.scheduler.fair.locality.threshold.node=1 and
> yarn.scheduler.fair.locality.threshold.rack=1 (or when
> yarn.scheduler.fair.locality-delay-rack-ms and
> yarn.scheduler.fair.locality-delay-node-ms are very large), and lots of
> applications are running, the process of assigning containers becomes very slow.
> I think data locality is more important than the ordering of the queues and
> applications.
> I want a new algorithm like this:
> (1) When the ResourceManager accepts a new request, it notifies the RMNodeImpl,
> which records the association between the RMNode and the request.
> (2) When assigning containers for a node, we assign them directly from
> RMNodeImpl's association between RMNode and request.
> (3) Then I consider the priority of the queue and application. Within one
> RMNodeImpl object, we sort the associated requests.
> (4) I also think the sorting in the current algorithm is expensive. In
> particular, when lots of applications are running, sorting is invoked many
> times. So I think we should sort the queues and applications in a daemon
> thread, because a small error in the queues' ordering is acceptable.
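
To make the per-node ordering idea more concrete, here is a minimal sketch in Java. It is not FairScheduler code. NodeRequestIndex, Request, and the rank field are hypothetical names made up for illustration, and the sketch assumes that a lower rank means "serve first" and that some background thread (not shown) calls rerank() when the shares change. java.util.TreeSet is backed by a red-black tree, so pollFirst() plays the role of taking the leftmost node.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical per-node index of pending requests; not part of YARN. */
class NodeRequestIndex {

    /** Hypothetical pending request. Immutable so its position in the tree stays valid. */
    static final class Request {
        public final long id;
        public final String app;
        public final String host;
        public final double rank; // assumption: lower rank is assigned first

        Request(long id, String app, String host, double rank) {
            this.id = id;
            this.app = app;
            this.host = host;
            this.rank = rank;
        }
    }

    // Order by rank, then by id so that two requests with equal rank both stay in the set.
    private static final Comparator<Request> ORDER =
            Comparator.comparingDouble((Request r) -> r.rank)
                      .thenComparingLong(r -> r.id);

    // One sorted set (red-black tree) per host name.
    private final Map<String, TreeSet<Request>> byHost = new HashMap<>();

    /** Record the association between a node and a request when the request arrives. */
    public synchronized void add(Request r) {
        byHost.computeIfAbsent(r.host, h -> new TreeSet<>(ORDER)).add(r);
    }

    /**
     * Move a request to a new position. A background thread could call this after
     * recomputing shares. The old entry is removed before the new one is inserted,
     * because changing a key in place would corrupt the tree ordering.
     */
    public synchronized Request rerank(Request old, double newRank) {
        TreeSet<Request> set = byHost.get(old.host);
        if (set != null) {
            set.remove(old);
        }
        Request updated = new Request(old.id, old.app, old.host, newRank);
        add(updated);
        return updated;
    }

    /** Take the leftmost (lowest-rank) request for this host, or null if none is pending. */
    public synchronized Request pollNext(String host) {
        TreeSet<Request> set = byHost.get(host);
        return (set == null) ? null : set.pollFirst();
    }
}
```

With this layout, a node heartbeat only needs pollNext(host) and never re-sorts the queues and applications itself. The cost is that the ordering can be slightly stale between two rerank() passes, which is the small ordering error that point (4) accepts.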