[ https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15637617#comment-15637617 ]
Wangda Tan commented on YARN-5139:
----------------------------------

Thanks [~curino] for the comment, very good point. Basically I agree with everything you mentioned.

The existing patch attached to YARN-5716 already considers part of this. The {{ResourceCommitterService}} has a check:

{code}
// Don't run schedule if we have some pending backlogs already
if (cs.getAsyncSchedulingPendingBacklogs() > 100) {
  Thread.sleep(1);
} else {
  schedule(cs);
}
{code}

The value returned by {{getAsyncSchedulingPendingBacklogs}} can directly affect fairness. We could change the hard-coded 100 to a configurable value. And of course we can do better than this: for example, we can calculate allocated + potential_allocated for each queue and stop allocating containers to a queue that is too far above its expected fair share.

Considering the size of the patch attached to YARN-5716, it might be better to improve this part in a separate patch. Does that sound like a reasonable plan? Please feel free to let me know if you have any other thoughts. I plan to commit YARN-5716 if you think it is fine.

Thanks,

> [Umbrella] Move YARN scheduler towards global scheduler
> -------------------------------------------------------
>
>                 Key: YARN-5139
>                 URL: https://issues.apache.org/jira/browse/YARN-5139
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: Explanantions of Global Scheduling (YARN-5139)
> Implementation.pdf, YARN-5139-Concurrent-scheduling-performance-report.pdf,
> YARN-5139-Global-Schedulingd-esign-and-implementation-notes-v2.pdf,
> YARN-5139-Global-Schedulingd-esign-and-implementation-notes.pdf,
> YARN-5139.000.patch, wip-1.YARN-5139.patch, wip-2.YARN-5139.patch,
> wip-3.YARN-5139.patch, wip-4.YARN-5139.patch, wip-5.YARN-5139.patch
>
>
> The existing YARN scheduler is based on node heartbeats. This can lead to
> sub-optimal decisions because the scheduler can only look at one node at a
> time when scheduling resources.
> Pseudo code of the existing scheduling logic looks like:
> {code}
> for node in allNodes:
>    Go to parentQueue
>       Go to leafQueue
>         for application in leafQueue.applications:
>            for resource-request in application.resource-requests:
>               try to schedule on node
> {code}
> Considering future complex resource placement requirements, such as node
> constraints (give me "a && b || c") or anti-affinity (do not allocate HBase
> regionservers and Storm workers on the same host), we may need to consider
> moving the YARN scheduler towards global scheduling.
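
As a rough illustration of the two ideas in the comment (making the backlog threshold of 100 configurable, and not allocating to a queue whose allocated + potential_allocated is too far above its fair share), a minimal sketch might look like the following. Everything here is hypothetical: the class {{BacklogThrottleSketch}}, its method names, and the {{overShareTolerance}} parameter are assumptions made for illustration and are not taken from the YARN-5716 patch or the CapacityScheduler API.

```java
// Hypothetical sketch only; none of these names exist in the YARN code base.
final class BacklogThrottleSketch {
    // Assumed default, mirroring the hard-coded 100 in the quoted check.
    static final int DEFAULT_MAX_PENDING_BACKLOGS = 100;

    private final int maxPendingBacklogs;
    // Assumed knob: fraction above fair share that is still tolerated (0.1 = 10%).
    private final double overShareTolerance;

    BacklogThrottleSketch(int maxPendingBacklogs, double overShareTolerance) {
        if (maxPendingBacklogs < 0 || overShareTolerance < 0) {
            throw new IllegalArgumentException("thresholds must be non-negative");
        }
        this.maxPendingBacklogs = maxPendingBacklogs;
        this.overShareTolerance = overShareTolerance;
    }

    // Same condition as the quoted check, with the threshold made configurable:
    // scheduling proceeds only while the backlog does not exceed the threshold.
    boolean shouldSchedule(int pendingBacklogs) {
        return pendingBacklogs <= maxPendingBacklogs;
    }

    // True when allocated + potential_allocated exceeds the queue's fair share
    // by more than the tolerated fraction, i.e. the queue should be skipped.
    boolean isTooFarAboveFairShare(long allocated, long potentialAllocated, long fairShare) {
        double limit = fairShare * (1.0 + overShareTolerance);
        return (double) (allocated + potentialAllocated) > limit;
    }
}
```

A caller would presumably read both values from configuration and consult {{shouldSchedule}} before each scheduling pass, then {{isTooFarAboveFairShare}} per queue inside the pass. How "potential_allocated" is actually tracked is left open here, as it is in the comment.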