[ https://issues.apache.org/jira/browse/YARN-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823914#comment-16823914 ]
Hadoop QA commented on YARN-9043:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} YARN-9043 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9043 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12949243/YARN-9043.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24004/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Inter-queue preemption sometimes starves an underserved queue when using DominantResourceCalculator
> ---------------------------------------------------------------------------------------------------
>
>                 Key: YARN-9043
>                 URL: https://issues.apache.org/jira/browse/YARN-9043
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>   Affects Versions: 3.3.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>         Attachments: YARN-9043.001.patch
>
>
> To reproduce this problem in a unit test (UT), we can set up a cluster with resource <40,18> and create 3 queues and apps:
> * queue a: guaranteed=<10,10>, used=<6,10> by app1
> * queue b: guaranteed=<20,6>, used=<20,8> by app2
> * queue c: guaranteed=<10,2>, used=<0,0>, pending=<1,1>
> Queue c is underserved and queue b exceeds its guarantee by 2 CPU units, so we expect app2 in queue b to be preempted, but nothing happens.
> This problem is related to Resources#greaterThan/lessThan. The comparison between two resources is based on the resource/cluster-resource ratio inside DominantResourceCalculator#compare, so a low-weight resource may be ignored. For the scenario in the UT, take the comparison between the ideal assigned resource and the used resource of queue b:
> * cluster resource is <40,18>
> * ideal assigned resource of queue b is <20,6>: ideal-assigned-resource / cluster-resource = <20,6> / <40,18> = max(20/40, 6/18) = 0.5
> * used resource of queue b is <20,8>: used-resource / cluster-resource = <20,8> / <40,18> = max(20/40, 8/18) = 0.5
> The result of {{Resources.greaterThan(rc, clusterResource, used, idealAssigned)}} is therefore false instead of true. Some other places have the same problem, so preemption can't happen with the current logic.
> To solve this problem, I propose adding a ResourceCalculator#isAnyMajorResourceGreaterThan method. The DominantResourceCalculator implementation compares every resource type between the two resources and returns true if any major resource type of the left resource is greater than that of the right resource. This method would then replace Resources#greaterThan in the places in inter-queue preemption that have this problem.
> Other callers of Resources#greaterThan, and other comparisons in the scheduler and in other preemption processes, may encounter the same problem. We may need to check all resource comparisons in YARN; this needs further discussion.
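
The arithmetic in the description can be reproduced with a minimal, self-contained Java sketch. It does not use Hadoop's classes: resources are plain `long[]` pairs, `greaterThanByDominantShare` models the comparison only as the description states it (dominant share against the cluster resource), and `isAnyMajorResourceGreaterThan` is a hypothetical stand-in for the proposed method that treats every resource type as major. The class name and helper names are assumptions made for illustration, and the real DominantResourceCalculator may behave differently, for example in how it handles ties.

```java
/** Simplified model of the comparison described in YARN-9043; not Hadoop code. */
final class DominantShareSketch {

    /** Dominant share: the largest per-type ratio of a resource to the cluster resource. */
    static double dominantShare(long[] resource, long[] cluster) {
        double max = 0.0;
        for (int i = 0; i < resource.length; i++) {
            max = Math.max(max, (double) resource[i] / cluster[i]);
        }
        return max;
    }

    /** Models the comparison as described in the issue: only dominant shares are compared. */
    static boolean greaterThanByDominantShare(long[] cluster, long[] lhs, long[] rhs) {
        return dominantShare(lhs, cluster) > dominantShare(rhs, cluster);
    }

    /**
     * Hypothetical stand-in for the proposed isAnyMajorResourceGreaterThan.
     * Assumption: every resource type counts as "major".
     */
    static boolean isAnyMajorResourceGreaterThan(long[] lhs, long[] rhs) {
        for (int i = 0; i < lhs.length; i++) {
            if (lhs[i] > rhs[i]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] cluster = {40, 18};
        long[] idealAssigned = {20, 6}; // ideal assigned resource of queue b
        long[] used = {20, 8};          // used resource of queue b

        System.out.println(dominantShare(idealAssigned, cluster)); // 0.5
        System.out.println(dominantShare(used, cluster));          // 0.5

        // Dominant shares tie at 0.5, so the extra 2 units of the second type are invisible.
        System.out.println(greaterThanByDominantShare(cluster, used, idealAssigned)); // false

        // The per-type check sees 8 > 6 on the second resource type.
        System.out.println(isAnyMajorResourceGreaterThan(used, idealAssigned)); // true
    }
}
```

In this model, queue b's usage looks equal to its ideal assignment under the dominant-share comparison, while the per-type check flags it as over-allocated, which is the behavior the proposal relies on to make queue b a preemption candidate.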