[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16477999#comment-16477999 ]
Wangda Tan commented on YARN-8292:
----------------------------------

[~jlowe], thanks for your review.

bq. I think the check for a zero resource can be dropped and it simplifies to the toObtainAfterPreemption component-wise max'd with zero is less than the amount to obtain from the partition (after being max'd with zero). In other words, we want to preempt as long as we have some resources we want to obtain from the partition and preempting the container makes progress on at least one of the resource dimensions being requested from the partition.

The second part is correct; that is why we added the following check:

{code}
doPreempt = doPreempt && (Resources.lessThan(rc, clusterResource,
    Resources
        .componentwiseMax(toObtainAfterPreemption, Resources.none()),
    Resources.componentwiseMax(toObtainByPartition, Resources.none())));
{code}

The check on {{toObtainAfterPreemption}} makes sure we do not over-preempt. For example, suppose a queue's res-to-obtain is (3,-1,-1) and a container is (4,1,1), so that res-to-obtain would become (-1,-2,-2) after preemption. Even though preempting the container makes a positive contribution, we will not do it, because after preemption the queue becomes under-utilized and may in turn preempt resources from other queues.

The following logic mostly covers two cases, to avoid over-preemption:

{code}
doPreempt = Resources.greaterThanOrEqual(rc, clusterResource,
    toObtainAfterPreemption, Resources.none()) || Resources
    .isAnyMajorResourceZero(rc, toObtainAfterPreemption);
{code}

a. After preemption, some major resources are still positive.
b. After preemption, at least one major resource is 0 (which indicates that the queue is still satisfied after preemption).

Please let me know if you have any other questions.

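To make the interaction of the two checks quoted above easier to follow, here is a minimal, self-contained sketch in plain Java. It does not use the Hadoop classes: resources are modelled as {{long[]}} vectors, and the class {{PreemptionCheckSketch}} with its helpers ({{compare}}, {{sortedShares}}, {{clampAtZero}}, {{anyZero}}, {{shouldPreempt}}) is hypothetical. The comparison is an assumption that approximates a dominant-resource calculator by comparing per-resource cluster shares sorted in descending order; the real {{Resources}} and {{ResourceCalculator}} implementations may differ in detail.

```java
import java.util.Arrays;

// Hypothetical stand-in for the preemption check; not the Hadoop implementation.
class PreemptionCheckSketch {

    // Component-wise a - b.
    static long[] subtract(long[] a, long[] b) {
        long[] out = new long[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] - b[i];
        }
        return out;
    }

    // Component-wise max with zero (models componentwiseMax(x, Resources.none())).
    static long[] clampAtZero(long[] r) {
        long[] out = new long[r.length];
        for (int i = 0; i < r.length; i++) {
            out[i] = Math.max(r[i], 0L);
        }
        return out;
    }

    // True if any component is exactly zero (models isAnyMajorResourceZero).
    static boolean anyZero(long[] r) {
        for (long v : r) {
            if (v == 0L) {
                return true;
            }
        }
        return false;
    }

    // Shares of the cluster per resource, sorted in descending order.
    // Assumes every cluster component is non-zero.
    static double[] sortedShares(long[] r, long[] cluster) {
        double[] shares = new double[r.length];
        for (int i = 0; i < r.length; i++) {
            shares[i] = (double) r[i] / cluster[i];
        }
        Arrays.sort(shares); // ascending
        for (int i = 0, j = shares.length - 1; i < j; i++, j--) {
            double tmp = shares[i];
            shares[i] = shares[j];
            shares[j] = tmp;
        }
        return shares;
    }

    // Assumed dominant-resource ordering: compare the largest share first,
    // then the next largest, and so on.
    static int compare(long[] lhs, long[] rhs, long[] cluster) {
        double[] l = sortedShares(lhs, cluster);
        double[] r = sortedShares(rhs, cluster);
        for (int i = 0; i < l.length; i++) {
            int c = Double.compare(l[i], r[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    static boolean shouldPreempt(long[] toObtainByPartition, long[] container,
                                 long[] cluster) {
        long[] after = subtract(toObtainByPartition, container);
        long[] none = new long[after.length];

        // Cases a and b: something is still positive, or some resource is exactly 0.
        boolean doPreempt = compare(after, none, cluster) >= 0 || anyZero(after);

        // Progress check: the clamped remainder must shrink.
        doPreempt = doPreempt
            && compare(clampAtZero(after), clampAtZero(toObtainByPartition), cluster) < 0;
        return doPreempt;
    }

    public static void main(String[] args) {
        long[] cluster = {100, 100, 100};
        long[] toObtain = {3, -1, -1};
        // The example from the comment: preempting (4,1,1) leaves (-1,-2,-2).
        System.out.println(shouldPreempt(toObtain, new long[]{4, 1, 1}, cluster)); // false
        // A smaller container leaves (1,-2,-2), which still has a positive component.
        System.out.println(shouldPreempt(toObtain, new long[]{2, 1, 1}, cluster)); // true
    }
}
```

Under this model, a container of (3,1,1) is also accepted, because it leaves exactly 0 on the first resource (case b), while a container of (0,1,1) is rejected by the progress check since it does not reduce the clamped amount to obtain.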
> Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
> ----------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8292
>                 URL: https://issues.apache.org/jira/browse/YARN-8292
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Sumana Sathish
>            Assignee: Wangda Tan
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: YARN-8292.001.patch
>
>
> This is an example of the problem: (Same if we have more than 2 resources)
>
> Let's say we have 3 queues A/B/C. All containers with equal size <2,3>
>
> ||Queue||Guaranteed||Used ||Pending||
> |A|<20, 10>|<20,30>| |
> |B|<20, 10>|0|0|
> |C|<20, 10>|0|<20, 30>|
> | | | | |
>
> Under current logic, A's calculated to-preempt (how much resource other queue can preempt) will be <0, 20>. The preemption will not happen. However, under the context of DRC, queue A is using more resource than guaranteed, so queue C will be starved

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org