[ https://issues.apache.org/jira/browse/YARN-6163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866841#comment-15866841 ]
ASF GitHub Bot commented on YARN-6163:
--------------------------------------

Github user templedf commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/192#discussion_r101159725

    --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java ---
    @@ -1147,24 +1147,32 @@ private static boolean checkAndMarkRRVisited(
        * starvation.
        */
       List<ResourceRequest> getStarvedResourceRequests() {
    +    // List of RRs we build in this method to return
         List<ResourceRequest> ret = new ArrayList<>();
    +
    +    // Track visited RRs to avoid the same RR at multiple locality levels
         Map<Priority, List<Resource>> visitedRRs= new HashMap<>();
    +    // Start with current starvation and track the pending amount
         Resource pending = getStarvation();
         for (ResourceRequest rr : appSchedulingInfo.getAllResourceRequests()) {
           if (Resources.isNone(pending)) {
    +        // Found enough RRs to match the starvation
             break;
           }
    +
    +      // See if we have already seen this RR
           if (checkAndMarkRRVisited(visitedRRs, rr)) {
             continue;
           }

    -      // Compute the number of containers of this capability that fit in the
    -      // pending amount
    +      // A RR can have multiple containers of a capability. We need to
    +      // compute the number of containers that fit in "pending".
           int ratio = (int) Math.floor(
    --- End diff --

    Given that ratio is the number of containers that fit in "pending," ratio is probably a bad name.  That was a good chunk of my initial confusion.
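    To illustrate the naming point, here is a minimal, self-contained sketch of the quantity being computed. It does not use the Hadoop classes: `SimpleResource` and `numContainersThatFit` are hypothetical names introduced only for this example, and the rule that a container fits only when every dimension fits is an assumption, so the scheduler's configured resource calculator may well compute this differently.

```java
// Sketch only: SimpleResource and numContainersThatFit are hypothetical
// stand-ins, not part of FSAppAttempt or the YARN API.
class StarvedContainersSketch {

  /** Hypothetical stand-in for a resource: memory in MB plus vcores. */
  static final class SimpleResource {
    final long memoryMb;
    final long vcores;

    SimpleResource(long memoryMb, long vcores) {
      this.memoryMb = memoryMb;
      this.vcores = vcores;
    }
  }

  /**
   * Number of whole containers of the given capability that fit in pending.
   * Assumption: a container fits only if both memory and vcores fit, so the
   * result is the smaller of the two per-dimension quotients.
   */
  static int numContainersThatFit(SimpleResource pending,
                                  SimpleResource capability) {
    if (capability.memoryMb <= 0 || capability.vcores <= 0) {
      throw new IllegalArgumentException("capability must be positive");
    }
    // Treat a negative pending amount as zero so the count never goes negative.
    long byMemory = Math.max(0, pending.memoryMb) / capability.memoryMb;
    long byVcores = Math.max(0, pending.vcores) / capability.vcores;
    return Math.toIntExact(Math.min(byMemory, byVcores));
  }

  public static void main(String[] args) {
    SimpleResource pending = new SimpleResource(8192, 8);
    SimpleResource capability = new SimpleResource(2048, 1);
    // Memory allows 4 containers, vcores allow 8, so 4 fit.
    System.out.println(numContainersThatFit(pending, capability)); // prints 4
  }
}
```

    With a name along these lines, the call site reads as a container count rather than a ratio, which is the source of the confusion noted above.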

> FS Preemption is a trickle for severely starved applications
> ------------------------------------------------------------
>
>                 Key: YARN-6163
>                 URL: https://issues.apache.org/jira/browse/YARN-6163
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>    Affects Versions: 2.9.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: yarn-6163-1.patch, yarn-6163-2.patch
>
>
> With current logic, only one RR is considered per each instance of marking an
> application starved. This marking happens only on the update call that runs
> every 500ms. Due to this, an application that is severely starved takes
> forever to reach fairshare based on preemptions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)