[ https://issues.apache.org/jira/browse/YARN-10868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Song Jiacheng updated YARN-10868:
---------------------------------
    Description:
In FairScheduler, removing an app attempt calls MaxRunningAppsEnforcer#updateRunnabilityOnAppRemoval to find non-runnable apps that may now be able to run and make them no longer pending.
This method calls updateAppsRunnability at the end and passes appsNowMaybeRunnable.size() as the method parameter "maxRunnableApps", as below:
{code:java}
updateAppsRunnability(appsNowMaybeRunnable, appsNowMaybeRunnable.size());
{code}
updateAppsRunnability is below:
{code:java}
private void updateAppsRunnability(List<List<FSAppAttempt>> appsNowMaybeRunnable, int maxRunnableApps) {
  // Scan through and check whether this means that any apps are now runnable
  Iterator<FSAppAttempt> iter = new MultiListStartTimeIterator(
      appsNowMaybeRunnable);
  FSAppAttempt prev = null;
  List<FSAppAttempt> noLongerPendingApps = new ArrayList<FSAppAttempt>();
  while (iter.hasNext()) {
    FSAppAttempt next = iter.next();
    if (next == prev) {
      continue;
    }

    if (canAppBeRunnable(next.getQueue(), next)) {
      trackRunnableApp(next);
      FSAppAttempt appSched = next;
      next.getQueue().addApp(appSched, true);
      noLongerPendingApps.add(appSched);

      if (noLongerPendingApps.size() >= maxRunnableApps) {
        break;
      }
    }

    prev = next;
  }
  ...
{code}
maxRunnableApps is meant to be the number of apps that can become runnable because of the removal of previous attempts. However, appsNowMaybeRunnable is a list of lists, so appsNowMaybeRunnable.size() is the number of queues (inner lists), not the number of apps. The limit checked before the break is therefore a queue count, so this is a bug.

  was: the same description, with the first sentence beginning "In FairScheduler, remove a app attempt will call".

> FairScheduler: updateAppsRunnability never break
> ------------------------------------------------
>
>                 Key: YARN-10868
>                 URL: https://issues.apache.org/jira/browse/YARN-10868
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.2.1
>            Reporter: Song Jiacheng
>            Priority: Major
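The mismatch described in the report can be seen in isolation with a minimal sketch. It is not the patch for this issue and does not use any Hadoop classes: plain strings stand in for FSAppAttempt, and the class and method names (RunnableLimitSketch, outerSize, totalApps) are hypothetical, chosen only for illustration. It shows that size() on a list of lists counts the inner lists, while the number of candidate apps is the sum of the inner sizes.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical stand-alone sketch. Strings stand in for FSAppAttempt;
 * none of these names exist in Hadoop.
 */
public class RunnableLimitSketch {

  /** What appsNowMaybeRunnable.size() returns: the number of inner lists. */
  static int outerSize(List<List<String>> appsNowMaybeRunnable) {
    return appsNowMaybeRunnable.size();
  }

  /** Total number of candidate apps across all inner lists. */
  static int totalApps(List<List<String>> appsNowMaybeRunnable) {
    int total = 0;
    for (List<String> apps : appsNowMaybeRunnable) {
      total += apps.size();
    }
    return total;
  }

  public static void main(String[] args) {
    // Two inner lists holding three apps and one app respectively.
    List<List<String>> candidates = Arrays.asList(
        Arrays.asList("app1", "app2", "app3"),
        Arrays.asList("app4"));

    System.out.println("outer size = " + outerSize(candidates)); // prints "outer size = 2"
    System.out.println("total apps = " + totalApps(candidates)); // prints "total apps = 4"
  }
}
```

With these inputs the two counts differ (2 versus 4), which is the kind of gap the report points at when the outer size is passed as "maxRunnableApps".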