[ 
https://issues.apache.org/jira/browse/YARN-10868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Song Jiacheng updated YARN-10868:
---------------------------------
    Description: 
In FairScheduler, removing an app attempt calls 
MaxRunningAppsEnforcer#updateRunnabilityOnAppRemoval to find non-runnable apps 
that can now run and mark them as no longer pending. At the end, this method 
calls updateAppsRunnability and passes appsNowMaybeRunnable.size() as the 
"maxRunnableApps" parameter, as below:

{code:java}
updateAppsRunnability(appsNowMaybeRunnable,
        appsNowMaybeRunnable.size());
{code}
updateAppsRunnability is shown below:

{code:java}
private void updateAppsRunnability(List<List<FSAppAttempt>>
      appsNowMaybeRunnable, int maxRunnableApps) {
    // Scan through and check whether this means that any apps are now runnable
    Iterator<FSAppAttempt> iter = new MultiListStartTimeIterator(
        appsNowMaybeRunnable);
    FSAppAttempt prev = null;
    List<FSAppAttempt> noLongerPendingApps = new ArrayList<FSAppAttempt>();
    while (iter.hasNext()) {
      FSAppAttempt next = iter.next();
      if (next == prev) {
        continue;
      }

      if (canAppBeRunnable(next.getQueue(), next)) {
        trackRunnableApp(next);
        FSAppAttempt appSched = next;
        next.getQueue().addApp(appSched, true);
        noLongerPendingApps.add(appSched);

        if (noLongerPendingApps.size() >= maxRunnableApps) {
          break;
        }
      }

      prev = next;
    }
...
{code}

maxRunnableApps is meant to be the number of apps that can become runnable 
because of the removal of previous attempts. However, appsNowMaybeRunnable is a 
list of lists, so appsNowMaybeRunnable.size() is the number of queues rather 
than the number of apps. Passing it as maxRunnableApps is therefore a bug.


  was:
In FairScheduler, remove a app attempt will call 
MaxRunningAppsEnforcer#updateRunnabilityOnAppRemoval to find some non-runnable 
apps and make them not pending. This method will call updateAppsRunnability at 
the end, and set appsNowMaybeRunnable.size() as the method parameter 
"maxRunnableApps",as below:

{code:java}
updateAppsRunnability(appsNowMaybeRunnable,
        appsNowMaybeRunnable.size());
{code}
updateAppsRunnability is below:

{code:java}
private void updateAppsRunnability(List<List<FSAppAttempt>>
      appsNowMaybeRunnable, int maxRunnableApps) {
    // Scan through and check whether this means that any apps are now runnable
    Iterator<FSAppAttempt> iter = new MultiListStartTimeIterator(
        appsNowMaybeRunnable);
    FSAppAttempt prev = null;
    List<FSAppAttempt> noLongerPendingApps = new ArrayList<FSAppAttempt>();
    while (iter.hasNext()) {
      FSAppAttempt next = iter.next();
      if (next == prev) {
        continue;
      }

      if (canAppBeRunnable(next.getQueue(), next)) {
        trackRunnableApp(next);
        FSAppAttempt appSched = next;
        next.getQueue().addApp(appSched, true);
        noLongerPendingApps.add(appSched);

        if (noLongerPendingApps.size() >= maxRunnableApps) {
          break;
        }
      }

      prev = next;
    }
...
{code}

maxRunnableApps is the number of apps which can be runnable because of the 
removal of previous attempts, but nowMaybeRunnable actually is a list of lists, 
and the size of nowMaybeRunnable is actually a size of queues, so this is a bug.



> FairScheduler: updateAppsRunnability never break
> ------------------------------------------------
>
>                 Key: YARN-10868
>                 URL: https://issues.apache.org/jira/browse/YARN-10868
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.2.1
>            Reporter: Song Jiacheng
>            Priority: Major
>
> In FairScheduler, removing an app attempt calls 
> MaxRunningAppsEnforcer#updateRunnabilityOnAppRemoval to find non-runnable 
> apps that can now run and mark them as no longer pending. At the end, this 
> method calls updateAppsRunnability and passes appsNowMaybeRunnable.size() as 
> the "maxRunnableApps" parameter, as below:
> {code:java}
> updateAppsRunnability(appsNowMaybeRunnable,
>         appsNowMaybeRunnable.size());
> {code}
> updateAppsRunnability is shown below:
> {code:java}
> private void updateAppsRunnability(List<List<FSAppAttempt>>
>       appsNowMaybeRunnable, int maxRunnableApps) {
>     // Scan through and check whether this means that any apps are now
>     // runnable
>     Iterator<FSAppAttempt> iter = new MultiListStartTimeIterator(
>         appsNowMaybeRunnable);
>     FSAppAttempt prev = null;
>     List<FSAppAttempt> noLongerPendingApps = new ArrayList<FSAppAttempt>();
>     while (iter.hasNext()) {
>       FSAppAttempt next = iter.next();
>       if (next == prev) {
>         continue;
>       }
>       if (canAppBeRunnable(next.getQueue(), next)) {
>         trackRunnableApp(next);
>         FSAppAttempt appSched = next;
>         next.getQueue().addApp(appSched, true);
>         noLongerPendingApps.add(appSched);
>         if (noLongerPendingApps.size() >= maxRunnableApps) {
>           break;
>         }
>       }
>       prev = next;
>     }
> ...
> {code}
> maxRunnableApps is meant to be the number of apps that can become runnable 
> because of the removal of previous attempts. However, appsNowMaybeRunnable 
> is a list of lists, so appsNowMaybeRunnable.size() is the number of queues 
> rather than the number of apps. Passing it as maxRunnableApps is therefore a 
> bug.
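
To make the reported mismatch concrete, here is a minimal standalone sketch. It is not Hadoop code: the class RunnableLimitSketch and its methods are hypothetical, app attempts are modeled as plain strings, every candidate is assumed to pass the runnability check, and the start-time ordering of MultiListStartTimeIterator is ignored. It only shows that size() on a list of lists counts the inner lists (queues), not the attempts they hold, and how that value acts as the loop limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RunnableLimitSketch {

    // Size of the outer list: one entry per queue in this model.
    static int queueCount(List<List<String>> appsByQueue) {
        return appsByQueue.size();
    }

    // Total number of app attempts across all inner lists.
    static int candidateCount(List<List<String>> appsByQueue) {
        int total = 0;
        for (List<String> apps : appsByQueue) {
            total += apps.size();
        }
        return total;
    }

    // Simplified stand-in for the quoted loop: add each candidate, then
    // stop once the number of promoted apps reaches the given limit.
    static List<String> promote(List<List<String>> appsByQueue, int limit) {
        List<String> promoted = new ArrayList<>();
        for (List<String> apps : appsByQueue) {
            for (String app : apps) {
                promoted.add(app);
                if (promoted.size() >= limit) {
                    return promoted;
                }
            }
        }
        return promoted;
    }

    public static void main(String[] args) {
        // Two queues holding three and one pending attempts (made-up data).
        List<List<String>> appsByQueue = Arrays.asList(
            Arrays.asList("a1", "a2", "a3"),
            Arrays.asList("b1"));

        System.out.println("queues: " + queueCount(appsByQueue));
        System.out.println("candidates: " + candidateCount(appsByQueue));
        System.out.println("promoted with limit = queue count: "
            + promote(appsByQueue, queueCount(appsByQueue)));
    }
}
```

In this model the limit passed to the loop follows the number of queues, independent of how many attempts are waiting or how many slots the removal actually freed. Which value the real call should pass is a question for the fix and is not settled by this sketch.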


