[ https://issues.apache.org/jira/browse/YARN-10868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Song Jiacheng updated YARN-10868:
---------------------------------
    Description: 
In FairScheduler, removing an app attempt calls 
MaxRunningAppsEnforcer#updateRunnabilityOnAppRemoval to find non-runnable 
apps that can now run and take them out of the pending state. At the end, this 
method calls updateAppsRunnability and passes appsNowMaybeRunnable.size() as 
the "maxRunnableApps" parameter, as below:

{code:java}
updateAppsRunnability(appsNowMaybeRunnable,
        appsNowMaybeRunnable.size());
{code}
updateAppsRunnability is below:

{code:java}
private void updateAppsRunnability(List<List<FSAppAttempt>>
      appsNowMaybeRunnable, int maxRunnableApps) {
    // Scan through and check whether this means that any apps are now runnable
    Iterator<FSAppAttempt> iter = new MultiListStartTimeIterator(
        appsNowMaybeRunnable);
    FSAppAttempt prev = null;
    List<FSAppAttempt> noLongerPendingApps = new ArrayList<FSAppAttempt>();
    while (iter.hasNext()) {
      FSAppAttempt next = iter.next();
      if (next == prev) {
        continue;
      }

      if (canAppBeRunnable(next.getQueue(), next)) {
        trackRunnableApp(next);
        FSAppAttempt appSched = next;
        next.getQueue().addApp(appSched, true);
        noLongerPendingApps.add(appSched);

        if (noLongerPendingApps.size() >= maxRunnableApps) {
          break;
        }
      }

      prev = next;
    }
...
{code}

maxRunnableApps is meant to be the number of apps that can become runnable 
because of the removal of previous attempts, and this method uses it to break 
out of the loop. However, appsNowMaybeRunnable is a list of lists, so 
appsNowMaybeRunnable.size() is the number of queues, not the number of apps. 
This is a bug.
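
To illustrate the mismatch described above, here is a minimal standalone sketch. It is not code from Hadoop: the class QueueSizeDemo and the helper totalApps are hypothetical names, and plain strings stand in for FSAppAttempt objects. It only shows that size() on a list of lists counts the inner lists (queues), not the elements inside them (apps).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; plain strings stand in for FSAppAttempt objects.
final class QueueSizeDemo {

  // Counts every element across all inner lists, i.e. the number of
  // candidate apps rather than the number of queues.
  static <T> int totalApps(List<List<T>> appsPerQueue) {
    int total = 0;
    for (List<T> apps : appsPerQueue) {
      total += apps.size();
    }
    return total;
  }

  public static void main(String[] args) {
    // Two queues: the first holds three pending apps, the second holds two.
    List<List<String>> appsNowMaybeRunnable = new ArrayList<>();
    appsNowMaybeRunnable.add(Arrays.asList("app1", "app2", "app3"));
    appsNowMaybeRunnable.add(Arrays.asList("app4", "app5"));

    // size() on the outer list is the number of queues.
    System.out.println("queues: " + appsNowMaybeRunnable.size()); // queues: 2
    // Summing the inner sizes gives the number of apps.
    System.out.println("apps: " + totalApps(appsNowMaybeRunnable)); // apps: 5
  }
}
```

With two queues holding five apps in total, a limit taken from the outer size() is 2, so it does not correspond to the number of candidate apps. Which value updateAppsRunnability should actually receive is a design decision for the fix and is not shown here.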


  was:
In FairScheduler, removing an app attempt calls 
MaxRunningAppsEnforcer#updateRunnabilityOnAppRemoval to find non-runnable 
apps that can now run and take them out of the pending state. At the end, this 
method calls updateAppsRunnability and passes appsNowMaybeRunnable.size() as 
the "maxRunnableApps" parameter, as below:

{code:java}
updateAppsRunnability(appsNowMaybeRunnable,
        appsNowMaybeRunnable.size());
{code}
updateAppsRunnability is below:

{code:java}
private void updateAppsRunnability(List<List<FSAppAttempt>>
      appsNowMaybeRunnable, int maxRunnableApps) {
    // Scan through and check whether this means that any apps are now runnable
    Iterator<FSAppAttempt> iter = new MultiListStartTimeIterator(
        appsNowMaybeRunnable);
    FSAppAttempt prev = null;
    List<FSAppAttempt> noLongerPendingApps = new ArrayList<FSAppAttempt>();
    while (iter.hasNext()) {
      FSAppAttempt next = iter.next();
      if (next == prev) {
        continue;
      }

      if (canAppBeRunnable(next.getQueue(), next)) {
        trackRunnableApp(next);
        FSAppAttempt appSched = next;
        next.getQueue().addApp(appSched, true);
        noLongerPendingApps.add(appSched);

        if (noLongerPendingApps.size() >= maxRunnableApps) {
          break;
        }
      }

      prev = next;
    }
...
{code}

maxRunnableApps is meant to be the number of apps that can become runnable 
because of the removal of previous attempts, but appsNowMaybeRunnable is a 
list of lists, so appsNowMaybeRunnable.size() is the number of queues, not the 
number of apps. This is a bug.



> FairScheduler: updateAppsRunnability never break
> ------------------------------------------------
>
>                 Key: YARN-10868
>                 URL: https://issues.apache.org/jira/browse/YARN-10868
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.2.1
>            Reporter: Song Jiacheng
>            Priority: Major
>
> In FairScheduler, removing an app attempt calls 
> MaxRunningAppsEnforcer#updateRunnabilityOnAppRemoval to find non-runnable 
> apps that can now run and take them out of the pending state. At the end, 
> this method calls updateAppsRunnability and passes 
> appsNowMaybeRunnable.size() as the "maxRunnableApps" parameter, as below:
> {code:java}
> updateAppsRunnability(appsNowMaybeRunnable,
>         appsNowMaybeRunnable.size());
> {code}
> updateAppsRunnability is below:
> {code:java}
> private void updateAppsRunnability(List<List<FSAppAttempt>>
>       appsNowMaybeRunnable, int maxRunnableApps) {
>     // Scan through and check whether this means that any apps are now
>     // runnable
>     Iterator<FSAppAttempt> iter = new MultiListStartTimeIterator(
>         appsNowMaybeRunnable);
>     FSAppAttempt prev = null;
>     List<FSAppAttempt> noLongerPendingApps = new ArrayList<FSAppAttempt>();
>     while (iter.hasNext()) {
>       FSAppAttempt next = iter.next();
>       if (next == prev) {
>         continue;
>       }
>       if (canAppBeRunnable(next.getQueue(), next)) {
>         trackRunnableApp(next);
>         FSAppAttempt appSched = next;
>         next.getQueue().addApp(appSched, true);
>         noLongerPendingApps.add(appSched);
>         if (noLongerPendingApps.size() >= maxRunnableApps) {
>           break;
>         }
>       }
>       prev = next;
>     }
> ...
> {code}
> maxRunnableApps is meant to be the number of apps that can become runnable 
> because of the removal of previous attempts, and this method uses it to 
> break out of the loop. However, appsNowMaybeRunnable is a list of lists, so 
> appsNowMaybeRunnable.size() is the number of queues, not the number of apps. 
> This is a bug.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
