[ 
https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8771:
---------------------------
    Description: 
We found this problem when the cluster was almost, but not fully, exhausted (93% used): 
the scheduler kept allocating containers for an app but always failed to commit them. 
This can block requests from other apps and leave part of the cluster resource unusable.

To reproduce this problem:
(1) use DominantResourceCalculator
(2) the cluster resource has an empty resource type, for example gpu=0
(3) the scheduler allocates a container for app1, which has reserved containers and 
whose queue limit or user limit is reached (used + required > limit). 
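
For step (1), the resource calculator is switched via the standard CapacityScheduler configuration in capacity-scheduler.xml:
{code:xml}
<!-- capacity-scheduler.xml: compare resources by dominant share
     instead of memory only -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
{code}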

Relevant code in RegularContainerAllocator#assignContainer:
{code:java}
    // How much need to unreserve equals to:
    // max(required - headroom, amountNeedUnreserve)
    Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
    Resource resourceNeedToUnReserve =
        Resources.max(rc, clusterResource,
            Resources.subtract(capability, headRoom),
            currentResoureLimits.getAmountNeededUnreserve());

    boolean needToUnreserve =
        Resources.greaterThan(rc, clusterResource,
            resourceNeedToUnReserve, Resources.none());
{code}
For example, resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu> when 
{{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capability=<8GB, 2 vcores, 0 gpu>}}; 
needToUnreserve, which is the result of {{Resources#greaterThan}}, will then be 
{{false}} when using DominantResourceCalculator. This is not reasonable, because 
the required resource does exceed the headroom and unreserving is needed.
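
To illustrate one way such a dominant-resource comparison can miss the exceeded component, here is a toy model. This is NOT the Hadoop implementation (the class and method names below are made up): when a resource type has zero cluster capacity, the share computation divides 0 by 0, producing NaN, and any comparison against NaN is false, so the "greater than" check fails. A per-component check, by contrast, does see that memory exceeds the headroom:
{code:java}
// Toy model of the failing comparison (hypothetical names, not Hadoop APIs).
// Resources are plain long arrays of {memoryMB, vcores, gpus}.
public class UnreserveBugSketch {

  // Dominant-share style "greater than": lhs is greater than rhs if its
  // largest share of the cluster is larger. With a zero-capacity resource
  // type, 0/0 yields NaN, and every comparison against NaN is false.
  static boolean dominantGreaterThan(long[] cluster, long[] lhs, long[] rhs) {
    return maxShare(cluster, lhs) > maxShare(cluster, rhs);
  }

  static double maxShare(long[] cluster, long[] r) {
    double max = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < r.length; i++) {
      max = Math.max(max, (double) r[i] / cluster[i]); // 0/0 -> NaN
    }
    return max;
  }

  // Per-component alternative: unreserving is needed if ANY component of
  // (required - headroom) is positive.
  static boolean anyComponentPositive(long[] r) {
    for (long v : r) {
      if (v > 0) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    long[] cluster = {100 * 1024, 100, 0};      // gpu capacity is 0
    long[] needToUnreserve = {8 * 1024, -6, 0}; // <8GB, -6 vcores, 0 gpu>
    long[] none = {0, 0, 0};

    // prints false: the dominant-share comparison misses the 8GB excess
    System.out.println(dominantGreaterThan(cluster, needToUnreserve, none));
    // prints true: a per-component check sees memory exceeds the headroom
    System.out.println(anyComponentPositive(needToUnreserve));
  }
}
{code}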
After that, when the unreserve process in 
RegularContainerAllocator#assignContainer is reached, it will be skipped, because 
shouldAllocOrReserveNewContainer is true (required containers > reserved 
containers) and needToUnreserve has been wrongly calculated to be false:
{code:java}
    if (availableContainers > 0) {
      if (rmContainer == null && reservationsContinueLooking
          && node.getLabels().isEmpty()) {
        if (!shouldAllocOrReserveNewContainer || needToUnreserve) {
          ...    // unreserve process can be wrongly skipped here!!!
        }
      }
    }
{code}
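
The guard can be evaluated in isolation with the values from the buggy scenario (a hypothetical standalone class, not Hadoop code):
{code:java}
// Standalone evaluation of the guard above, with the buggy scenario's values.
public class GuardSketch {

  static boolean unreserveBranchTaken(boolean shouldAllocOrReserveNewContainer,
                                      boolean needToUnreserve) {
    return !shouldAllocOrReserveNewContainer || needToUnreserve;
  }

  public static void main(String[] args) {
    // Buggy case: new containers may be allocated (true) and needToUnreserve
    // was wrongly computed as false -> !true || false == false, branch skipped.
    System.out.println(unreserveBranchTaken(true, false)); // prints false
    // With needToUnreserve correctly true, the branch would be taken.
    System.out.println(unreserveBranchTaken(true, true));  // prints true
  }
}
{code}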



> CapacityScheduler fails to unreserve when cluster resource contains empty 
> resource type
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-8771
>                 URL: https://issues.apache.org/jira/browse/YARN-8771
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>         Attachments: YARN-8771.001.patch, YARN-8771.002.patch
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
