[jira] [Updated] (TEZ-4580) Slow preemption of new containers when re-use is enabled

Himanshu Mishra (Jira) Wed, 25 Sep 2024 07:44:20 -0700


     [ 
https://issues.apache.org/jira/browse/TEZ-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Himanshu Mishra updated TEZ-4580:
---------------------------------
    Description: 
I observed intermittently high runtime of a TPCDS query when running with the YARN 
async scheduler 
({{yarn.scheduler.capacity.schedule-asynchronously.enable=true}}) along with 
container reuse.

 

I found that preemption of lower-priority containers was taking a very long time 
in such cases. The Tez AM log had the warning {{Expected delayed containers to be 
empty.}}, followed by {{Held container expected to be not null for a 
non-AM-released container}}, and after this only 1 container was getting 
released, even when {{tez.am.preemption.percentage}} is high.

Further investigation led to the following conclusions:
1. The [warn log / assertion 
error|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1335]
 is thrown because in 
[preemptIfNeeded()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1314],
 when releasing new containers, the loop counter is decremented with each 
{{releaseUnassignedContainers}} call, so the loop runs only half the intended 
number of times. With a separate counter, the assertion passes because the 
method returns early at the check {{if (numPendingRequestsToService < 1) {}}.

2. In 
[releaseContainer()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1566],
 the container is removed only from the {{heldContainers}} map and not from 
the {{delayedContainers}} queue. As a result, the same container is picked up 
for release in every iteration until the next cycle of 
{{DelayedContainerManager}} finds that the container is not in 
{{heldContainers}} and skips it [with the 
log|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L2095]
 {{Skipping delayed container as container is no longer running, containerId=...}}
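
The two findings above can be illustrated with a small, self-contained model. This is a simplified sketch, not the actual YarnTaskSchedulerService code: the class name PreemptionSketch, its method names, and the container IDs are hypothetical, and the real methods do considerably more work. The sketch only assumes what the description states, namely a loop whose counter shrinks while it iterates, and a release path that clears the map entry but leaves the queue entry behind.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the two reported defects; not the actual Tez code. */
class PreemptionSketch {

  /** Finding 1: the value used as the loop bound is decremented in the body. */
  static int releasesWithSharedCounter(int pending) {
    int released = 0;
    for (int i = 0; i < pending; i++) {
      released++;  // stands in for one releaseUnassignedContainers call
      pending--;   // the bound shrinks while i grows
    }
    return released;
  }

  /** Possible fix for finding 1: loop against a separate, fixed bound. */
  static int releasesWithSeparateCounter(int pending) {
    int released = 0;
    int toRelease = pending;  // snapshot of the bound, never modified
    for (int i = 0; i < toRelease; i++) {
      released++;
      pending--;
    }
    return released;
  }

  /**
   * Finding 2: each attempt takes the head of the delayed queue and releases
   * it from the held map. If the queue entry is not removed as well, every
   * later attempt sees the same, already released, container.
   */
  static int releasesFromQueue(int containers, int attempts, boolean removeFromQueue) {
    Map<String, String> held = new HashMap<>();
    Deque<String> delayed = new ArrayDeque<>();
    for (int i = 0; i < containers; i++) {
      String id = "container_" + i;  // hypothetical IDs
      held.put(id, "held");
      delayed.add(id);
    }
    int released = 0;
    for (int i = 0; i < attempts && !delayed.isEmpty(); i++) {
      String id = delayed.peek();
      if (held.remove(id) != null) {
        released++;  // first release of this container succeeds
      }              // otherwise: held container is unexpectedly null
      if (removeFromQueue) {
        delayed.remove(id);  // assumed fix: drop the queue entry too
      }
    }
    return released;
  }

  public static void main(String[] args) {
    System.out.println(releasesWithSharedCounter(8));      // 4
    System.out.println(releasesWithSeparateCounter(8));    // 8
    System.out.println(releasesFromQueue(3, 3, false));    // 1
    System.out.println(releasesFromQueue(3, 3, true));     // 3
  }
}
```

In this model the shared counter releases only half of the requested containers, and the stale queue entry limits the release to a single container per cycle, which is consistent with the behaviour reported in the AM log.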

  was:
I observed intermittently high runtime of a TPCDS query when running with the YARN 
async scheduler 
({{yarn.scheduler.capacity.schedule-asynchronously.enable=true}}).

I found that preemption of lower-priority containers was taking a very long time 
in such cases. The Tez AM log had the warning {{Expected delayed containers to be 
empty.}}, followed by {{Held container expected to be not null for a 
non-AM-released container}}, and after this only 1 container was getting 
released, even when {{tez.am.preemption.percentage}} is high.

Further investigation led to the following conclusions:
1. The [warn log / assertion 
error|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1335]
 is thrown because in 
[preemptIfNeeded()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1314],
 when releasing new containers, the loop counter is decremented with each 
{{releaseUnassignedContainers}} call, so the loop runs only half the intended 
number of times. With a separate counter, the assertion passes because the 
method returns early at the check {{if (numPendingRequestsToService < 1) {}}.

2. In 
[releaseContainer()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1566],
 the container is removed only from the {{heldContainers}} map and not from 
the {{delayedContainers}} queue. As a result, the same container is picked up 
for release in every iteration until the next cycle of 
{{DelayedContainerManager}} finds that the container is not in 
{{heldContainers}} and skips it [with the 
log|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L2095]
 {{Skipping delayed container as container is no longer running, containerId=...}}


> Slow preemption of new containers when re-use is enabled
> --------------------------------------------------------
>
>                 Key: TEZ-4580
>                 URL: https://issues.apache.org/jira/browse/TEZ-4580
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Himanshu Mishra
>            Assignee: Himanshu Mishra
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
