[
https://issues.apache.org/jira/browse/TEZ-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Himanshu Mishra updated TEZ-4580:
---------------------------------
Description:
I observed intermittently high runtime of a TPCDS query when running with the
YARN async scheduler
({{yarn.scheduler.capacity.schedule-asynchronously.enable=true}}) along with
container reuse.
I found that preemption of lower-priority containers was taking a very long
time in such cases. The Tez AM log had the warning {{Expected delayed containers
to be empty.}}, followed by {{Held container expected to be not null for a
non-AM-released container}}, and after this only 1 container was being
released, even when {{tez.am.preemption.percentage}} was high.
Further investigation led to the following conclusions:
1. The [Warn log / Assertion error|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1335]
is triggered because in
[preemptIfNeeded()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1314],
when releasing new containers, the loop counter is decremented on each
{{releaseUnassignedContainers}} call, so the loop runs only about half the
intended number of times. With a separate counter the assertion passes, because
the method returns early at the check {{if (numPendingRequestsToService < 1)}}.
2. In
[releaseContainer()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1566],
the container is removed only from the {{heldContainers}} map and not from the
{{delayedContainers}} queue. As a result the same container is picked for
release in every iteration, until the next cycle of {{DelayedContainerManager}}
finds that the container is no longer in {{heldContainers}} and skips it
[with the log|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L2095]
{{Skipping delayed container as container is no longer running, containerId=...}}
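The two behaviours above can be illustrated with a minimal Java sketch. This is not the actual YarnTaskSchedulerService code: the class name, method names, and the use of integer IDs in place of containers are all hypothetical, and the real loop structure in preemptIfNeeded() may differ. It only shows why a loop whose bound is decremented in its own body covers about half the intended iterations, and why removing an entry from a map but not from a queue makes the same entry get picked again.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch; names do not correspond to real Tez classes or methods.
final class PreemptionSketch {

  // Issue 1: the loop bound and the "remaining work" counter are one variable.
  static int releaseWithSharedCounter(int pending) {
    int released = 0;
    for (int i = 0; i < pending; i++) {
      released++; // stands in for one releaseUnassignedContainers(...) call
      pending--;  // shrinks the loop bound while i grows
    }
    return released;
  }

  // Assumed fix: snapshot the bound in a separate variable before looping.
  static int releaseWithSeparateCounter(int pending) {
    int toRelease = pending;
    int released = 0;
    for (int i = 0; i < toRelease; i++) {
      released++;
      pending--; // bookkeeping only; no longer affects the loop bound
    }
    return released;
  }

  // Issue 2: a released ID leaves the "held" set but, unless removeFromQueue
  // is true, stays at the head of the "delayed" queue and is picked again.
  static int effectiveReleases(int containers, int attempts, boolean removeFromQueue) {
    Queue<Integer> delayed = new ArrayDeque<>();
    Set<Integer> held = new HashSet<>();
    for (int id = 0; id < containers; id++) {
      delayed.add(id);
      held.add(id);
    }
    int released = 0;
    for (int i = 0; i < attempts && !delayed.isEmpty(); i++) {
      Integer id = delayed.peek();
      if (held.remove(id)) {
        released++; // only the first release of a given ID has any effect
      }
      if (removeFromQueue) {
        delayed.poll(); // assumed fix: drop the ID from the queue as well
      }
    }
    return released;
  }

  public static void main(String[] args) {
    System.out.println(releaseWithSharedCounter(8));
    System.out.println(releaseWithSeparateCounter(8));
    System.out.println(effectiveReleases(5, 5, false));
    System.out.println(effectiveReleases(5, 5, true));
  }
}
```

In this sketch, 8 pending requests yield 4 releases with the shared counter and 8 with a separate one. Five release attempts over five queued IDs yield 1 effective release when the queue is left untouched, which mirrors the "only 1 container was getting released" symptom, and 5 when the queue entry is removed too.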
was:
I observed intermittently high runtime of a TPCDS query when running with the
YARN async scheduler
({{yarn.scheduler.capacity.schedule-asynchronously.enable=true}}).
I found that preemption of lower-priority containers was taking a very long
time in such cases. The Tez AM log had the warning {{Expected delayed containers
to be empty.}}, followed by {{Held container expected to be not null for a
non-AM-released container}}, and after this only 1 container was being
released, even when {{tez.am.preemption.percentage}} was high.
Further investigation led to the following conclusions:
1. The [Warn log / Assertion error|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1335]
is triggered because in
[preemptIfNeeded()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1314],
when releasing new containers, the loop counter is decremented on each
{{releaseUnassignedContainers}} call, so the loop runs only about half the
intended number of times. With a separate counter the assertion passes, because
the method returns early at the check {{if (numPendingRequestsToService < 1)}}.
2. In
[releaseContainer()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1566],
the container is removed only from the {{heldContainers}} map and not from the
{{delayedContainers}} queue. As a result the same container is picked for
release in every iteration, until the next cycle of {{DelayedContainerManager}}
finds that the container is no longer in {{heldContainers}} and skips it
[with the log|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L2095]
{{Skipping delayed container as container is no longer running, containerId=...}}
> Slow preemption of new containers when re-use is enabled
> --------------------------------------------------------
>
> Key: TEZ-4580
> URL: https://issues.apache.org/jira/browse/TEZ-4580
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Himanshu Mishra
> Assignee: Himanshu Mishra
> Priority: Major