Chris created TEZ-4640:
--------------------------
Summary: there are many killed tez tasks in yarn
Key: TEZ-4640
URL: https://issues.apache.org/jira/browse/TEZ-4640
Project: Apache Tez
Issue Type: Bug
Affects Versions: 0.10.2
Environment: !image-2025-07-29-09-28-18-539.png!
Reporter: Chris
Attachments: image-2025-07-29-09-27-20-373.png,
image-2025-07-29-09-28-18-539.png, image-2025-07-29-09-35-54-230.png,
image-2025-07-29-09-38-53-667.png
I use Apache DolphinScheduler to execute Hive tasks via Tez. However, there
are always a lot of killed Tez tasks on the YARN Applications page.
These tasks are usually very short-lived, mostly about 1 second.
This is not caused by a resource limit: I only run a single job at a time.
I found this message in the DolphinScheduler task log:
{code:java}
2025-07-28 21:33:11,599 INFO hive.HiveImport: 2025-07-28 21:33:11 INFO
TezClient:780 - Could not connect to AM, killing session via YARN,
sessionName=HIVE-24438a38-77f0-46fe-8e24-abf7559cf986,
applicationId=application_1753709175100_0002 {code}
I checked the source code. This message is logged when
sessionShutdownSuccessful is false:
!image-2025-07-29-09-35-54-230.png!
That in turn happens because getAMProxy returns null.
My question is: why doesn't the if statement include
YarnApplicationState.ACCEPTED?
!image-2025-07-29-09-38-53-667.png!
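To make the question concrete, here is a minimal, self-contained sketch of
the flow as I understand it. It is not the Tez source: the class
SessionStopSketch, the methods amProxy and stop, and the returned strings
are hypothetical stand-ins, and it assumes a proxy is only available once
the application is RUNNING. Only the enum values mirror
YarnApplicationState. Terminal states (FINISHED, FAILED, KILLED) are not
modeled.
{code:java}
import java.util.Optional;

public class SessionStopSketch {
    // Same value names as YARN's YarnApplicationState enum.
    enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Hypothetical stand-in for getAMProxy. Assumption: a proxy exists only
    // when the AM is RUNNING; an empty Optional models "returns null".
    static Optional<String> amProxy(AppState state) {
        if (state == AppState.RUNNING) {
            return Optional.of("am-proxy");
        }
        return Optional.empty();
    }

    // Hypothetical stand-in for stopping a session: shut down through the AM
    // if a proxy is available, otherwise fall back to killing via YARN.
    static String stop(AppState state) {
        boolean sessionShutdownSuccessful = false;
        if (amProxy(state).isPresent()) {
            // The real client would ask the AM to shut the session down here.
            sessionShutdownSuccessful = true;
        }
        // Fallback path, corresponding to the log message
        // "Could not connect to AM, killing session via YARN".
        return sessionShutdownSuccessful ? "SHUTDOWN_VIA_AM" : "KILLED_VIA_YARN";
    }

    public static void main(String[] args) {
        for (AppState s : new AppState[] {AppState.ACCEPTED, AppState.RUNNING}) {
            System.out.println(s + " -> " + stop(s));
        }
    }
}
{code}
Under these assumptions, a session that is stopped while it is still
ACCEPTED (the AM has not started yet) always ends up on the kill path,
which would match the very short killed applications I see in YARN.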
--
This message was sent by Atlassian Jira
(v8.20.10#820010)