[
https://issues.apache.org/jira/browse/TEZ-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17581623#comment-17581623
]
zhangbutao commented on TEZ-4445:
---------------------------------
PR: https://github.com/apache/tez/pull/238
> Tez task can get stuck when waiting for all initializers on
> LogicalIOProcessorRuntimeTask:initialize
> ----------------------------------------------------------------------------------------------------
>
> Key: TEZ-4445
> URL: https://issues.apache.org/jira/browse/TEZ-4445
> Project: Apache Tez
> Issue Type: Improvement
> Affects Versions: 0.10.2
> Reporter: zhangbutao
> Assignee: zhangbutao
> Priority: Major
> Attachments:
> Tez-task-stuck-LogicalIOProcessorRuntimeTask-initialize.jpg
>
>
> Cluster environment: Hadoop 3.1.0, Hive 3.1.0, Tez 0.9.2
> In a busy cluster, I find that some Tez tasks can get stuck in
> LogicalIOProcessorRuntimeTask:initialize, waiting for all initializers to
> finish. Such a stuck task can cause the entire Tez job to run forever. If I
> kill the Tez job and resubmit it, the job often runs successfully. Please
> see the task jstack in the attachment for more information:
> _*Tez-task-stuck-LogicalIOProcessorRuntimeTask-initialize.jpg*_
> I have not found the root cause of the task getting stuck, but I
> think it would be a good idea to add a timeout when waiting for
> initializers. That way, a stuck task can be interrupted after a certain
> time, and a new task attempt can be launched immediately.
>
>
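For illustration only, the timeout idea described above could look roughly like the sketch below. It assumes the initializers are submitted to an ExecutorCompletionService and that the caller currently waits without a deadline. The class name InitializerWaiter, the method awaitAll and the timeoutMs parameter are hypothetical; this is not the actual Tez code and not necessarily what the PR does.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not taken from the Tez code base.
final class InitializerWaiter {

    /**
     * Submits all initializers and waits for them to finish, giving up once
     * timeoutMs milliseconds have elapsed in total (assumed parameter).
     */
    static void awaitAll(ExecutorService pool,
                         List<Callable<Void>> initializers,
                         long timeoutMs)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorCompletionService<Void> completion = new ExecutorCompletionService<>(pool);
        for (Callable<Void> init : initializers) {
            completion.submit(init);
        }

        // One shared deadline for the whole set, not a fresh timeout per initializer.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        for (int done = 0; done < initializers.size(); done++) {
            long remaining = deadline - System.nanoTime();
            // poll() returns null when nothing completes in time, whereas
            // take() would keep blocking until some initializer finishes.
            Future<Void> finished = completion.poll(Math.max(remaining, 0L), TimeUnit.NANOSECONDS);
            if (finished == null) {
                throw new TimeoutException(
                        "Initializers did not finish within " + timeoutMs + " ms");
            }
            finished.get(); // rethrows a failure from the initializer, if any
        }
    }
}
```

With something like this, the caller could treat the TimeoutException as a task failure so that a new attempt gets scheduled instead of the task hanging indefinitely. How the timeout value would be configured is left open here.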
--
This message was sent by Atlassian Jira
(v8.20.10#820010)