[ https://issues.apache.org/jira/browse/TEZ-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426270#comment-16426270 ]

Kuhu Shukla commented on TEZ-3817:
----------------------------------

Updated patch to add the try block to {{dag.finished()}}. The {{DAGAppMasterEventDAGFinished}} event should do the necessary IMO. If the AM is in session mode, today, the AM does not shut down even if the DAG errored out. This patch maintains that behavior.

> DAGs can hang after more than one uncaught Exception during doTransition.
> -------------------------------------------------------------------------
>
>                 Key: TEZ-3817
>                 URL: https://issues.apache.org/jira/browse/TEZ-3817
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.1, 0.9.0
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>            Priority: Major
>         Attachments: TEZ-3817.001.patch, TEZ-3817.002.patch, TEZ-3817.003.patch, TEZ-3817.test.patch
>
>
> A Tez DAG can hang in the last "sane" state if the statemachine.doTransition() throws a runtime exception more than once. If the transition for the Error state itself throws an exception, the DAG hangs.
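
For readers unfamiliar with the failure mode, the idea of guarding the error path with a second try block can be sketched roughly as below. This is a minimal, self-contained Java illustration and not the actual Tez patch: the class {{DagErrorHandlingSketch}}, the nested {{SketchDag}}, and the methods {{handle()}} and {{sendDagFinishedEvent()}} are hypothetical names invented for this example. Only {{doTransition}}, {{finished()}} and the notion of a DAG-finished event come from the discussion above, and forcing the ERROR state in the inner catch is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are real Tez classes.
class DagErrorHandlingSketch {

    enum State { RUNNING, ERROR }

    /** Stand-in for a DAG whose state transitions may throw. */
    static class SketchDag {
        State state = State.RUNNING;
        boolean finishedEventSent = false;
        final List<String> log = new ArrayList<>();
        private final boolean failInErrorTransition;

        SketchDag(boolean failInErrorTransition) {
            this.failInErrorTransition = failInErrorTransition;
        }

        // Simulates a regular transition hitting an uncaught runtime exception.
        void doTransition() {
            throw new IllegalStateException("first failure");
        }

        // Simulates the move to the error state, which may itself throw.
        void finished() {
            if (failInErrorTransition) {
                throw new IllegalStateException("second failure");
            }
            state = State.ERROR;
        }

        // Simulates notifying the application master that the DAG is done.
        void sendDagFinishedEvent() {
            finishedEventSent = true;
        }

        void handle() {
            try {
                doTransition();
            } catch (RuntimeException first) {
                log.add("transition failed: " + first.getMessage());
                try {
                    // Without this inner try, a second exception here would
                    // escape and leave the DAG in its last "sane" state.
                    finished();
                } catch (RuntimeException second) {
                    log.add("error transition failed: " + second.getMessage());
                    state = State.ERROR; // assumption: force the terminal state
                } finally {
                    // Always send the finished event so nothing waits forever.
                    sendDagFinishedEvent();
                }
            }
        }
    }

    public static void main(String[] args) {
        SketchDag dag = new SketchDag(true);
        dag.handle();
        System.out.println(dag.state + " " + dag.finishedEventSent + " " + dag.log);
    }
}
```

In this sketch the finished event is sent from a {{finally}} block, so it goes out whether or not the error transition itself fails. Whether the AM then shuts down (for example in session mode) is a separate decision and is not modelled here.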