[
https://issues.apache.org/jira/browse/PIG-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321382#comment-15321382
]
Daniel Dai commented on PIG-4911:
---------------------------------
Can we avoid copying TezConfiguration? A smaller data structure containing
just the flag seems to be enough.
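
A minimal sketch of what such a smaller structure could look like. The class name DagRecoverySetting and the use of java.util.Properties as a stand-in for the full configuration are assumptions for illustration and are not taken from the patch; only the key tez.dag.recovery.enabled comes from the issue description. Treating a missing key as "enabled" is also an assumption.

```java
import java.util.Properties;

/** Hypothetical holder: carries only the recovery flag instead of a full configuration copy. */
final class DagRecoverySetting {
    // Property key named in the issue description.
    static final String KEY = "tez.dag.recovery.enabled";

    private final boolean recoveryEnabled;

    private DagRecoverySetting(boolean recoveryEnabled) {
        this.recoveryEnabled = recoveryEnabled;
    }

    /** Reads the single flag; a missing key is treated as enabled (assumed default). */
    static DagRecoverySetting from(Properties conf) {
        return new DagRecoverySetting(
                Boolean.parseBoolean(conf.getProperty(KEY, "true")));
    }

    boolean isRecoveryEnabled() {
        return recoveryEnabled;
    }
}
```

The holder is immutable and keeps one boolean, so passing it around does not drag the rest of the configuration along.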
> Provide option to disable DAG recovery
> --------------------------------------
>
> Key: PIG-4911
> URL: https://issues.apache.org/jira/browse/PIG-4911
> Project: Pig
> Issue Type: Improvement
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.17.0
>
> Attachments: PIG-4911-1.patch
>
>
> Tez 0.7 has a lot of issues with DAG recovery when auto parallelism is used,
> causing hung DAGs in many cases, because it did not write auto parallelism
> decisions to the recovery history. Recovery was rewritten in Tez 0.8 to handle that.
> Code was added to Tez to automatically disable recovery when auto parallelism
> is in use, so that both Pig and Tez would benefit. It works fine: the
> second AM attempt fails with a "DAG cannot be recovered" error when it sees
> vertices with auto parallelism. The problem is that it is hard for users to
> see what the actual failure was, and hard to debug as well, because the whole
> UI state is rewritten with the partial recovery information.
> Disabling recovery in Pig itself by setting
> tez.dag.recovery.enabled=false avoids the second attempt altogether, which
> would eventually fail anyway. It also makes it easier to debug the original
> failure.
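
As an illustration of the setting described above, here is a minimal sketch that turns the property off on a copy of some job settings. The class and method names are hypothetical, and java.util.Properties stands in for the real configuration object; only the key tez.dag.recovery.enabled and the value false come from the issue description.

```java
import java.util.Properties;

/** Hypothetical helper, not part of Pig or Tez. */
final class RecoveryOptOut {
    // Property key named in the issue description.
    static final String DAG_RECOVERY_ENABLED = "tez.dag.recovery.enabled";

    /** Returns a copy of the given settings with DAG recovery turned off. */
    static Properties withRecoveryDisabled(Properties base) {
        Properties out = new Properties();
        out.putAll(base);                               // keep all other settings
        out.setProperty(DAG_RECOVERY_ENABLED, "false"); // override only the recovery flag
        return out;
    }
}
```

Working on a copy leaves the caller's settings untouched, so the override applies only to the job that receives the returned object.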