[ https://issues.apache.org/jira/browse/PIG-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-4911:
------------------------------------
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for the review, Daniel.

> Provide option to disable DAG recovery
> --------------------------------------
>
>                 Key: PIG-4911
>                 URL: https://issues.apache.org/jira/browse/PIG-4911
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.17.0
>
>         Attachments: PIG-4911-1.patch, PIG-4911-2.patch
>
>
> Tez 0.7 has a lot of issues with DAG recovery when auto parallelism is
> used, causing hung DAGs in many cases, because it did not write auto
> parallelism decisions to the recovery history. A rewrite was done in
> Tez 0.8 to handle that.
> Code was added to Tez to automatically disable recovery if there was auto
> parallelism, so that it would benefit both Pig and Tez. It works fine, and
> the second AM attempt fails with a "DAG cannot be recovered" error when it
> sees that there are vertices with auto parallelism. The problem is that it
> is hard for users to see what the actual problem is, and it is hard to
> debug as well, because the whole UI state is rewritten with the partial
> recovery information.
> Disabling recovery in Pig itself by setting
> tez.dag.recovery.enabled=false will make it skip the second attempt
> entirely, which would eventually fail anyway. It also makes it easy to
> debug the original failure.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
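
As a rough illustration of the approach described in the issue, the sketch below applies tez.dag.recovery.enabled=false as a default on a plain java.util.Properties object. Only the property name comes from the issue text. The class and method names are hypothetical, this is not the code from the attached patches, and leaving an explicit user setting untouched is an assumption made for the example.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not Pig's actual implementation.
class DagRecoverySketch {
    // Tez property named in the issue description.
    static final String TEZ_DAG_RECOVERY_ENABLED = "tez.dag.recovery.enabled";

    // Turns DAG recovery off unless the key is already present.
    // Assumption: a value the user set explicitly should be left untouched.
    static Properties disableDagRecovery(Properties conf) {
        conf.putIfAbsent(TEZ_DAG_RECOVERY_ENABLED, "false");
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        disableDagRecovery(conf);
        System.out.println(TEZ_DAG_RECOVERY_ENABLED + "="
                + conf.getProperty(TEZ_DAG_RECOVERY_ENABLED));
    }
}
```

With recovery turned off this way, a failed first AM attempt is not followed by a recovery attempt, so the original failure stays visible for debugging.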