[ https://issues.apache.org/jira/browse/TEZ-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Eagles updated TEZ-3914:
---------------------------------
    Attachment: TEZ-3914.003.patch

> Recovering a large DAG hang job
> -------------------------------
>
>                 Key: TEZ-3914
>                 URL: https://issues.apache.org/jira/browse/TEZ-3914
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3914.001.patch, TEZ-3914.002.patch, TEZ-3914.003.patch
>
>
> Any failure to parse a recovery event is ignored and treated as EOF. The job
> can hang, since some task completions may be missed and the shuffle will hang.
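
The failure mode described in the issue can be illustrated with a minimal sketch. This is not the Tez recovery parser and not the attached patch: the class RecoveryLogSketch, its length-prefixed record format, and CorruptRecoveryLogException are all hypothetical names chosen for illustration. The lenient reader stops at the first unreadable record as if it were end of file, so later events are silently dropped. The strict reader accepts end of input only on a record boundary and raises an error otherwise.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; record format and names are assumptions, not Tez code.
class RecoveryLogSketch {

    // Raised by the strict reader when a record cannot be parsed.
    static class CorruptRecoveryLogException extends RuntimeException {
        CorruptRecoveryLogException(String message) {
            super(message);
        }
    }

    // Encodes each event as a 4-byte big-endian length followed by its UTF-8 payload.
    static byte[] encode(String... events) {
        List<byte[]> payloads = new ArrayList<>();
        int size = 0;
        for (String event : events) {
            byte[] payload = event.getBytes(StandardCharsets.UTF_8);
            payloads.add(payload);
            size += 4 + payload.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] payload : payloads) {
            buf.putInt(payload.length).put(payload);
        }
        return buf.array();
    }

    // Strict: a damaged record is an error, never a silent end of file.
    static List<String> readAll(byte[] log) {
        return read(log, true);
    }

    // Lenient: mimics the reported behaviour by treating a damaged record as EOF.
    static List<String> readAllLenient(byte[] log) {
        return read(log, false);
    }

    private static List<String> read(byte[] log, boolean strict) {
        List<String> events = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(log);
        // Running out of bytes exactly on a record boundary is the only clean EOF.
        while (buf.hasRemaining()) {
            boolean headerOk = buf.remaining() >= 4;
            int length = headerOk ? buf.getInt() : -1;
            if (!headerOk || length < 0 || length > buf.remaining()) {
                if (strict) {
                    throw new CorruptRecoveryLogException(
                        "damaged record after " + events.size() + " readable event(s)");
                }
                break; // lenient path: remaining events are lost without any signal
            }
            byte[] payload = new byte[length];
            buf.get(payload);
            events.add(new String(payload, StandardCharsets.UTF_8));
        }
        return events;
    }

    public static void main(String[] args) {
        byte[] log = encode("task_1 finished", "task_2 finished");
        byte[] truncated = Arrays.copyOf(log, log.length - 3); // cut into the last record
        System.out.println("lenient events: " + readAllLenient(truncated));
        try {
            readAll(truncated);
        } catch (CorruptRecoveryLogException e) {
            System.out.println("strict reader: " + e.getMessage());
        }
    }
}
```

In this sketch the lenient reader returns fewer events than were written and gives the caller no way to tell, which is the kind of missed task completion the issue describes. Surfacing the parse failure lets the caller decide how to proceed instead of waiting on events that will never arrive.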