[jira] [Updated] (TEZ-3914) Recovering a large DAG hang job
[ https://issues.apache.org/jira/browse/TEZ-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3914:
---------------------------------
    Attachment: TEZ-3914.003.patch

> Recovering a large DAG hang job
> -------------------------------
>
>                 Key: TEZ-3914
>                 URL: https://issues.apache.org/jira/browse/TEZ-3914
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3914.001.patch, TEZ-3914.002.patch, TEZ-3914.003.patch
>
>
> Any failure to parse a recovery event is ignored and treated as EOF. The job
> can hang because some task completions may be missed, and the shuffle then hangs.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (TEZ-3914) Recovering a large DAG hang job
[ https://issues.apache.org/jira/browse/TEZ-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3914:
---------------------------------
    Attachment: TEZ-3914.002.patch

> Recovering a large DAG hang job
> -------------------------------
>
>                 Key: TEZ-3914
>                 URL: https://issues.apache.org/jira/browse/TEZ-3914
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3914.001.patch, TEZ-3914.002.patch
>
>
> Any failure to parse a recovery event is ignored and treated as EOF. The job
> can hang because some task completions may be missed, and the shuffle then hangs.
[jira] [Updated] (TEZ-3914) Recovering a large DAG hang job
[ https://issues.apache.org/jira/browse/TEZ-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3914:
---------------------------------
    Attachment: TEZ-3914.001.patch

> Recovering a large DAG hang job
> -------------------------------
>
>                 Key: TEZ-3914
>                 URL: https://issues.apache.org/jira/browse/TEZ-3914
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3914.001.patch
>
>
> Any failure to parse a recovery event is ignored and treated as EOF. The job
> can hang because some task completions may be missed, and the shuffle then hangs.
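For readers unfamiliar with this failure mode, here is a minimal sketch of the general pattern the issue describes: a log reader that tells a clean end-of-file apart from a record it cannot parse, instead of silently stopping at the first bad record. It is not Tez code and does not reflect the attached patches. The record framing (4-byte big-endian length followed by a UTF-8 payload), the names `read_events`, `encode`, and `CorruptRecoveryLog`, and the event strings are all hypothetical, chosen only for illustration.

```python
import io
import struct


class CorruptRecoveryLog(Exception):
    """Raised when a record cannot be parsed (as opposed to a clean EOF)."""


def encode(event):
    """Frame one event: 4-byte big-endian length, then the UTF-8 payload (hypothetical format)."""
    data = event.encode("utf-8")
    return struct.pack(">I", len(data)) + data


def read_events(stream):
    """Read every event from the stream, failing loudly on a damaged record."""
    events = []
    while True:
        header = stream.read(4)
        if header == b"":
            # Zero bytes at a record boundary: a genuine end of file.
            return events
        if len(header) < 4:
            # A partial header is damage, not EOF.
            raise CorruptRecoveryLog("truncated length header")
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise CorruptRecoveryLog("truncated payload")
        try:
            events.append(payload.decode("utf-8"))
        except UnicodeDecodeError as exc:
            raise CorruptRecoveryLog("undecodable payload") from exc


if __name__ == "__main__":
    log = io.BytesIO(encode("TASK_FINISHED_1") + encode("TASK_FINISHED_2"))
    print(read_events(log))
```

Raising on a damaged record lets the caller decide how to recover (for example, by failing the recovery attempt) rather than proceeding with an incomplete list of completed tasks.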