[ https://issues.apache.org/jira/browse/TEZ-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Zhang updated TEZ-2404:
----------------------------
    Attachment: TEZ-2404-3.patch

> Handle DataMovementEvent before its TaskAttemptCompletedEvent
> -------------------------------------------------------------
>
>                 Key: TEZ-2404
>                 URL: https://issues.apache.org/jira/browse/TEZ-2404
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>            Priority: Critical
>         Attachments: TEZ-2404-1.patch, TEZ-2404-2.patch, TEZ-2404-3.patch
>
>
> TEZ-2325 routes TASK_ATTEMPT_COMPLETED_EVENT directly to the attempt, but this
> causes a recovery issue. Recovery requires that the DataMovementEvent be handled
> before the TaskAttemptCompletedEvent; otherwise the DataMovementEvent may be
> lost during recovery, causing its dependent tasks to hang.
> There are two ways to fix this issue:
> 1. Still route TaskAttemptCompletedEvent in Vertex
> 2. Route DataMovementEvent before TaskAttemptCompletedEvent in
> TezTaskAttemptListener

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
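
The second option in the issue description (route the DataMovementEvent before the TaskAttemptCompletedEvent) can be illustrated with a minimal, self-contained sketch. It is not taken from any of the attached patches and does not use the real Tez classes: the Kind enum, the Event class, and the orderForRouting helper are hypothetical names, and the sketch assumes that the events of one attempt arrive together as a batch that the listener can reorder before routing.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class EventOrderingSketch {
    // Hypothetical event kinds; these are NOT the real Tez event classes.
    enum Kind { DATA_MOVEMENT, TASK_ATTEMPT_COMPLETED, OTHER }

    // Hypothetical stand-in for an event reported by a task attempt.
    static final class Event {
        final Kind kind;
        final String payload;

        Event(Kind kind, String payload) {
            this.kind = kind;
            this.payload = payload;
        }
    }

    // Returns a copy of the batch with completion events moved to the end.
    // List.sort is stable, so the relative order of all other events is kept.
    static List<Event> orderForRouting(List<Event> batch) {
        List<Event> ordered = new ArrayList<>(batch);
        ordered.sort(Comparator.comparingInt(
                (Event e) -> e.kind == Kind.TASK_ATTEMPT_COMPLETED ? 1 : 0));
        return ordered;
    }

    public static void main(String[] args) {
        List<Event> batch = new ArrayList<>();
        batch.add(new Event(Kind.TASK_ATTEMPT_COMPLETED, "attempt done"));
        batch.add(new Event(Kind.DATA_MOVEMENT, "attempt output"));

        // The data movement event is printed (routed) before the completion event.
        for (Event e : orderForRouting(batch)) {
            System.out.println(e.kind + ": " + e.payload);
        }
    }
}
```

With this ordering, a handler that records events as it routes them would always record the data movement before the completion of the same attempt, which is the property the description says recovery depends on. Whether the actual fix reorders a batch this way or enforces the ordering elsewhere is not stated in this message.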