[jira] [Updated] (TEZ-2426) Task input not complete before sending Task completed event
[ https://issues.apache.org/jira/browse/TEZ-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-2426:
    Attachment: TEZ-2426.2.txt

Updated the patch to remove some unnecessary synchronization that was causing the findbugs issues.

> Task input not complete before sending Task completed event
> ---
>
>                 Key: TEZ-2426
>                 URL: https://issues.apache.org/jira/browse/TEZ-2426
>             Project: Apache Tez
>          Issue Type: Bug
>   Affects Versions: 0.7.0
>            Reporter: Bikas Saha
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: TEZ-2426.1.txt, TEZ-2426.2.txt, am.log, container.log
>
>
> Sequence of events
> 1) Task A starts in a container
> 2) Task A complete event comes to AM
> 3) Task B starts in the same container
> 4) Task A's input calls some method on its context. Crashes with NPE
> 5) The crash sends an input failed event for Task A to the AM
> 6) Task A state machine crashes saying cannot handle failed after success
> In some cases, a status update event may also be sent after
> completion, though I am not sure whether it's related to the failed event being sent.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (TEZ-2426) Task input not complete before sending Task completed event
[ https://issues.apache.org/jira/browse/TEZ-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-2426:
    Attachment: TEZ-2426.1.txt

This should fix it. Main changes in the patch:
- Wait for the eventRouter thread to complete before considering a task as done and accepting the next one.
- Fixed visibility concerns in *Context.
- Moved some of the cleanup into LogicalIOProcessorRuntimeTask, since progress() etc. can happen often and shouldn't hit a volatile.

> Task input not complete before sending Task completed event
> ---
>
>                 Key: TEZ-2426
>                 URL: https://issues.apache.org/jira/browse/TEZ-2426
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Bikas Saha
>            Priority: Critical
>         Attachments: TEZ-2426.1.txt, am.log, container.log
>
>
> Sequence of events
> 1) Task A starts in a container
> 2) Task A complete event comes to AM
> 3) Task B starts in the same container
> 4) Task A's input calls some method on its context. Crashes with NPE
> 5) The crash sends an input failed event for Task A to the AM
> 6) Task A state machine crashes saying cannot handle failed after success
> In some cases, a status update event may also be sent after
> completion, though I am not sure whether it's related to the failed event being sent.
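The first change in the patch description (wait for the eventRouter thread before treating a task as done) can be illustrated with a small standalone sketch. This is not the code from TEZ-2426.1.txt: the class name `EventRouterJoinSketch`, the method `runTask`, the sentinel value and the event strings are all hypothetical, and the real Tez runtime is structured differently. It only shows the general ordering idea, assuming a single router thread fed from a queue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from the Tez code base.
final class EventRouterJoinSketch {

    // Sentinel that tells the router thread to stop after draining earlier events.
    private static final String STOP = "__STOP__";

    /** Simulates one task and returns its events in the order they were reported. */
    static List<String> runTask(String taskName, int inputEvents) {
        List<String> reported = Collections.synchronizedList(new ArrayList<>());
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Stand-in for the event router: forwards queued events until it sees STOP.
        Thread eventRouter = new Thread(() -> {
            try {
                while (true) {
                    String event = queue.take();
                    if (STOP.equals(event)) {
                        return;
                    }
                    reported.add(taskName + ":" + event);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "event-router-" + taskName);
        eventRouter.start();

        try {
            for (int i = 0; i < inputEvents; i++) {
                queue.put("INPUT_EVENT_" + i);
            }
            queue.put(STOP);
            // Key step: do not report completion until the router thread has exited,
            // so no event for this task can be emitted after TASK_COMPLETED.
            eventRouter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for event router", e);
        }

        reported.add(taskName + ":TASK_COMPLETED");
        return new ArrayList<>(reported);
    }

    public static void main(String[] args) {
        System.out.println(runTask("A", 3));
    }
}
```

Because `join()` returns only after the router thread has terminated, the completion event is always the last entry for the task in this sketch, and the container would be free to accept the next task only after that point.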
[jira] [Updated] (TEZ-2426) Task input not complete before sending Task completed event
[ https://issues.apache.org/jira/browse/TEZ-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated TEZ-2426:
    Attachment: container.log
                am.log

/cc [~rajesh.balamohan] [~hitesh] [~sseth]

> Task input not complete before sending Task completed event
> ---
>
>                 Key: TEZ-2426
>                 URL: https://issues.apache.org/jira/browse/TEZ-2426
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Bikas Saha
>            Priority: Critical
>         Attachments: am.log, container.log
>
>
> Sequence of events
> 1) Task A starts in a container
> 2) Task A complete event comes to AM
> 3) Task B starts in the same container
> 4) Task A's input calls some method on its context. Crashes with NPE
> 5) The crash sends an input failed event for Task A to the AM
> 6) Task A state machine crashes saying cannot handle failed after success
> In some cases, a status update event may also be sent after
> completion, though I am not sure whether it's related to the failed event being sent.
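Steps 2, 5 and 6 of the reported sequence can be reproduced in miniature with a toy state machine. This is only an illustration under simplifying assumptions: `TaskStateMachineSketch`, its `State` and `Event` enums, and the exception message are hypothetical and far simpler than the state machine in the Tez AM. It shows why a late failure event is fatal once the task has already been marked successful.

```java
// Hypothetical sketch; the real Tez AM state machine has many more states and events.
final class TaskStateMachineSketch {

    enum State { RUNNING, SUCCEEDED, FAILED }

    enum Event { TASK_COMPLETED, INPUT_FAILED }

    private State state = State.RUNNING;

    /** Applies an event; any event arriving after a terminal state is rejected. */
    State handle(Event event) {
        if (state != State.RUNNING) {
            throw new IllegalStateException(
                "Cannot handle " + event + " in state " + state);
        }
        state = (event == Event.TASK_COMPLETED) ? State.SUCCEEDED : State.FAILED;
        return state;
    }

    public static void main(String[] args) {
        TaskStateMachineSketch taskA = new TaskStateMachineSketch();
        taskA.handle(Event.TASK_COMPLETED);      // step 2: completion reaches the AM
        try {
            taskA.handle(Event.INPUT_FAILED);    // step 5: late failure from Task A's input
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());  // step 6: failed after success is rejected
        }
    }
}
```

In this sketch the only defence is on the receiving side, which rejects the late event. The approach described in the thread works on the sending side instead, by making sure nothing from Task A is still running once its completion has been reported.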