[jira] [Updated] (TEZ-2426) Task input not complete before sending Task completed event

2015-05-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2426:

Attachment: TEZ-2426.2.txt

Updated the patch to remove some unnecessary synchronization that was causing 
the findbugs issues.

> Task input not complete before sending Task completed event
> ---
>
> Key: TEZ-2426
> URL: https://issues.apache.org/jira/browse/TEZ-2426
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Bikas Saha
> Assignee: Siddharth Seth
> Priority: Critical
> Attachments: TEZ-2426.1.txt, TEZ-2426.2.txt, am.log, container.log
>
>
> Sequence of events
> 1) Task A starts in a container
> 2) Task A complete event comes to AM
> 3) Task B starts in the same container
> 4) Task A's input calls some method on its context. Crashes with NPE
> 5) The crash sends an input failed event for Task A to the AM
> 6) Task A state machine crashes saying cannot handle failed after success
> In some cases, a status update event may also be sent after completion,
> though it is unclear whether that is related to the failed event being sent.
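
Step 6 of the sequence above can be illustrated with a minimal sketch. Everything here is hypothetical (the class `TaskStateSketch`, its states and its events are made up for illustration and are not the Tez AM state machine); it only shows why a late failure event has nowhere to go once the task has been marked successful.

```java
// Hypothetical sketch: these names are illustrative, not Tez classes.
final class TaskStateSketch {
    enum State { RUNNING, SUCCEEDED, FAILED }
    enum Event { TASK_COMPLETED, INPUT_FAILED }

    private State state = State.RUNNING;

    // Applies one event. Only RUNNING accepts events; SUCCEEDED and FAILED
    // are terminal, so any later event is rejected.
    State handle(Event event) {
        if (state != State.RUNNING) {
            throw new IllegalStateException(
                "Cannot handle " + event + " in state " + state);
        }
        state = (event == Event.TASK_COMPLETED) ? State.SUCCEEDED : State.FAILED;
        return state;
    }

    public static void main(String[] args) {
        TaskStateSketch task = new TaskStateSketch();
        task.handle(Event.TASK_COMPLETED);   // step 2: completion reaches the AM
        try {
            task.handle(Event.INPUT_FAILED); // step 5: late failure from the input
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the only way to avoid the exception is for the container to stop emitting events for Task A before it reports completion.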



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2426) Task input not complete before sending Task completed event

2015-05-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2426:

Attachment: TEZ-2426.1.txt

This should fix it. Main changes in the patch:
- Wait for the eventRouter thread to complete before considering a task done 
and accepting the next one.
- Fixed visibility concerns in *Context.
- Moved some of the cleanup into LogicalIOProcessorRuntimeTask, since 
progress() etc. can be called often and shouldn't hit a volatile.
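
The first change can be sketched roughly as follows. This is not the code in the patch: `TaskRunnerSketch`, `runTask` and the `STOP` sentinel are hypothetical names, and the queue-based router is an assumption made only to keep the example self-contained. The point it shows is that the task is reported as completed only after `Thread.join()` on the event router has returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: none of these names are Tez APIs.
final class TaskRunnerSketch {
    // Sentinel telling the router thread that no more events will arrive.
    private static final String STOP = "__STOP__";

    // Runs one "task" and returns the ordered log of what happened.
    static List<String> runTask(String taskId, List<String> events) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Stand-in for the eventRouter thread: drains events until STOP.
        Thread eventRouter = new Thread(() -> {
            try {
                for (String e = queue.take(); !STOP.equals(e); e = queue.take()) {
                    log.add("routed:" + e);
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        }, "eventRouter-" + taskId);
        eventRouter.start();

        try {
            for (String e : events) {
                queue.put(e);
            }
            queue.put(STOP);
            // Wait for the router to finish before the task counts as done.
            eventRouter.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted waiting for event router", ie);
        }

        // Only now is completion reported, so no routed event can follow it.
        log.add("completed:" + taskId);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runTask("A", Arrays.asList("e1", "e2")));
    }
}
```

Because `join()` returns only after the router thread has terminated, every `routed:` entry precedes the `completed:` entry, and the container is free to accept the next task without the old router still touching Task A's context.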

> Task input not complete before sending Task completed event
> ---
>
> Key: TEZ-2426
> URL: https://issues.apache.org/jira/browse/TEZ-2426
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Bikas Saha
> Priority: Critical
> Attachments: TEZ-2426.1.txt, am.log, container.log
>
>
> Sequence of events
> 1) Task A starts in a container
> 2) Task A complete event comes to AM
> 3) Task B starts in the same container
> 4) Task A's input calls some method on its context. Crashes with NPE
> 5) The crash sends an input failed event for Task A to the AM
> 6) Task A state machine crashes saying cannot handle failed after success
> In some cases, a status update event may also be sent after completion,
> though it is unclear whether that is related to the failed event being sent.





[jira] [Updated] (TEZ-2426) Task input not complete before sending Task completed event

2015-05-06 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2426:

Attachment: container.log, am.log

/cc [~rajesh.balamohan] [~hitesh] [~sseth]

> Task input not complete before sending Task completed event
> ---
>
> Key: TEZ-2426
> URL: https://issues.apache.org/jira/browse/TEZ-2426
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Bikas Saha
> Priority: Critical
> Attachments: am.log, container.log
>
>
> Sequence of events
> 1) Task A starts in a container
> 2) Task A complete event comes to AM
> 3) Task B starts in the same container
> 4) Task A's input calls some method on its context. Crashes with NPE
> 5) The crash sends an input failed event for Task A to the AM
> 6) Task A state machine crashes saying cannot handle failed after success
> In some cases, a status update event may also be sent after completion,
> though it is unclear whether that is related to the failed event being sent.


