[jira] [Commented] (TEZ-3066) TaskAttemptFinishedEvent ConcurrentModificationException in recovery or history logging services

2016-01-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111081#comment-15111081
 ] 

Jeff Zhang commented on TEZ-3066:
----------------------------------

This issue can occur while a task attempt is in the process of being killed 
(TaskAttemptStateInternal.KILL_IN_PROGRESS). In that state we have already 
logged the TaskAttemptFinishedEvent, but the task attempt may still be alive 
and heartbeating with the AM, which causes the ConcurrentModificationException 
here. It might be better to log TaskAttemptFinishedEvent only in the final 
state, but that would require substantial changes to the TaskAttempt state 
machine. The easier fix is to copy dataEvents into a new list to avoid the 
ConcurrentModificationException. I also checked taGeneratedEvents in 
TaskAttemptFinishedEvent: it should not be updated after the 
TaskAttemptFinishedEvent is created; if it is, that is a bug in the 
TaskAttempt state machine. [~hitesh] [~bikassaha] Please help review it. 
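
To illustrate the copy-the-list idea described above, here is a minimal 
sketch. It is not the Tez code: FinishedEvent, write() and the string events 
are hypothetical stand-ins, and the late heartbeat is simulated on a single 
thread so that the failure is deterministic (in the AM it would arrive from 
another thread while a logging service iterates the list).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

public class DataEventsCopySketch {

    /** Hypothetical stand-in for TaskAttemptFinishedEvent; not the real Tez class. */
    static final class FinishedEvent {
        private final List<String> dataEvents;

        FinishedEvent(List<String> liveEvents, boolean defensiveCopy) {
            // With defensiveCopy the event owns a snapshot; otherwise it shares the live list.
            this.dataEvents = defensiveCopy ? new ArrayList<>(liveEvents) : liveEvents;
        }

        /** Iterates the events as a logging service might; the callback simulates a late heartbeat. */
        int write(Runnable heartbeat) {
            int written = 0;
            for (String ignored : dataEvents) {
                if (written == 0) {
                    heartbeat.run(); // mutation arrives while iteration is in progress
                }
                written++;
            }
            return written;
        }
    }

    /** Returns true if writing the event failed with ConcurrentModificationException. */
    static boolean failsWithCme(boolean defensiveCopy) {
        List<String> live = new ArrayList<>(Arrays.asList("event-1", "event-2"));
        FinishedEvent event = new FinishedEvent(live, defensiveCopy);
        try {
            event.write(() -> live.add("late-event"));
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("shared list fails: " + failsWithCme(false));
        System.out.println("copied list fails: " + failsWithCme(true));
    }
}
```

The trade-off is one extra list allocation per finished event, in exchange 
for leaving the state machine untouched.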

> TaskAttemptFinishedEvent ConcurrentModificationException in recovery or 
> history logging services
> 
>
>                 Key: TEZ-3066
>                 URL: https://issues.apache.org/jira/browse/TEZ-3066
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>            Assignee: Jeff Zhang
>         Attachments: TEZ-3066-1.patch
>
>
> A ConcurrentModificationException can occur if a TaskAttemptFinishedEvent is 
> processed by the recovery service or another history logging service.  Sample 
> stacktraces to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3066) TaskAttemptFinishedEvent ConcurrentModificationException in recovery or history logging services

2016-01-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111659#comment-15111659
 ] 

Bikas Saha commented on TEZ-3066:
---------------------------------

Please avoid bulk (wildcard) imports: "+import org.apache.tez.dag.app.dag.*;"
Could you please create a jira for a better fix, with some of the ideas 
discussed offline, and reference it in a comment near the fix?

+1.

> TaskAttemptFinishedEvent ConcurrentModificationException in recovery or 
> history logging services
> 
>
>                 Key: TEZ-3066
>                 URL: https://issues.apache.org/jira/browse/TEZ-3066
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>            Assignee: Jeff Zhang
>         Attachments: TEZ-3066-1.patch, TEZ-3066-2.patch
>
>
> A ConcurrentModificationException can occur if a TaskAttemptFinishedEvent is 
> processed by the recovery service or another history logging service.  Sample 
> stacktraces to follow.





[jira] [Commented] (TEZ-3066) TaskAttemptFinishedEvent ConcurrentModificationException in recovery or history logging services

2016-01-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111539#comment-15111539
 ] 

Jeff Zhang commented on TEZ-3066:
----------------------------------

Discussed with [~bikassaha] offline: copying the events for every task 
attempt would be too costly. I have attached another patch that avoids 
updating the events while the task attempt is in the middle of TERMINATING. 
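
The patch itself is not reproduced in this thread, so the following is only 
a rough sketch of what refusing event updates during termination could look 
like. The State enum, the Attempt class and its method names are all 
hypothetical and much simpler than the real TaskAttempt state machine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TerminatingGuardSketch {

    /** Hypothetical, simplified states; the real TaskAttemptStateInternal has more. */
    enum State { RUNNING, KILL_IN_PROGRESS, KILLED }

    /** Hypothetical stand-in for a task attempt; not the real Tez class. */
    static final class Attempt {
        private State state = State.RUNNING;
        private final List<String> generatedEvents = new ArrayList<>();

        /** Called on heartbeat; returns true only if the events were recorded. */
        synchronized boolean onHeartbeat(List<String> events) {
            if (state != State.RUNNING) {
                // The finished event may already reference generatedEvents, so leave it untouched.
                return false;
            }
            generatedEvents.addAll(events);
            return true;
        }

        /** Begins the kill and hands out the list that the finished event would log. */
        synchronized List<String> startKill() {
            state = State.KILL_IN_PROGRESS;
            return generatedEvents;
        }
    }

    public static void main(String[] args) {
        Attempt attempt = new Attempt();
        attempt.onHeartbeat(Arrays.asList("event-1"));
        List<String> logged = attempt.startKill();
        boolean accepted = attempt.onHeartbeat(Arrays.asList("late-event"));
        System.out.println("late heartbeat accepted: " + accepted);
        System.out.println("events in finished event: " + logged.size());
    }
}
```

Compared with copying, this avoids a per-attempt allocation, at the price of 
dropping events that arrive after the kill has started.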



[jira] [Commented] (TEZ-3066) TaskAttemptFinishedEvent ConcurrentModificationException in recovery or history logging services

2016-01-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111816#comment-15111816
 ] 

TezQA commented on TEZ-3066:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12783712/TEZ-3066-3.patch
  against master revision 92def52.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1431//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1431//console

This message is automatically generated.

> TaskAttemptFinishedEvent ConcurrentModificationException in recovery or 
> history logging services
> 
>
>                 Key: TEZ-3066
>                 URL: https://issues.apache.org/jira/browse/TEZ-3066
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>            Assignee: Jeff Zhang
>         Attachments: TEZ-3066-1.patch, TEZ-3066-2.patch, TEZ-3066-3.patch
>
>
> A ConcurrentModificationException can occur if a TaskAttemptFinishedEvent is 
> processed by the recovery service or another history logging service.  Sample 
> stacktraces to follow.





[jira] [Commented] (TEZ-3066) TaskAttemptFinishedEvent ConcurrentModificationException in recovery or history logging services

2016-01-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111729#comment-15111729
 ] 

TezQA commented on TEZ-3066:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12783694/TEZ-3066-2.patch
  against master revision 92def52.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1430//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1430//console

This message is automatically generated.



[jira] [Commented] (TEZ-3066) TaskAttemptFinishedEvent ConcurrentModificationException in recovery or history logging services

2016-01-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111968#comment-15111968
 ] 

Jeff Zhang commented on TEZ-3066:
----------------------------------

[~jlowe] Thanks for reporting this issue. [~bikassaha] Thanks for the review. 
Committed to master and branch-0.7.

> TaskAttemptFinishedEvent ConcurrentModificationException in recovery or 
> history logging services
> 
>
>                 Key: TEZ-3066
>                 URL: https://issues.apache.org/jira/browse/TEZ-3066
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>            Assignee: Jeff Zhang
>             Fix For: 0.7.1, 0.8.3
>
>         Attachments: TEZ-3066-1.patch, TEZ-3066-2.patch, TEZ-3066-3.patch
>
>
> A ConcurrentModificationException can occur if a TaskAttemptFinishedEvent is 
> processed by the recovery service or another history logging service.  Sample 
> stacktraces to follow.


