[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538801#comment-13538801
 ] 

Hudson commented on MAPREDUCE-4890:
---

Integrated in Hadoop-Mapreduce-trunk #1292 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1292/])
MAPREDUCE-4890. Invalid TaskImpl state transitions when task fails while 
speculating. Contributed by Jason Lowe (Revision 1425223)

 Result = FAILURE
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1425223
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java


> Invalid TaskImpl state transitions when task fails while speculating
> 
>
> Key: MAPREDUCE-4890
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4890
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.0.2-alpha, 0.23.5
> Reporter: Jason Lowe
> Assignee: Jason Lowe
>Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.6
>
> Attachments: MAPREDUCE-4890.patch
>
>
> There are a couple of issues when a task fails while speculating (i.e., 
> multiple attempts are active):
> # The other active attempts are not killed.
> # TaskImpl's FAILED state does not handle the T_ATTEMPT_* set of events, which 
> can be sent from the other active attempts.  These all need to be handled 
> since they can be sent asynchronously from the other active task attempts.
> Failure to handle this properly means that jobs which are configured to 
> tolerate failures via mapreduce.map.failures.maxpercent or 
> mapreduce.reduce.failures.maxpercent and which also speculate can easily end up 
> failing due to invalid state transitions rather than completing successfully 
> with a few explicitly allowed task failures.
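
For readers following along, the two points in the description can be 
illustrated with a small, self-contained sketch. This is not the code from 
MAPREDUCE-4890.patch and not Hadoop's TaskImpl: the class 
SpeculatingTaskSketch, its methods, and the maxFailedAttempts parameter are 
all hypothetical names chosen for illustration. The sketch only assumes that 
a task becomes terminal once enough attempts fail, and shows (1) requesting 
a kill for every attempt that is still active and (2) absorbing late 
T_ATTEMPT_* events instead of rejecting them.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of a task with speculative attempts.
// It is NOT Hadoop's TaskImpl; every name below is illustrative.
public class SpeculatingTaskSketch {
    public enum TaskState { RUNNING, SUCCEEDED, FAILED }
    public enum TaskEvent { T_ATTEMPT_SUCCEEDED, T_ATTEMPT_FAILED, T_ATTEMPT_KILLED }

    private final int maxFailedAttempts; // assumed per-task failure limit
    private final Set<Integer> activeAttempts = new LinkedHashSet<>();
    private final List<Integer> killRequests = new ArrayList<>();
    private TaskState state = TaskState.RUNNING;
    private int failedAttempts = 0;

    public SpeculatingTaskSketch(int maxFailedAttempts) {
        this.maxFailedAttempts = maxFailedAttempts;
    }

    public void launchAttempt(int attemptId) {
        activeAttempts.add(attemptId);
    }

    public void handle(TaskEvent event, int attemptId) {
        // Whatever the event, the reporting attempt is no longer active.
        activeAttempts.remove(attemptId);
        if (state != TaskState.RUNNING) {
            // Issue 2: once the task is terminal, late T_ATTEMPT_* events from
            // the other attempts are expected, so they are absorbed here.
            return;
        }
        switch (event) {
            case T_ATTEMPT_FAILED:
                failedAttempts++;
                if (failedAttempts >= maxFailedAttempts) {
                    finish(TaskState.FAILED);
                }
                break;
            case T_ATTEMPT_SUCCEEDED:
                finish(TaskState.SUCCEEDED);
                break;
            case T_ATTEMPT_KILLED:
                break; // in this sketch a killed attempt is not a failure
        }
    }

    private void finish(TaskState terminal) {
        // Issue 1: ask every attempt that is still active to stop.
        killRequests.addAll(activeAttempts);
        state = terminal;
    }

    public TaskState state() { return state; }

    public List<Integer> killRequests() { return killRequests; }

    public static void main(String[] args) {
        SpeculatingTaskSketch task = new SpeculatingTaskSketch(1);
        task.launchAttempt(0);
        task.launchAttempt(1); // speculative attempt
        task.handle(TaskEvent.T_ATTEMPT_FAILED, 0);
        System.out.println(task.state() + " kill requests: " + task.killRequests());
        task.handle(TaskEvent.T_ATTEMPT_FAILED, 1); // late event, no exception
        System.out.println(task.state());
    }
}
```

In this model the second T_ATTEMPT_FAILED arrives after the task is already 
FAILED and is simply dropped, so a job that tolerates a few task failures 
would not be pushed into an error state by it.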

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538787#comment-13538787
 ] 

Hudson commented on MAPREDUCE-4890:
---

Integrated in Hadoop-Hdfs-trunk #1262 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1262/])
MAPREDUCE-4890. Invalid TaskImpl state transitions when task fails while 
speculating. Contributed by Jason Lowe (Revision 1425223)

 Result = FAILURE
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1425223
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java


[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538773#comment-13538773
 ] 

Hudson commented on MAPREDUCE-4890:
---

Integrated in Hadoop-Hdfs-0.23-Build #471 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/471/])
svn merge -c 1425223 FIXES: MAPREDUCE-4890. Invalid TaskImpl state 
transitions when task fails while speculating. Contributed by Jason Lowe 
(Revision 1425227)

 Result = UNSTABLE
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1425227
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java


[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538743#comment-13538743
 ] 

Hudson commented on MAPREDUCE-4890:
---

Integrated in Hadoop-Yarn-trunk #73 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/73/])
MAPREDUCE-4890. Invalid TaskImpl state transitions when task fails while 
speculating. Contributed by Jason Lowe (Revision 1425223)

 Result = SUCCESS
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1425223
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java


[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538636#comment-13538636
 ] 

Hudson commented on MAPREDUCE-4890:
---

Integrated in Hadoop-trunk-Commit #3154 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3154/])
MAPREDUCE-4890. Invalid TaskImpl state transitions when task fails while 
speculating. Contributed by Jason Lowe (Revision 1425223)

 Result = SUCCESS
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1425223
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java


[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-21 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538451#comment-13538451
 ] 

Thomas Graves commented on MAPREDUCE-4890:
--

+1 looks good. Thanks Jason. Go ahead and commit.


[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536248#comment-13536248
 ] 

Hadoop QA commented on MAPREDUCE-4890:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12561746/MAPREDUCE-4890.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3140//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3140//console

This message is automatically generated.


[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536204#comment-13536204
 ] 

Jason Lowe commented on MAPREDUCE-4890:
---

I was wrong about the KILLED state.  KILL_WAIT should handle cleaning up any 
lingering attempts, and by the time the task transitions from KILL_WAIT to 
KILLED there should be no active task attempts and therefore no chance of 
receiving T_ATTEMPT_* events.
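
The reasoning above can be sketched as a tiny model. This is only an 
illustration under stated assumptions, not Hadoop's TaskImpl: the class 
KillWaitSketch and its methods are hypothetical. It assumes a kill request 
moves the task to KILL_WAIT while any attempt is still active, and that the 
task only moves on to KILLED after the last attempt has reported in, which 
is why KILLED never needs to accept attempt events.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the KILL_WAIT -> KILLED hand-off; not Hadoop code.
public class KillWaitSketch {
    public enum State { RUNNING, KILL_WAIT, KILLED }

    private final Set<Integer> activeAttempts = new HashSet<>();
    private State state = State.RUNNING;

    public void launchAttempt(int attemptId) {
        if (state != State.RUNNING) {
            throw new IllegalStateException("cannot launch in " + state);
        }
        activeAttempts.add(attemptId);
    }

    // A kill request only reaches KILLED directly when nothing is running.
    public void kill() {
        if (state == State.RUNNING) {
            state = activeAttempts.isEmpty() ? State.KILLED : State.KILL_WAIT;
        }
    }

    // Called when an attempt reports any terminal T_ATTEMPT_* event.
    public void attemptFinished(int attemptId) {
        if (state == State.KILLED) {
            // By construction no attempt is active here, so this is a bug.
            throw new IllegalStateException("attempt event received in KILLED");
        }
        activeAttempts.remove(attemptId);
        if (state == State.KILL_WAIT && activeAttempts.isEmpty()) {
            state = State.KILLED; // last lingering attempt has been cleaned up
        }
    }

    public State state() { return state; }

    public int activeAttemptCount() { return activeAttempts.size(); }

    public static void main(String[] args) {
        KillWaitSketch task = new KillWaitSketch();
        task.launchAttempt(0);
        task.launchAttempt(1);
        task.kill();
        System.out.println(task.state());
        task.attemptFinished(0);
        task.attemptFinished(1);
        System.out.println(task.state() + " active=" + task.activeAttemptCount());
    }
}
```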


[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535605#comment-13535605
 ] 

Jason Lowe commented on MAPREDUCE-4890:
---

Note that it appears the task KILLED state also needs to handle the various 
T_ATTEMPT_* events since they could arrive asynchronously and legitimately be 
received in that state.


[jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating

2012-12-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535603#comment-13535603
 ] 

Jason Lowe commented on MAPREDUCE-4890:
---

Example exception trace when a speculative attempt fails after the task already 
failed:

{noformat}
2012-12-18 01:06:35,885 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1354689281155_256490_m_00_4 TaskAttempt Transitioned from FAIL_TASK_CLEANUP to FAILED
2012-12-18 01:06:35,887 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Can't handle this event at current state for task_1354689281155_256490_m_00
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: T_ATTEMPT_FAILED at FAILED
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:642)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:95)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:984)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:978)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
    at java.lang.Thread.run(Thread.java:619)
2012-12-18 01:06:35,888 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Invalid event T_ATTEMPT_FAILED on Task task_1354689281155_256490_m_00
2012-12-18 01:06:35,909 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1354689281155_256490Job Transitioned from RUNNING to ERROR
{noformat}
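
The failure mode in the trace can be reproduced in miniature with a 
table-driven state machine. The sketch below is an assumption-laden 
illustration, not the YARN StateMachineFactory API: InvalidTransitionSketch 
and its constructor flag are hypothetical, and IllegalStateException stands 
in for the exception type in the trace. It shows that when no transition is 
registered for an event in the FAILED state, a second T_ATTEMPT_FAILED is 
rejected, while registering a self-transition lets the event be absorbed.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical table-driven state machine; not the YARN StateMachineFactory.
public class InvalidTransitionSketch {
    public enum State { RUNNING, FAILED }
    public enum Event { T_ATTEMPT_FAILED }

    // transitions.get(state).get(event) yields the next state, if registered.
    private final Map<State, Map<Event, State>> transitions = new EnumMap<>(State.class);
    private State current = State.RUNNING;

    public InvalidTransitionSketch(boolean failedAbsorbsAttemptEvents) {
        addTransition(State.RUNNING, Event.T_ATTEMPT_FAILED, State.FAILED);
        if (failedAbsorbsAttemptEvents) {
            // Self-transition: FAILED stays FAILED when a late event arrives.
            addTransition(State.FAILED, Event.T_ATTEMPT_FAILED, State.FAILED);
        }
    }

    private void addTransition(State from, Event event, State to) {
        transitions
            .computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class))
            .put(event, to);
    }

    public State handle(Event event) {
        Map<Event, State> row = transitions.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            // Mirrors the "Invalid event: ... at ..." message in the trace.
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = next;
        return current;
    }

    public static void main(String[] args) {
        InvalidTransitionSketch withoutSelfTransition = new InvalidTransitionSketch(false);
        withoutSelfTransition.handle(Event.T_ATTEMPT_FAILED);
        try {
            withoutSelfTransition.handle(Event.T_ATTEMPT_FAILED);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        InvalidTransitionSketch withSelfTransition = new InvalidTransitionSketch(true);
        withSelfTransition.handle(Event.T_ATTEMPT_FAILED);
        System.out.println(withSelfTransition.handle(Event.T_ATTEMPT_FAILED));
    }
}
```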
