[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409839#comment-13409839
 ] 

Robert Joseph Evans commented on MAPREDUCE-4252:
------------------------------------------------

In the transitions for MapRetroactiveFailureTransition the postStates are 
listed as only TaskState.SCHEDULED and TaskState.FAILED.  TaskState.SUCCEEDED 
needs to also be in there. 

This is causing an exception which results in internal error event to be sent, 
that we are not detecting in the tests.  It would be good to have the tests 
look for these internal error events, for all of the TaskImpl Tests (but that 
might be a separate JIRA).

{noformat}
2012-07-09 15:00:22,423 ERROR [main] impl.TaskImpl (TaskImpl.java:handle(606)) 
- Can't handle this event at current state for task_1341864022346_0001_m_000001
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
T_ATTEMPT_FAILED at SUCCEEDED
        at 
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:383)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
        at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:604)
        at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskImpl.testSpeculativeTaskAttemptSucceedsEvenIfFirstFails(TestTaskImpl.java:485)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
        at 
org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
        at 
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
        at 
org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
        at 
org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:81)
        at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
{noformat}

                
> MR2 job never completes with 1 pending task
> -------------------------------------------
>
>                 Key: MAPREDUCE-4252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.1
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: MAPREDUCE-4252.patch, MAPREDUCE-4252.patch, 
> MAPREDUCE-4252.patch, MapReduce.png
>
>
> This was found by ATM:
> bq. I ran a teragen with 1000 map tasks. Many task attempts failed, but after 
> 999 of the tasks had completed, the job is now sitting forever with 1 task 
> "pending".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to