[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-07-11 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706192#comment-13706192
 ] 

Mayank Bansal commented on YARN-299:


Sure [~vinodkv]. I am reopening YARN-820 and closing this one.

Thanks,
Mayank

> Node Manager throws 
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> RESOURCE_FAILED at DONE
> ---
>
> Key: YARN-299
> URL: https://issues.apache.org/jira/browse/YARN-299
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.0.1-alpha, 2.0.0-alpha
>Reporter: Devaraj K
>Assignee: Mayank Bansal
> Attachments: YARN-299-trunk-1.patch, YARN-299-trunk-2.patch
>
>
> {code:xml}
> 2012-12-31 10:36:27,844 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Can't handle this event at current state: Current: [DONE], eventType: 
> [RESOURCE_FAILED]
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> RESOURCE_FAILED at DONE
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:819)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:71)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:504)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:497)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
>   at java.lang.Thread.run(Thread.java:662)
> 2012-12-31 10:36:27,845 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Container container_1356792558130_0002_01_01 transitioned from DONE to 
> null
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-07-09 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703886#comment-13703886
 ] 

Vinod Kumar Vavilapalli commented on YARN-299:
--

I still cannot come up with a theory that explains the original bug that 
[~devaraj.k] reported. Things have changed a lot via YARN-543 since this was 
opened; maybe this bug has disappeared completely. I can see clear fixes for 
YARN-820, so why not reopen that and fix things there?



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-07-09 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703820#comment-13703820
 ] 

Omkar Vinit Joshi commented on YARN-299:


My bad, I missed it; canceling the patch. I think we need to remove the 
LOCALIZED state transition; let [~vinodkv] confirm, though. As Vinod pointed 
out, we can't get the LOCALIZED / FAILED events in the DONE state at all, 
because we have a single dispatcher thread and all the events generated from 
LocalizedResource will get processed before the Container receives the 
"ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP" event in the 
"ContainerState.LOCALIZATION_FAILED" state.
However, there is a situation in the PRIVATE/APPLICATION resource localizer in 
which we notify the container of a failed localization if the localizer thread 
(for that container) fails for some reason. Check the code below in 
ResourceLocalizationService.java:
{code}
dispatcher.getEventHandler().handle(
    new ContainerResourceFailedEvent(cId, null, e.getMessage()));
{code}
This is what happened in YARN-820 too, as is evident from the logs. Sorry for 
missing it.
In summary, we will have to keep the FAILED transition but remove the LOCALIZED 
transition.
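
The summary above can be illustrated with a toy state machine. This is only a 
sketch under stated assumptions, not the actual ContainerImpl transition table: 
the class ContainerStateSketch, its reduced State and Event enums, and the 
IGNORED_AT_DONE set are hypothetical names, and only the failure path discussed 
in this comment is modeled. A late RESOURCE_FAILED at DONE is tolerated as a 
no-op, while RESOURCE_LOCALIZED at DONE is still rejected.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical toy model; it does not use or mirror the Hadoop StateMachineFactory API.
class ContainerStateSketch {
    enum State { LOCALIZING, LOCALIZATION_FAILED, DONE }
    enum Event { RESOURCE_LOCALIZED, RESOURCE_FAILED, CONTAINER_RESOURCES_CLEANEDUP }

    // Assumption for this sketch: only RESOURCE_FAILED is ignored once DONE.
    private static final Set<Event> IGNORED_AT_DONE = EnumSet.of(Event.RESOURCE_FAILED);

    private State current = State.LOCALIZING;

    State getState() {
        return current;
    }

    void handle(Event event) {
        switch (current) {
            case LOCALIZING:
                if (event == Event.RESOURCE_FAILED) {
                    current = State.LOCALIZATION_FAILED;
                    return;
                }
                break;
            case LOCALIZATION_FAILED:
                if (event == Event.CONTAINER_RESOURCES_CLEANEDUP) {
                    current = State.DONE;
                    return;
                }
                break;
            case DONE:
                if (IGNORED_AT_DONE.contains(event)) {
                    return; // late failure notification: stay in DONE
                }
                break;
        }
        // Any pair not listed above is treated as an invalid transition.
        throw new IllegalStateException("Invalid event: " + event + " at " + current);
    }

    public static void main(String[] args) {
        ContainerStateSketch container = new ContainerStateSketch();
        container.handle(Event.RESOURCE_FAILED);
        container.handle(Event.CONTAINER_RESOURCES_CLEANEDUP);
        container.handle(Event.RESOURCE_FAILED); // ignored at DONE
        System.out.println(container.getState());
        try {
            container.handle(Event.RESOURCE_LOCALIZED); // still invalid at DONE
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keeping RESOURCE_LOCALIZED out of the ignored set means an unexpected event of 
that type would still surface in the logs instead of being silently dropped.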



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-07-09 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703639#comment-13703639
 ] 

Omkar Vinit Joshi commented on YARN-299:


As a good practice, it is better to grab the lock inside try {} and release it 
in finally {} to avoid deadlocks caused by either an exception or future code 
modification. It looks good otherwise. +1
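
A minimal sketch of the lock-and-release-in-finally idiom, applied to a 
toString that reads a LinkedList. The class GuardedResourceSketch and its 
fields are hypothetical and are not taken from the patch. The sketch calls 
lock() immediately before the try block, which is the placement shown in the 
java.util.concurrent.locks.Lock documentation; that placement is a choice made 
here, not something the patch is known to do.

```java
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class for illustration; it is not the Hadoop LocalizedResource.
class GuardedResourceSketch {
    private final Lock lock = new ReentrantLock();
    // LinkedList is not thread-safe, so every access below holds the lock.
    private final Queue<String> refs = new LinkedList<>();

    void addRef(String containerId) {
        lock.lock();
        try {
            refs.add(containerId);
        } finally {
            lock.unlock(); // runs even if add() throws
        }
    }

    @Override
    public String toString() {
        lock.lock();
        try {
            // Iterating while another thread mutates the list could fail,
            // so the read is done under the same lock as the writes.
            return "GuardedResourceSketch{refs=[" + String.join(", ", refs) + "]}";
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        GuardedResourceSketch resource = new GuardedResourceSketch();
        resource.addRef("container_1");
        resource.addRef("container_2");
        System.out.println(resource);
    }
}
```

Because unlock() sits in finally, a later change that adds an early return or 
a throwing call inside the guarded section cannot leave the lock held.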



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-07-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702918#comment-13702918
 ] 

Hadoop QA commented on YARN-299:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12591378/YARN-299-trunk-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1437//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1437//console

This message is automatically generated.



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-07-08 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702897#comment-13702897
 ] 

Mayank Bansal commented on YARN-299:


Thanks [~ojoshi] for the review.

I think it makes sense for RESOURCE_LOCALIZED, and I have added that.

Ideally the toString changes should be part of a different JIRA; however, they 
are so trivial that I did them in this JIRA.

Updating the patch.

Thanks,
Mayank



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-07-08 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702456#comment-13702456
 ] 

Omkar Vinit Joshi commented on YARN-299:


I guess the patch looks good overall. However, we need an additional fix for a 
case that might also occur. The root cause for this is more evident in the 
YARN-820 logs. The container requests multiple resources, and 
RESOURCE_LOCALIZED / RESOURCE_FAILED events might occur for one or more 
resources between the time the container receives the first RESOURCE_FAILED 
event and the time it deregisters itself from the remaining resources. 
Therefore we might see RESOURCE_FAILED / RESOURCE_LOCALIZED events (for 
different resources) sent to ContainerImpl when the container is in the DONE 
state. Therefore, like RESOURCE_FAILED, we should also ignore the 
RESOURCE_LOCALIZED event.
I could see one more issue in the logs. It would be great if we fixed that too 
as part of this JIRA, since it looks like a quick change. Here, LOG.info calls 
toString on LocalizedResource, which is not thread-safe for ref (a LinkedList 
is used internally). I guess grabbing the write lock inside toString will 
protect it from such exceptions. We need to check other state machines as well.

{code}
} catch (ExecutionException e) {
  LOG.info("Failed to download rsrc " + assoc.getResource(),
  e.getCause());
  LocalResourceRequest req = assoc.getResource().getRequest();
  publicRsrc.handle(new ResourceFailedLocalizationEvent(req,
  e.getMessage()));
  assoc.getResource().unlock();
{code}

Any thoughts?



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-07-02 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698414#comment-13698414
 ] 

Mayank Bansal commented on YARN-299:


This patch does not need rebasing.

Thanks,
Mayank



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-06-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679940#comment-13679940
 ] 

Hadoop QA commented on YARN-299:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12587131/YARN-299-trunk-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1182//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1182//console

This message is automatically generated.



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-06-07 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678437#comment-13678437
 ] 

Mayank Bansal commented on YARN-299:


It looks like there is a race condition here when a container is killed during 
the localization process. LocalizerRunner will send RESOURCE_FAILED because the 
killed container is trying to fetch resources from directories that have 
already been cleaned up. 

In the meantime, the container is killed and, after cleanup, it reaches the 
DONE state.

Thanks,
Mayank



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-03-03 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591976#comment-13591976
 ] 

Devaraj K commented on YARN-299:


It does not occur every time; it appeared in the logs during a long run.

> Node Manager throws 
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> RESOURCE_FAILED at DONE
> ---
>
> Key: YARN-299
> URL: https://issues.apache.org/jira/browse/YARN-299
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.0.1-alpha, 2.0.0-alpha
>Reporter: Devaraj K
>



[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE

2013-03-01 Thread Abhishek Kapoor (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590504#comment-13590504
 ] 

Abhishek Kapoor commented on YARN-299:
--

Please describe the steps to reproduce the issue.


> Node Manager throws 
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> RESOURCE_FAILED at DONE
> ---
>
> Key: YARN-299
> URL: https://issues.apache.org/jira/browse/YARN-299
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.0.1-alpha, 2.0.0-alpha
>Reporter: Devaraj K
>
