[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725099#comment-13725099
 ] 

Hudson commented on YARN-966:
-

SUCCESS: Integrated in Hadoop-Yarn-trunk #287 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/287/])
YARN-966. Fixed ContainerLaunch to not fail quietly when there are no localized 
resources due to some other failure. Contributed by Zhijie Shen. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1508688)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java


 The thread of ContainerLaunch#call will fail without any signal if 
 getLocalizedResources() is called when the container is not at LOCALIZED
 ---

 Key: YARN-966
 URL: https://issues.apache.org/jira/browse/YARN-966
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.1.1-beta

 Attachments: YARN-966.1.patch


 In ContainerImpl.getLocalizedResources(), there's:
 {code}
 assert ContainerState.LOCALIZED == getContainerState(); // TODO: FIXME!!
 {code}
 ContainerImpl.getLocalizedResources() is called in ContainerLaunch.call(), 
 which is scheduled on a separate thread. If the container is not at LOCALIZED 
 (e.g. it is at KILLING, see YARN-906), an AssertionError will be thrown and 
 will fail the thread without notifying the NM. As a result, 
 ContainerLaunch.call() never sends the events the container is waiting for, 
 and the container cannot move towards completion. 
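
 The silent failure described above can be reproduced outside Hadoop. The 
 following is a minimal sketch, assuming the launch is submitted to an 
 ExecutorService; SilentFailureDemo, State, launch() and submitAndInspect() are 
 hypothetical names, and the AssertionError is thrown explicitly because a 
 real assert only fires when the JVM runs with -ea.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SilentFailureDemo {
  // Hypothetical, reduced set of container states.
  public enum State { LOCALIZED, KILLING }

  // Hypothetical stand-in for ContainerLaunch.call(): fails the way an
  // enabled assert would when the container is not at LOCALIZED.
  static Callable<Integer> launch(final State state) {
    return new Callable<Integer>() {
      @Override
      public Integer call() {
        if (state != State.LOCALIZED) {
          throw new AssertionError("container is at " + state);
        }
        return 0;
      }
    };
  }

  // Submits the launch and returns the failure stored in the Future,
  // or null if the task completed normally.
  public static Throwable submitAndInspect(State state) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<Integer> result = pool.submit(launch(state));
      // The error only becomes visible here; if nobody calls get(),
      // the worker's failure goes unnoticed.
      result.get();
      return null;
    } catch (ExecutionException e) {
      return e.getCause();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(submitAndInspect(State.KILLING));
  }
}
```

 The executor stores the Throwable in the Future instead of reporting it, so 
 unless some caller inspects the Future, no component learns that the launch 
 thread died.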

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725211#comment-13725211
 ] 

Hudson commented on YARN-966:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #1477 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1477/])
YARN-966. Fixed ContainerLaunch to not fail quietly when there are no localized 
resources due to some other failure. Contributed by Zhijie Shen. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1508688)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java




[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725275#comment-13725275
 ] 

Hudson commented on YARN-966:
-

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1504 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1504/])
YARN-966. Fixed ContainerLaunch to not fail quietly when there are no localized 
resources due to some other failure. Contributed by Zhijie Shen. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1508688)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java




[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-31 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725483#comment-13725483
 ] 

Omkar Vinit Joshi commented on YARN-966:


bq. One more consideration. An empty map can mean that the container is 
at LOCALIZED but there are actually no localized resources. Returning null 
distinguishes this case from the case of fetching the localized resources when 
the container is not at LOCALIZED.
This assumption is wrong. The state of the container has nothing to do with the 
localized resources map. We can call getState and know its state irrespective 
of this null check. I don't see how we would ever start 
ContainerLaunch#call without all of its resources having been downloaded. It is 
a different issue that the user may kill the container, resulting in a state 
transition. I still think this should not be done via a NULL check. The proper 
way is to set a boolean flag on ContainerLaunch synchronously in the event of a 
KILL. 
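
The flag-based alternative suggested above might look roughly like the 
following. This is only a sketch under assumptions: FlaggedLaunch, 
markKilled() and KILLED_BEFORE_LAUNCH are hypothetical names, not part of the 
YARN-966 patch or of the NodeManager code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicBoolean;

public class FlaggedLaunch implements Callable<Integer> {
  // Hypothetical exit code for a launch that was skipped because of a kill.
  public static final int KILLED_BEFORE_LAUNCH = -1;

  // Set by the kill handler, read by the launch thread.
  private final AtomicBoolean killRequested = new AtomicBoolean(false);

  // Called synchronously by whoever handles the KILL event.
  public void markKilled() {
    killRequested.set(true);
  }

  @Override
  public Integer call() {
    // Check the flag first instead of inferring the kill from a null map.
    if (killRequested.get()) {
      return KILLED_BEFORE_LAUNCH;
    }
    // A real launch would read the localized resources and start the process here.
    return 0;
  }
}
```

With this shape the launch thread learns about the kill directly, so a null 
resource map would not be needed to signal it.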



[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-31 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725489#comment-13725489
 ] 

Omkar Vinit Joshi commented on YARN-966:


So if the user, say, kills the container, the error we will see is 
{code}
+RPCUtil.getRemoteException(
+"Unable to get local resources when Container " + containerID +
+" is at " + container.getContainerState());
{code}
which is completely misleading. This occurred because the user killed the 
container, not because it failed to localize resources.



[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-31 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725518#comment-13725518
 ] 

Vinod Kumar Vavilapalli commented on YARN-966:
--

bq. I don't see how we would ever start ContainerLaunch#call 
without all of its resources having been downloaded.
This is the most important point.

bq. I still think this should not be done via a NULL check. The proper way is 
to set a boolean flag on ContainerLaunch synchronously in the event of a KILL. 
bq. which is completely misleading. This occurred because the user killed the 
container, not because it failed to localize resources.
I think we are beating this to death. Like I said, this error SHOULD NOT 
happen in practice. I don't know why the assert was originally put in place. 
That said, I didn't want to blindly remove it without knowing why it was there 
to begin with. If we ever run into this in real life, we can fix the message.



[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-31 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725526#comment-13725526
 ] 

Zhijie Shen commented on YARN-966:
--

bq. I don't see how we would ever start ContainerLaunch#call 
without all of its resources having been downloaded.

YARN-906 is such a corner case.

bq. I still think this should not be done via a NULL check. The proper way is 
to set a boolean flag on ContainerLaunch synchronously in the event of a KILL.

The original code checks state == LOCALIZED and throws an AssertionError when 
getting the localized resources. I just changed the way the error is 
indicated, so that callers can handle it more easily. If you think 
calling getLocalizedResources() when the container is not at LOCALIZED is not 
wrong, I'm afraid we're having different conversations.

bq. which is completely misleading. This occurred because the user killed the 
container, not because it failed to localize resources.

I don't think the message is misleading. Again, getLocalizedResources() is not 
allowed to be called when the container is not at LOCALIZED (at least that is 
what the original code intends). So the message clearly states the problem. 
Please note that the kill signal is not the root cause of the thread failure 
here. If getLocalizedResources() were not called, the thread would still 
complete without an exception. 
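
To illustrate the caller-side handling being discussed, here is a minimal 
sketch of a null check that fails with a descriptive message. 
LaunchCallerSketch and requireLocalized() are hypothetical names, String keys 
stand in for Hadoop's Path, and IllegalStateException stands in for the 
exception built by RPCUtil.getRemoteException in the patch excerpt quoted 
earlier in the thread.

```java
import java.util.List;
import java.util.Map;

public class LaunchCallerSketch {
  // Hypothetical helper: a null map means getLocalizedResources() was called
  // while the container was not at LOCALIZED, so fail loudly with the state.
  public static Map<String, List<String>> requireLocalized(
      Map<String, List<String>> resources, String containerId, String state) {
    if (resources == null) {
      // IllegalStateException is an assumption made to keep the sketch
      // self-contained; the message mirrors the one quoted in the thread.
      throw new IllegalStateException(
          "Unable to get local resources when Container " + containerId
              + " is at " + state);
    }
    return resources;
  }
}
```

An exception like this can be caught and reported by the launch code, unlike 
an AssertionError that ends the thread unnoticed.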
 



[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-30 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724621#comment-13724621
 ] 

Vinod Kumar Vavilapalli commented on YARN-966:
--

Looks straight-forward. +1. Checking this in.



[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-30 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724631#comment-13724631
 ] 

Omkar Vinit Joshi commented on YARN-966:


A few questions.
{code}
-    assert ContainerState.LOCALIZED == getContainerState(); // TODO: FIXME!!
-    return localizedResources;
+    if (ContainerState.LOCALIZED == getContainerState()) {
+      return localizedResources;
+    } else {
+      return null;
+    }
{code}
After this change, are we saying that a container's partially localized 
resources will not be accessible in any state other than LOCALIZED, and that 
we will return null?



[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724638#comment-13724638
 ] 

Hudson commented on YARN-966:
-

SUCCESS: Integrated in Hadoop-trunk-Commit #4190 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4190/])
YARN-966. Fixed ContainerLaunch to not fail quietly when there are no localized 
resources due to some other failure. Contributed by Zhijie Shen. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1508688)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java




[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-30 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724642#comment-13724642
 ] 

Omkar Vinit Joshi commented on YARN-966:


* Also, I think we are mixing YARN-966 with YARN-906. I don't see why 
we should return null. We can return an empty map if no resources are 
localized. Thoughts?
{code}
  private final Map<Path,List<String>> localizedResources =
      new HashMap<Path,List<String>>();
{code}

* I guess we should not fix that TODO; it has nothing to do with the 
getLocalizedResources() call. The caller can probably call getState() to 
figure out what state the container is in. Thoughts?




[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-30 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724694#comment-13724694
 ] 

Vinod Kumar Vavilapalli commented on YARN-966:
--

bq. After this change, are we saying that a container's partially localized 
resources will not be accessible in any state other than LOCALIZED, and that 
we will return null?
There is only one user of the API today. We can change it in the future.

bq. Also, I think we are mixing YARN-966 with YARN-906. I don't see why 
we should return null. We can return an empty map if no resources are 
localized. Thoughts?
We aren't mixing the two issues. We are just turning an assert into an explicit 
check so that, in case something happens, we can report it properly. As far as 
I understand, this code path should never get triggered.



[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724704#comment-13724704
 ] 

Zhijie Shen commented on YARN-966:
--

bq. Also, I think we are mixing YARN-966 with YARN-906. I don't see why 
we should return null. We can return an empty map if no resources are 
localized. Thoughts?

One more consideration. An empty map can mean that the container is at 
LOCALIZED but there are actually no localized resources. Returning null 
distinguishes this case from the case of fetching the localized resources when 
the container is not at LOCALIZED.
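
The distinction between null and an empty map can be sketched as follows. 
This is an illustration under assumptions, not the committed code: 
LocalizedResourcesSketch, setState(), addResource() and describe() are 
hypothetical, and String keys stand in for Hadoop's Path so the example stays 
self-contained.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LocalizedResourcesSketch {
  // Hypothetical, reduced set of container states.
  public enum State { LOCALIZING, LOCALIZED, KILLING }

  private final Map<String, List<String>> localizedResources =
      new HashMap<String, List<String>>();
  private State state = State.LOCALIZING;

  public void setState(State newState) {
    state = newState;
  }

  public void addResource(String path, List<String> links) {
    localizedResources.put(path, links);
  }

  // Explicit check instead of an assert: null signals "not at LOCALIZED".
  public Map<String, List<String>> getLocalizedResources() {
    if (State.LOCALIZED == state) {
      return localizedResources;
    }
    return null;
  }

  // Shows the three cases a caller can tell apart.
  public String describe() {
    Map<String, List<String>> resources = getLocalizedResources();
    if (resources == null) {
      return "not at LOCALIZED (" + state + ")";
    }
    if (resources.isEmpty()) {
      return "LOCALIZED with no resources";
    }
    return "LOCALIZED with " + resources.size() + " resource(s)";
  }
}
```

Returning an empty map for every non-LOCALIZED state would merge the first two 
cases, which is the ambiguity the null return is meant to avoid.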



[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718138#comment-13718138
 ] 

Hadoop QA commented on YARN-966:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12593887/YARN-966.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1564//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1564//console

This message is automatically generated.



[jira] [Commented] (YARN-966) The thread of ContainerLaunch#call will fail without any signal if getLocalizedResources() is called when the container is not at LOCALIZED

2013-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718194#comment-13718194
 ] 

Hadoop QA commented on YARN-966:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12593887/YARN-966.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1572//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1572//console

This message is automatically generated.
