[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641308#comment-13641308
 ] 

Bikas Saha commented on YARN-562:
---------------------------------

Shouldn't the new exception inherit from YarnException, the common base 
class?
I actually prefer NMNotConnectedWithRMException, because NotYetReady could be 
due to various other reasons. No strong opinion, though.
Is there an existing InvalidContainerException for cases where the 
ContainerToken is invalid? How about InvalidContainerException as the name? If 
the only thing the client can do is get a new container from the RM, then there 
may not be any point in differentiating the reasons. If we really want to keep 
RM in the name, then maybe InvalidContainerFromUnknownRM, since "previous" may 
not be accurate.
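
To illustrate the naming and inheritance point above, here is a minimal 
sketch. All class names are hypothetical stand-ins, not the real YARN classes 
or the patch's code. It only shows that a single exception deriving from a 
common base lets the client handle every rejection the same way, by asking 
the RM for a new container.

```java
// Hypothetical stand-in for the project's common base exception.
class BaseYarnExceptionSketch extends Exception {
    BaseYarnExceptionSketch(String message) {
        super(message);
    }
}

// Hypothetical single exception covering any container the NM refuses to start.
class InvalidContainerExceptionSketch extends BaseYarnExceptionSketch {
    InvalidContainerExceptionSketch(String message) {
        super(message);
    }
}

class StartContainerClientSketch {
    // Simulates a client calling startContainer and reacting to a rejection.
    static String start(boolean allocatedByCurrentRM) {
        try {
            if (!allocatedByCurrentRM) {
                throw new InvalidContainerExceptionSketch(
                    "container was not allocated by the RM this NM is registered with");
            }
            return "started";
        } catch (BaseYarnExceptionSketch e) {
            // The client does not need to know the exact reason; its only
            // recourse is to ask the RM for a new container.
            return "request a new container from the RM";
        }
    }
}
```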

I think the invalidation needs to be done before sending the event, because 
technically this thread could be suspended immediately after sending the event, 
so the handler thread could run before the invalidation happens.
{code}
               dispatcher.getEventHandler().handle(
                   new NodeManagerEvent(NodeManagerEventType.RESYNC));
+              // Invalidate the RMIdentifier while resync
+              setRMIdentifier(ResourceManagerConstants.RM_INVALID_IDENTIFIER);
               break;
{code}
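
The ordering concern above could be addressed roughly as follows. This is a 
self-contained sketch with hypothetical names (ResyncOrderingSketch, onResync, 
a Consumer standing in for the dispatcher, and an assumed sentinel value of 
-1), not the actual patch. It only shows the identifier being invalidated 
before the RESYNC event is handed off.

```java
import java.util.function.Consumer;

// Hypothetical stand-in for the component that owns the RM identifier.
class ResyncOrderingSketch {
    // Assumed sentinel; the real constant lives in ResourceManagerConstants.
    static final long RM_INVALID_IDENTIFIER = -1L;

    // volatile so a handler running on another thread sees the latest value.
    private volatile long rmIdentifier = 42L;
    private final Consumer<String> eventHandler;

    ResyncOrderingSketch(Consumer<String> eventHandler) {
        this.eventHandler = eventHandler;
    }

    long getRMIdentifier() {
        return rmIdentifier;
    }

    void onResync() {
        // Invalidate first: once the event is handed off, the handler may run
        // at any time, including before this thread is scheduled again.
        rmIdentifier = RM_INVALID_IDENTIFIER;
        eventHandler.accept("RESYNC");
    }
}
```

With this order, a handler that runs as soon as the event is delivered can 
only ever observe the invalid identifier.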

It reads oddly that the container manager is notifying itself.
{code}
+
+    LOG.info("Notifying ContainerManager to block new container-requests as " +
+               "NodeManager is still starting.");
+    this.setBlockNewContainerRequests(true);
{code}

It would be good to continue looping until notified that the ContainerManager 
is no longer blocked.
{code}
+            try {
               // HERE set FLAG to stop thread
+              launchContainersThread.join();
+              super.setBlockNewContainerRequests(blockNewContainerRequests);
....
+        try {
           // HERE check FLAG to stop thread
+          while (numContainers++ < 10) {
{code}
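
One way to read the "set FLAG / check FLAG" suggestion above is sketched 
below. The names (LaunchLoopSketch, stopAndJoin, attempts) are hypothetical and 
the loop body is a placeholder for the test's startContainer attempts. The 
point is only that the thread loops on a shared flag instead of a fixed count 
of 10.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a launcher thread that loops until told to stop.
class LaunchLoopSketch {
    private final AtomicBoolean stop = new AtomicBoolean(false);
    private final AtomicInteger attempts = new AtomicInteger();

    Thread start() {
        Thread t = new Thread(() -> {
            // HERE check FLAG to stop thread
            while (!stop.get()) {
                attempts.incrementAndGet(); // placeholder for a launch attempt
                Thread.yield();
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    int attempts() {
        return attempts.get();
    }

    // HERE set FLAG to stop thread, then wait for it to finish.
    boolean stopAndJoin(Thread t, long timeoutMillis) {
        stop.set(true);
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```

The caller would set the flag once it has been notified that requests are no 
longer blocked, then join the thread as the existing test already does.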
                
> NM should reject containers allocated by previous RM
> ----------------------------------------------------
>
>                 Key: YARN-562
>                 URL: https://issues.apache.org/jira/browse/YARN-562
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-562.10.patch, YARN-562.1.patch, YARN-562.2.patch, 
> YARN-562.3.patch, YARN-562.4.patch, YARN-562.5.patch, YARN-562.6.patch, 
> YARN-562.7.patch, YARN-562.8.patch, YARN-562.9.patch
>
>
> It's possible that after the RM shuts down and before the AM goes down, the 
> AM still calls startContainer on the NM with containers allocated by the 
> previous RM. When the RM comes back, the NM doesn't know whether a container 
> launch request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.
