[ 
https://issues.apache.org/jira/browse/YARN-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13623979#comment-13623979
 ] 

Bikas Saha commented on YARN-479:
---------------------------------

Patch looks good overall!

Why is this line showing up as a diff?
{code}
-    List<ContainerStatus> containersStatuses = new ArrayList<ContainerStatus>();
...
+    List<ContainerStatus> containersStatuses = new ArrayList<ContainerStatus>();
{code}


Why is this change needed?
{code}
   public void testNMRegistration() throws InterruptedException {
+    final long connectionWaitSecs = 5;
+    final long connectionRetryIntervalSecs = 1;
+    YarnConfiguration conf = createNMConfig();
+    conf.setLong(YarnConfiguration.RESOURCEMANAGER_CONNECT_WAIT_SECS,
+        connectionWaitSecs);
+    conf.setLong(YarnConfiguration
+        .RESOURCEMANAGER_CONNECT_RETRY_INTERVAL_SECS,
+        connectionRetryIntervalSecs);
+
     nm = new NodeManager() {
{code}

And why is this change needed?
{code}
@@ -527,7 +599,6 @@ protected NodeStatusUpdater createNodeStatusUpdater(Context context,
       }
     };
 
-    YarnConfiguration conf = createNMConfig();
     nm.init(conf);
{code}

The comment can be made part of the Assert message:
{code}
+    //calculate heartBeatCount based on connectionWaitSecs and RetryIntervalSecs
+    Assert.assertTrue(heartBeatCount == 2);
{code}
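
For illustration, a minimal sketch of folding the comment into the assertion. JUnit 4 provides Assert.assertEquals(String message, long expected, long actual); the sketch below uses a local stand-in with the same shape so that it compiles without JUnit on the classpath. The class name, the message wording, and the placeholder heartBeatCount value are assumptions, not part of the patch.

```java
public class HeartbeatAssertSketch {

    // Stand-in mirroring the shape of JUnit 4's
    // Assert.assertEquals(String message, long expected, long actual).
    // Hypothetical helper, present only to keep the sketch self-contained.
    public static void assertEquals(String message, long expected, long actual) {
        if (expected != actual) {
            throw new AssertionError(
                message + " expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    public static void main(String[] args) {
        // Placeholder for the count the test would actually observe.
        long heartBeatCount = 2;

        // The former code comment now travels with the failure report.
        assertEquals(
            "heartBeatCount should follow from connectionWaitSecs and "
                + "connectionRetryIntervalSecs",
            2, heartBeatCount);
    }
}
```

With the message attached, a failing run reports why 2 was expected as well as the actual count, which a bare assertTrue(heartBeatCount == 2) does not.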

Can we pass the barrier etc. into the custom derived classes of NodeManager, 
the RM service, etc., so that we can avoid global variables?
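
A rough sketch of the idea, assuming the synchronization object is a java.util.concurrent.CyclicBarrier. StubNodeManager is a hypothetical stand-in for the test's derived NodeManager class, not the real one; the point is only that the barrier is created inside the test method and handed over through the constructor instead of living in a shared field.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;

public class BarrierInjectionSketch {

    // Hypothetical stand-in for a test-only subclass of NodeManager.
    // It receives the barrier through its constructor.
    public static class StubNodeManager {
        private final CyclicBarrier syncBarrier;

        public StubNodeManager(CyclicBarrier syncBarrier) {
            this.syncBarrier = syncBarrier;
        }

        public void start() throws Exception {
            // Rendezvous with the test thread once startup reaches this point.
            syncBarrier.await(5, TimeUnit.SECONDS);
        }
    }

    // Returns true if both parties met at the barrier without a timeout.
    public static boolean runOnce() {
        // The barrier is local to the test, so no global state is needed.
        final CyclicBarrier barrier = new CyclicBarrier(2);
        final StubNodeManager nm = new StubNodeManager(barrier);

        Thread starter = new Thread(() -> {
            try {
                nm.start();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        starter.start();

        try {
            barrier.await(5, TimeUnit.SECONDS);
            starter.join(5000);
            return !barrier.isBroken();
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("barrier tripped cleanly: " + runOnce());
    }
}
```

Because each test builds its own barrier, tests cannot interfere with one another through leftover shared state.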
                
> NM retry behavior for connection to RM should be similar for lost heartbeats
> ----------------------------------------------------------------------------
>
>                 Key: YARN-479
>                 URL: https://issues.apache.org/jira/browse/YARN-479
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Hitesh Shah
>            Assignee: Jian He
>         Attachments: YARN-479.1.patch, YARN-479.2.patch, YARN-479.3.patch, 
> YARN-479.4.patch, YARN-479.5.patch, YARN-479.6.patch, YARN-479.7.patch, 
> YARN-479.8.patch
>
>
> Regardless of connection loss at the start or at an intermediate point, NM's 
> retry behavior to the RM should follow the same flow. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira