[ 
https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761573#comment-13761573
 ] 

Hitesh Shah commented on YARN-353:
----------------------------------

hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml
  - shouldn't there be a dependency on zookeeper even in the normal scope? 

{code}
+      if (childNodeName.startsWith(DELEGATION_TOKEN_SEQUENCE_NUMBER_PREFIX)) {
+        rmState.rmSecretManagerState.dtSequenceNumber =
+            Integer.parseInt(childNodeName.split("_")[1]);
+        continue;
+      }
{code}
  - Could you clarify whether there can be multiple child nodes prefixed with 
DELEGATION_TOKEN_SEQUENCE_NUMBER_PREFIX in any possible state variation?

{code}
+          // assert child node name is same as actual applicationId
+          assert appId.equals(appState.context.getApplicationId());
{code}
  - why the need for an assert? Should this check throw a runtime exception 
instead? (likewise for other assert checks )

{code}
+    } catch (Exception e) {
+      // currently throw all exceptions. May need to respond differently for HA
+      // based on whether we have lost the right to write to ZK
+      // TODO: Revisit this post YARN-149
+      throw e;
+    }
{code}
  - I believe its better to just remove such code and add it in with HA patches.

{code}
+        /**
+         * Call exists() to leave a watch on the node denoted by path.
+         * Delete node if exists. To pass the existence information to the
+         * caller, call delete irrespective of whether node exists or not.
+         */
+        if (zkClient.exists(path, true) == null) {
+          LOG.error("Trying to delete a path (" + path
+              + ") that doesn't exist.");
+        } else {
+          zkClient.delete(path, version);
+        }
{code}
  - code does not match the comment ( with respect to passing of existence 
information )

{code}
+    } catch (Exception e) {
+      e.printStackTrace();
+      Assert.fail("ZKRMStateStore Session restore failed");
+    }
{code}
  - Don't think there is any need to catch the exception. The unit test will 
fail if the exception is not caught. If the exception stack trace in the unit 
test logs is not useful enough to understand the failure reason, it may be 
better to fix the code as needed. ( likewise in the other couple of places in 
the unit test where exceptions are being caught and handled with an 
assert.fail()

{code}
+    Thread.sleep(800);
{code}
  - the zk unit test has magic sleeps of 800 ms in some cases, 500 ms in 
others. What is the reason for these different numbers? Does the test helper 
need augmenting to remove this timing related dependency?

General minor nits:
  - 80 chars line limit exceeded in multiple files. 

                
> Add Zookeeper-based store implementation for RMStateStore
> ---------------------------------------------------------
>
>                 Key: YARN-353
>                 URL: https://issues.apache.org/jira/browse/YARN-353
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Hitesh Shah
>            Assignee: Karthik Kambatla
>         Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.12.patch, 
> yarn-353-12-wip.patch, YARN-353.13.patch, YARN-353.14.patch, 
> YARN-353.15.patch, YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, 
> YARN-353.4.patch, YARN-353.5.patch, YARN-353.6.patch, YARN-353.7.patch, 
> YARN-353.8.patch, YARN-353.9.patch
>
>
> Add store that write RM state data to ZK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to