[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036592#comment-14036592 ]
Junping Du commented on YARN-1341: ---------------------------------- Thanks for updating the patch, [~jlowe]! Some minor comments: The change in BaseContainerTokenSecretManager.java is not necessary and I believe that belongs to YARN-1342. Let’s remove it for this patch. In NodeManager.java, {code} NMTokenSecretManagerInNM nmTokenSecretManager = - new NMTokenSecretManagerInNM(); + new NMTokenSecretManagerInNM(nmStore); + + if (nmStore.canRecover()) { + nmTokenSecretManager.recover(nmStore.loadNMTokenState()); + } {code} Can we consolidate the code in a separated method together with NMContainerTokenSecretManager as we will do similar thing to recover ContainerToken staff which make code have duplicated things? In NMTokenSecretManagerInNM.java, {code} + private void updateCurrentMasterKey(MasterKeyData key) { + super.currentMasterKey = key; + try { + stateStore.storeNMTokenCurrentMasterKey(key.getMasterKey()); + } catch (IOException e) { + LOG.error("Unable to update current master key in state store", e); + } + } + + private void updatePreviousMasterKey(MasterKeyData key) { + previousMasterKey = key; + try { + stateStore.storeNMTokenPreviousMasterKey(key.getMasterKey()); + } catch (IOException e) { + LOG.error("Unable to update previous master key in state store", e); + } + } {code} Does log error here is just enough in case of failure in store? If Master key is updated but not persistent, then it could cause some inconsistency when recover it. I think we should throw some exception here if store get failed and rollback the key just set. Thoughts? > Recover NMTokens upon nodemanager restart > ----------------------------------------- > > Key: YARN-1341 > URL: https://issues.apache.org/jira/browse/YARN-1341 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager > Affects Versions: 2.3.0 > Reporter: Jason Lowe > Assignee: Jason Lowe > Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch, > YARN-1341v4-and-YARN-1987.patch, YARN-1341v5.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)