[ https://issues.apache.org/jira/browse/YARN-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822359#comment-13822359 ]
Tsuyoshi OZAWA commented on YARN-1307:
--------------------------------------

On recovery time, you're correct: your proposal can reduce the overhead of failover, which affects the failover time.

My concern is that the cost of calling removeApplicationState() gets higher if we take the approach of keeping all znodes directly under RM_APP_ROOT. If we run many applications at the same time, including short-lived ones, ZooKeeper can come under heavy load even when the RM is healthy. Is that acceptable for us? IMO, we should avoid that situation.

Separating znodes by application id is also useful from an operator's point of view. If the RM fails to launch an AppMaster from the states in ZKRMStateStore because of bugs or something else, manually deleting the invalid states under the application id becomes easier.

> Rethink znode structure for RM HA
> ---------------------------------
>
>                 Key: YARN-1307
>                 URL: https://issues.apache.org/jira/browse/YARN-1307
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-1307.1.patch, YARN-1307.2.patch, YARN-1307.3.patch, YARN-1307.4-2.patch, YARN-1307.4-3.patch, YARN-1307.4.patch, YARN-1307.5.patch
>
>
> Rethinking the znode structure for RM HA has been proposed in some JIRAs (YARN-659, YARN-1222). The motivation of this JIRA is quoted from Bikas' comment in YARN-1222:
> {quote}
> We should move to creating a node hierarchy for apps such that all znodes for an app are stored under an app znode instead of the app root znode. This will help in removeApplication and also in scaling better on ZK. The earlier code was written this way to ensure create/delete happens under a root znode for fencing. But given that we have moved to multi-operations globally, this isn't required anymore.
> {quote}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
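
To make the two layouts discussed in the comment concrete, here is a minimal sketch in Python. It only builds path strings and does not talk to ZooKeeper. The root path, the id formats, and all function names are assumptions made for illustration; they are not taken from the ZKRMStateStore code or from any of the attached patches.

```python
# Hypothetical illustration of the two znode layouts; not ZKRMStateStore code.
RM_APP_ROOT = "/rmstore/RMAppRoot"  # assumed root path, for illustration only


def flat_paths(app_id, attempt_ids):
    """Flat layout: the app znode and its attempt znodes are all siblings
    directly under RM_APP_ROOT."""
    return [f"{RM_APP_ROOT}/{app_id}"] + [f"{RM_APP_ROOT}/{a}" for a in attempt_ids]


def nested_paths(app_id, attempt_ids):
    """Per-application layout: attempt znodes live under their app znode."""
    app = f"{RM_APP_ROOT}/{app_id}"
    return [app] + [f"{app}/{a}" for a in attempt_ids]


def subtree_to_delete(all_paths, app_id):
    """Return the app znode and everything below it, deepest paths first,
    since a ZooKeeper znode cannot be deleted while it still has children."""
    prefix = f"{RM_APP_ROOT}/{app_id}"
    hits = [p for p in all_paths if p == prefix or p.startswith(prefix + "/")]
    return sorted(hits, key=lambda p: p.count("/"), reverse=True)
```

With the per-application layout, everything belonging to one application is found by walking a single subtree, which is what makes manual cleanup by an operator straightforward. With the flat layout, the same subtree walk finds only the app znode, and the attempt znodes have to be located separately among all the children of the root.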