[ https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424504#comment-13424504 ]
Tsuyoshi OZAWA commented on MAPREDUCE-4326:
-------------------------------------------

I've looked through the RM code and found that the current Recoverable interface supports storing the following state:

1. Information about applications (application IDs and the info defined in ApplicationId.java and ApplicationSubmissionContext.java).
2. Information about node managers (the node manager info defined in RMNode.java).

My questions are:

1. Is this state sufficient to store? From my reading of the code, RMContext holds other state as well, but that state is recoverable without the store.
2. When should the state be saved to the store?
3. When is the interface method getLastLoggedNodeId() used?

IMHO, we should go step by step as follows:

1. Define the RM state that is preserved in MemStore/DiskStore/ZKStore.
2. Implement a version that can be resurrected after the RM crashes (e.g. DiskStore/ZKStore).

Prototyping 2 and testing it will show whether 1 is correct.

If you have any ideas, please let me know.

> Resurrect RM Restart
> ---------------------
>
>                 Key: MAPREDUCE-4326
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, resourcemanager
>    Affects Versions: 2.0.0-alpha
>            Reporter: Arun C Murthy
>            Assignee: Bikas Saha
>         Attachments: MR-4343.1.patch
>
>
> We should resurrect 'RM Restart' which we disabled sometime during the RM
> refactor.
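
To make step 1 of the comment above a little more concrete, here is a minimal sketch of what "defining the RM state" and a MemStore-style store could look like. Everything in it is an assumption for illustration: `RmStateSketch`, `RmState`, `StateStore` and `MemStateStore` are hypothetical names, not the actual Recoverable/Store API in the RM code, and application and node manager information is reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these types are illustrative only and are NOT the
// Recoverable/Store API that exists in the RM code.
final class RmStateSketch {

    /** Snapshot of the two kinds of state listed in the comment. */
    static final class RmState {
        // application id -> submission info (stand-in for ApplicationSubmissionContext)
        final Map<String, String> applications = new LinkedHashMap<>();
        // node manager ids (stand-in for the info held in RMNode)
        final List<String> nodes = new ArrayList<>();
    }

    /** Assumed store contract: record changes, reload everything on restart. */
    interface StateStore {
        void storeApplication(String appId, String submissionInfo);
        void removeApplication(String appId);
        void storeNode(String nodeId);
        RmState load();
    }

    /** In-memory store; its contents live only as long as the JVM. */
    static final class MemStateStore implements StateStore {
        private final RmState state = new RmState();

        @Override
        public void storeApplication(String appId, String submissionInfo) {
            state.applications.put(appId, submissionInfo);
        }

        @Override
        public void removeApplication(String appId) {
            state.applications.remove(appId);
        }

        @Override
        public void storeNode(String nodeId) {
            // Keep each node id at most once.
            if (!state.nodes.contains(nodeId)) {
                state.nodes.add(nodeId);
            }
        }

        @Override
        public RmState load() {
            // Return a copy so callers cannot mutate the stored state.
            RmState copy = new RmState();
            copy.applications.putAll(state.applications);
            copy.nodes.addAll(state.nodes);
            return copy;
        }
    }

    public static void main(String[] args) {
        StateStore store = new MemStateStore();
        store.storeApplication("application_0001", "queue=default");
        store.storeNode("host1:45454");

        // A restarted RM would rebuild its view from whatever the store returns.
        RmState recovered = store.load();
        System.out.println(recovered.applications.keySet() + " " + recovered.nodes);
    }
}
```

An in-memory store like this cannot survive a real RM crash, which is why step 2 points at DiskStore/ZKStore. The sketch only shows the shape of the contract that such a persistent store would need to satisfy.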