[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wang Haoran updated YARN-149:

    Affects Version/s: (was: 2.4.0)

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: resourcemanager
> Reporter: Harsh J
> Labels: patch
> Attachments: YARN ResourceManager Automatic Failover-rev-07-21-13.pdf,
> YARN ResourceManager Automatic Failover-rev-08-04-13.pdf,
> rm-ha-phase1-approach-draft1.pdf, rm-ha-phase1-draft2.pdf
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-149:

    Component/s: resourcemanager
    Assignee: (was: Bikas Saha)

Keeping it unassigned given multiple contributors.

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: resourcemanager
> Reporter: Harsh J
> Attachments: YARN ResourceManager Automatic Failover-rev-07-21-13.pdf,
> YARN ResourceManager Automatic Failover-rev-08-04-13.pdf,
> rm-ha-phase1-approach-draft1.pdf, rm-ha-phase1-draft2.pdf
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated YARN-149:

    Summary: ResourceManager (RM) High-Availability (HA) (was: ZK-based High Availability (HA) for ResourceManager (RM))

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
>
> One of the goals presented on MAPREDUCE-279 was to have high availability.
> One way that was discussed, per Mahadev/others on
> https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
> {quote}
> Am not sure, if you already know about the MR-279 branch (the next version of
> MR framework). We've been trying to integrate ZK into the framework from the
> beginning. As for now, we are just doing restart with ZK but soon we should
> have a HA soln with ZK.
> {quote}
> There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is
> meant to track HA via ZK.
> Currently there isn't a HA solution for RM, via ZK or otherwise.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Zeyliger updated YARN-149:

    Description:
One of the goals presented on MAPREDUCE-279 was to have high availability. One way that was discussed, per Mahadev/others on https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
{quote}
Am not sure, if you already know about the MR-279 branch (the next version of MR framework). We've been trying to integrate ZK into the framework from the beginning. As for now, we are just doing restart with ZK but soon we should have a HA soln with ZK.
{quote}
There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is meant to track HA via ZK.
Currently there isn't a HA solution for RM, via ZK or otherwise.

    was:
One of the goals presented on MAPREDUCE-279 was to have high availability. One way that was discussed, per Mahadev/others on https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
{quote}
Am not sure, if you already know about the MR-279 branch (the next version of MR framework). We've been trying to integrate ZK into the framework from the beginning. As for now, we are just doing restart with ZK but soon we should have a HA soln with ZK.
{quote}
There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is meant to track HA via ZK.
Currently there isn't a HA solution for RM, via ZK or otherwise.

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
>
> One of the goals presented on MAPREDUCE-279 was to have high availability.
> One way that was discussed, per Mahadev/others on
> https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
> {quote}
> Am not sure, if you already know about the MR-279 branch (the next version of
> MR framework). We've been trying to integrate ZK into the framework from the
> beginning. As for now, we are just doing restart with ZK but soon we should
> have a HA soln with ZK.
> {quote}
> There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is
> meant to track HA via ZK.
> Currently there isn't a HA solution for RM, via ZK or otherwise.
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated YARN-149:

    Description: This jira tracks work needed to be done to support one RM instance failing over to another RM instance so that we can have RM HA. Work includes leader election, transfer of control to leader and client re-direction to new leader.

(was: One of the goals presented on MAPREDUCE-279 was to have high availability. One way that was discussed, per Mahadev/others on https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
{quote}
Am not sure, if you already know about the MR-279 branch (the next version of MR framework). We've been trying to integrate ZK into the framework from the beginning. As for now, we are just doing restart with ZK but soon we should have a HA soln with ZK.
{quote}
There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is meant to track HA via ZK.
Currently there isn't a HA solution for RM, via ZK or otherwise.)

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-149:

    Attachment: rm-ha-phase1-approach-draft1.pdf

I have uploaded a basic design/approach document for phase 1 - rm-ha-phase1-approach-draft1.pdf. The doc basically proposes the use of cold standby and a RMHADaemon wrapper around RM for HA-related code.

Please share your thoughts and comments to improve the design further.

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-149:

    Attachment: (was: rm-ha-phase1-approach-draft1.pdf)

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-149:

    Attachment: rm-ha-phase1-approach-draft1.pdf

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-149:

    Attachment: rm-ha-phase1-draft2.pdf

Uploading draft-2 with more details on the wrapper approach.

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf, rm-ha-phase1-draft2.pdf
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated YARN-149:

    Attachment: YARN ResourceManager Automatic Failover-rev-07-21-13.pdf

Attaching first revision of overall approach. I am sure something is missing and something else can be improved. Will incorporate feedback as it comes. Will soon start creating work items that make sense in a chronological ordering of work. Making incremental progress while keeping the RM stable is the desired course of action (like YARN-128).

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf, rm-ha-phase1-draft2.pdf,
> YARN ResourceManager Automatic Failover-rev-07-21-13.pdf
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated YARN-149:

    Attachment: YARN ResourceManager Automatic Failover-rev-08-04-13.pdf

Updating the document with minor updates based on comments.

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf, rm-ha-phase1-draft2.pdf,
> YARN ResourceManager Automatic Failover-rev-07-21-13.pdf,
> YARN ResourceManager Automatic Failover-rev-08-04-13.pdf
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-149:

    Assignee: (was: shenhong)

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Attachments: rm-ha-phase1-approach-draft1.pdf, rm-ha-phase1-draft2.pdf,
> YARN ResourceManager Automatic Failover-rev-07-21-13.pdf,
> YARN ResourceManager Automatic Failover-rev-08-04-13.pdf
>
> This jira tracks work needed to be done to support one RM instance failing
> over to another RM instance so that we can have RM HA. Work includes leader
> election, transfer of control to leader and client re-direction to new leader.