[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2014-12-30 Thread Wang Haoran (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wang Haoran updated YARN-149:
-
Affects Version/s: (was: 2.4.0)

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: resourcemanager
> Reporter: Harsh J
> Labels: patch
> Attachments: YARN ResourceManager Automatic 
> Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic 
> Failover-rev-08-04-13.pdf, rm-ha-phase1-approach-draft1.pdf, 
> rm-ha-phase1-draft2.pdf
>
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.
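
As a rough illustration of the three pieces named in the description above (leader election, transfer of control, and client re-direction), here is a minimal in-memory sketch. It is not the design from the attached documents and it does not use any YARN or ZooKeeper API. The names ResourceManager, Cluster, elect_leader and find_active are hypothetical and exist only for this example, and the "lowest id among live RMs wins" rule is an assumption made to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ResourceManager:
    """Toy stand-in for an RM instance (hypothetical, not the YARN class)."""
    rm_id: str
    alive: bool = True
    active: bool = False


class Cluster:
    """Hypothetical coordinator that keeps at most one live RM active."""

    def __init__(self, rms: List[ResourceManager]) -> None:
        self.rms: Dict[str, ResourceManager] = {rm.rm_id: rm for rm in rms}

    def elect_leader(self) -> Optional[str]:
        # Keep the current leader if it is still alive.
        for rm in self.rms.values():
            if rm.active and rm.alive:
                return rm.rm_id
        # Otherwise demote every RM, then promote the live RM with the
        # lowest id (assumed election rule for this sketch).
        for rm in self.rms.values():
            rm.active = False
        for rm_id in sorted(self.rms):
            if self.rms[rm_id].alive:
                self.rms[rm_id].active = True  # transfer of control
                return rm_id
        return None


def find_active(cluster: Cluster, rm_ids: List[str]) -> Optional[str]:
    """Client re-direction: probe configured RMs until one is active."""
    for rm_id in rm_ids:
        rm = cluster.rms[rm_id]
        if rm.alive and rm.active:
            return rm_id
    return None
```

A client holding the list ["rm1", "rm2"] would call find_active again after a failed request; once elect_leader has promoted the standby, the same call returns the new leader's id.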



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2014-03-22 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-149:
-

Component/s: resourcemanager
   Assignee: (was: Bikas Saha)

Keeping it unassigned given multiple contributors.

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: resourcemanager
> Reporter: Harsh J
> Attachments: YARN ResourceManager Automatic 
> Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic 
> Failover-rev-08-04-13.pdf, rm-ha-phase1-approach-draft1.pdf, 
> rm-ha-phase1-draft2.pdf
>
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-02-28 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-149:


Summary: ResourceManager (RM) High-Availability (HA)  (was: ZK-based High 
Availability (HA) for ResourceManager (RM))

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
>
> One of the goals presented on MAPREDUCE-279 was to have high availability. 
> One way that was discussed, per Mahadev/others on 
> https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
> {quote}
> Am not sure, if you already know about the MR-279 branch (the next version of 
> MR framework). We've been trying to integrate ZK into the framework from the 
> beginning. As for now, we are just doing restart with ZK but soon we should 
> have a HA soln with ZK.
> {quote}
> There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is 
> meant to track HA via ZK.
> Currently there isn't a HA solution for RM, via ZK or otherwise.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-04-15 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated YARN-149:
-

Description: 
 One of the goals presented on MAPREDUCE-279 was to have high availability. One 
way that was discussed, per Mahadev/others on 
https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:

{quote}
Am not sure, if you already know about the MR-279 branch (the next version of 
MR framework). We've been trying to integrate ZK into the framework from the 
beginning. As for now, we are just doing restart with ZK but soon we should 
have a HA soln with ZK.
{quote}

There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is 
meant to track HA via ZK.

Currently there isn't a HA solution for RM, via ZK or otherwise.

  was:
One of the goals presented on MAPREDUCE-279 was to have high availability. One 
way that was discussed, per Mahadev/others on 
https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:

{quote}
Am not sure, if you already know about the MR-279 branch (the next version of 
MR framework). We've been trying to integrate ZK into the framework from the 
beginning. As for now, we are just doing restart with ZK but soon we should 
have a HA soln with ZK.
{quote}

There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is 
meant to track HA via ZK.

Currently there isn't a HA solution for RM, via ZK or otherwise.


> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
>
>  One of the goals presented on MAPREDUCE-279 was to have high availability. 
> One way that was discussed, per Mahadev/others on 
> https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
> {quote}
> Am not sure, if you already know about the MR-279 branch (the next version of 
> MR framework). We've been trying to integrate ZK into the framework from the 
> beginning. As for now, we are just doing restart with ZK but soon we should 
> have a HA soln with ZK.
> {quote}
> There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is 
> meant to track HA via ZK.
> Currently there isn't a HA solution for RM, via ZK or otherwise.



[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-07-01 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-149:


Description: This jira tracks work needed to be done to support one RM 
instance failing over to another RM instance so that we can have RM HA. Work 
includes leader election, transfer of control to leader and client re-direction 
to new leader.  (was:  One of the goals presented on MAPREDUCE-279 was to have 
high availability. One way that was discussed, per Mahadev/others on 
https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:

{quote}
Am not sure, if you already know about the MR-279 branch (the next version of 
MR framework). We've been trying to integrate ZK into the framework from the 
beginning. As for now, we are just doing restart with ZK but soon we should 
have a HA soln with ZK.
{quote}

There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is 
meant to track HA via ZK.

Currently there isn't a HA solution for RM, via ZK or otherwise.)

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.



[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-07-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-149:
--

Attachment: rm-ha-phase1-approach-draft1.pdf

I have uploaded a basic design/approach document for phase 1: 
rm-ha-phase1-approach-draft1.pdf. The doc proposes using a cold standby and 
an RMHADaemon wrapper around the RM to hold the HA-related code.

Please share your thoughts and comments to improve the design further.

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf
>
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.



[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-07-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-149:
--

Attachment: (was: rm-ha-phase1-approach-draft1.pdf)

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf
>
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.



[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-07-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-149:
--

Attachment: rm-ha-phase1-approach-draft1.pdf

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf
>
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.



[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-07-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-149:
--

Attachment: rm-ha-phase1-draft2.pdf

Uploading draft-2 with more details on the wrapper approach.

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf, rm-ha-phase1-draft2.pdf
>
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.



[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-07-21 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-149:


Attachment: YARN ResourceManager Automatic Failover-rev-07-21-13.pdf

Attaching the first revision of the overall approach. I am sure some things 
are missing and others can be improved; I will incorporate feedback as it 
comes. I will soon start creating work items in an order that makes sense 
chronologically. The desired course of action is to make incremental progress 
while keeping the RM stable (as with YARN-128).

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf, 
> rm-ha-phase1-draft2.pdf, YARN ResourceManager Automatic 
> Failover-rev-07-21-13.pdf
>
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.



[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-08-04 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-149:


Attachment: YARN ResourceManager Automatic Failover-rev-08-04-13.pdf

Updating the document with minor changes based on comments.

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Assignee: Bikas Saha
> Attachments: rm-ha-phase1-approach-draft1.pdf, 
> rm-ha-phase1-draft2.pdf, YARN ResourceManager Automatic 
> Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic 
> Failover-rev-08-04-13.pdf
>
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.



[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-09-15 Thread shenhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shenhong updated YARN-149:
--

Assignee: (was: shenhong)

> ResourceManager (RM) High-Availability (HA)
> ---
>
> Key: YARN-149
> URL: https://issues.apache.org/jira/browse/YARN-149
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Harsh J
> Attachments: rm-ha-phase1-approach-draft1.pdf, 
> rm-ha-phase1-draft2.pdf, YARN ResourceManager Automatic 
> Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic 
> Failover-rev-08-04-13.pdf
>
>
> This jira tracks work needed to be done to support one RM instance failing 
> over to another RM instance so that we can have RM HA. Work includes leader 
> election, transfer of control to leader and client re-direction to new leader.
