Adam,

I am not able to find time to continue my work on HA. Most of my changes are already in the phase1 branch, and some changes from the review with Ken are in the "issue_16" branch.

Paul,

If Myriad is started with the 'checkpoint: true' option in the myriad-config yaml, then if the RM/Scheduler accidentally dies, the NM/Executor won't exit and will not be killed by Mesos until the 'frameworkFailoverTimeout' timeout, also specified in the myriad-config yaml, expires. As Adam mentioned, a restart of the Scheduler/RM will trigger Mesos task reconciliation, which will re-sync the TASK state of the NM tasks.

Now, about issue #55. You are right: if we start Myriad with 'checkpoint: false' and the RM/Scheduler dies, then Mesos will take out any running tasks (NMs) launched by Myriad. But unless it's a test cluster, most folks would like the cluster to be resilient to RM/Scheduler failures, and would like NM tasks to keep running until the RM/Scheduler comes back.

And, you are right, as part of the HA work, we would like to store task statuses in ZK/Mesos log to allow recovery via reconciliation after an RM failure.

Regards
Mohit

On Wed, Apr 1, 2015 at 12:47 PM, Adam Bordelon <[email protected]> wrote:

> I know one of the things Mohit was working on was scheduler HA and task
> reconciliation, so that if the RM dies, then Marathon (or another PaaS)
> could restart it elsewhere, recover its state, and reconnect to running
> tasks. In this scenario, we definitely want to keep the executors/NMs
> running when the RM/scheduler exits.
> See https://github.com/mesos/myriad/issues/13 and
> https://github.com/mesos/myriad/issues/16
> Mohit, what's the status of this work? Do you have a branch you can share
> that others can continue on, if you don't have time to complete it
> yourself?
>
> RM HA via multiple RMs is a separate consideration:
>
> http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_hag_rm_ha_config.html
> There are a few considerations with having multiple RMs (active+standby)
> hence multiple schedulers. Only one can be registered with Mesos at a time,
> so we'd want to ensure that the MyriadScheduler only registers with Mesos
> if it is the active RM. We'd also want to store/retrieve scheduler state
> (in ZK/wherever) when failing over to another RM/Scheduler, but we'll do
> that for general single-RM HA anyway.
>
> On Wed, Apr 1, 2015 at 5:33 AM, Paul Read <[email protected]> wrote:
>
> > If you don't mind adding a connection to zookeeper, storing the tasking
> > status by host and instance in zookeeper, cleaning up on a graceful RM
> > die, then you should be able to recover at virtually any point. And have
> > multiple RMs if that is a goal.
> >
> > Not sure at this point if the Executor would need to connect to zookeeper
> > or just the scheduler. At first glance I would think just the Scheduler
> > however if the RM accidentally dies and then the Executor is killed it
> > may be reasonable to have it update ZK with status...or just have any RM
> > when it comes up to re-sync by requesting a sync msg and if it does not
> > get one in a reasonable amount of time assume it's dead...could go so far
> > as to track PIDs and test to see if they are out there as well.
> >
> > Just a few thoughts.
> >
> > On Wed, Apr 1, 2015 at 5:31 AM, Paul Read <[email protected]> wrote:
> >
> > > Is it reasonable to expect the Executor and NM to exit if the
> > > RM/Scheduler accidentally dies or is killed? Or should a restart of the
> > > RM/Scheduler re-sync with the running Executor/NM?
> > >
> > > I know there is currently no mechanism to do that but I was looking at
> > > issue #55 and part of the problem/solution would be eliminated if the
> > > child tasks were to terminate if the RM dies.
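
For reference, the two myriad-config settings discussed in the reply to Paul might look roughly like the fragment below. The key names are the ones quoted in this thread. The timeout value is purely illustrative, and the unit (assumed here to be milliseconds) should be checked against the Myriad version in use.

```yaml
# Sketch of the relevant myriad-config yaml entries (values are assumptions).

# true: NMs/Executors keep running if the RM/Scheduler dies.
# false: Mesos takes out the running NM tasks when the RM/Scheduler dies.
checkpoint: true

# How long Mesos waits for the RM/Scheduler to come back before killing
# the NM tasks. Illustrative value, assumed to be in milliseconds.
frameworkFailoverTimeout: 43200000
```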
