If your product contains a custom framework, you should at least implement some form of high availability for your scheduler (as Marathon and Chronos do), or launch it through Marathon so that it is restarted when it fails.
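
As a rough illustration of the second option, here is a minimal sketch that builds a Marathon app definition for a scheduler. The app id, command, ZooKeeper address, and resource values are all hypothetical placeholders; only the field names (`id`, `cmd`, `cpus`, `mem`, `instances`) are standard Marathon app definition fields.

```python
import json


def scheduler_app_definition(app_id, command, cpus=0.5, mem_mb=512):
    """Build a Marathon app definition for a framework scheduler.

    Marathon tries to keep `instances` copies of the app running, so a
    scheduler that exits is started again.
    """
    return {
        "id": app_id,        # hypothetical app id
        "cmd": command,      # hypothetical scheduler start command
        "cpus": cpus,        # placeholder resource values
        "mem": mem_mb,
        "instances": 1,      # a single scheduler, relaunched on failure
    }


app = scheduler_app_definition(
    "/my-framework/scheduler",
    "./bin/my-scheduler --master zk://zk1:2181/mesos",
)
payload = json.dumps(app, indent=2)
print(payload)
```

The resulting JSON would be submitted to Marathon's `/v2/apps` REST endpoint. This only covers restarting the scheduler process; the scheduler still has to persist its own state somewhere durable if it needs to recover it after a restart.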

On Mon, Apr 11, 2016 at 7:27 PM, Paul Bell <arach...@gmail.com> wrote:

> Hi All,
>
> As we get closer to shipping a Mesos-based version of our product, we've
> turned our attention to "protecting" (supporting backup & recovery) of not
> only our application databases, but the cluster as well.
>
> I'm not quite sure how to begin thinking about this, but I suppose the
> usual dimensions of B/R would come into play, e.g., hot/cold, application
> consistent/crash consistent, etc.
>
> Has anyone grappled with this issue and, if so, would you be so kind as to
> share your experience and solutions?
>
> Thank you.
>
> -Paul