Piotr,

Thank you for this link. I am looking at it now, and I notice right away that Exhibitor is designed to monitor (and back up) ZooKeeper, but not anything related to Mesos itself. Don't the Mesos master & agent nodes keep at least some state outside of the ZK znodes, e.g., under the default workdir?
Shuai,

Thank you for this observation. Happily (I think), we do not have a custom framework. Presently, Marathon is the only framework that we use.

-Paul

On Mon, Apr 11, 2016 at 8:12 AM, Shuai Lin <linshuai2...@gmail.com> wrote:

> If your product contains a custom framework, you should at least
> implement some kind of high availability for your scheduler (like
> Marathon/Chronos do), or let it be launched by Marathon so it can be
> restarted when it fails.
>
> On Mon, Apr 11, 2016 at 7:27 PM, Paul Bell <arach...@gmail.com> wrote:
>
>> Hi All,
>>
>> As we get closer to shipping a Mesos-based version of our product, we've
>> turned our attention to "protecting" (supporting backup & recovery of) not
>> only our application databases, but the cluster as well.
>>
>> I'm not quite sure how to begin thinking about this, but I suppose the
>> usual dimensions of B/R would come into play, e.g., hot/cold, application
>> consistent/crash consistent, etc.
>>
>> Has anyone grappled with this issue and, if so, would you be so kind as
>> to share your experience and solutions?
>>
>> Thank you.
>>
>> -Paul