Hi Fred, hm, if the bug depends on the Ubuntu version, my random guess is that it's systemd-related. Were you able to solve the problem? If not, it would be helpful if you could provide more context and describe a minimal setup that reproduces the issue.
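In the meantime, here is a minimal sketch of what I mean by tracking the failovers from the outside: a small script that polls each master's /master/state endpoint and prints whenever the reported leader changes on any of them. The IPs are the ones from the thread below, and I'm assuming the default master port 5050; adjust both to your actual setup.

#!/usr/bin/env python3
# Rough sketch: poll each Mesos master's /master/state endpoint and print
# whenever the set of "who thinks who is leading" changes. The IPs come from
# the thread below; port 5050 is assumed to be the default master port.
import json
import time
import urllib.request

MASTERS = ["192.168.37.59", "192.168.37.58", "192.168.37.104"]

def leader_views():
    """Return a dict mapping each master to the leader it currently reports."""
    views = {}
    for host in MASTERS:
        url = "http://{}:5050/master/state".format(host)
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                state = json.loads(resp.read().decode("utf-8"))
            # The state JSON carries a "leader" field, e.g. "master@192.168.37.104:5050".
            views[host] = state.get("leader", "unknown")
        except OSError:
            views[host] = "unreachable"
    return views

if __name__ == "__main__":
    last = None
    while True:
        current = leader_views()
        if current != last:
            print(time.strftime("%H:%M:%S"), current)
            last = current
        time.sleep(5)

Correlating its output with journalctl/syslog on each box should show whether the SIGTERMs in the timeline below line up with systemd (or init) restarting the masters.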
On Thu, Dec 10, 2015 at 10:15 AM, Frederic LE BRIS <fleb...@pagesjaunes.fr> wrote:
> Thanks Alex.
>
> About the context, we use Spark on Mesos and Marathon to launch some
> Elasticsearch instances.
>
> I kill each leader one by one.
>
> By the way, as I said, our config to reproduce this behaviour is
> mesos-master on Ubuntu 12 and mesos-slave on Ubuntu 14.
>
> When I deploy only on Ubuntu 14 (master + slave), the issue disappears …
>
> Fred
>
> On 09 Dec 2015, at 16:30, Alex Rukletsov <a...@mesosphere.com> wrote:
>
> Frederic,
>
> I have skimmed through the logs and they do not seem to be complete
> (especially for master1). Could you please say which task has been killed
> (id) and which master failover triggered that? I see at least three
> failovers in the logs : ). Also, could you please share some background
> about your setup? I believe you're on systemd; do you use docker tasks?
>
> To connect our conversation to particular events, let me post here the
> chain of (potentially) interesting events and some info I mined from the
> logs.
>
> master1: 192.168.37.59 ?
> master2: 192.168.37.58
> master3: 192.168.37.104
>
> timestamp  observed by  event
> 13:48:38   master1      master1 killed by SIGTERM
> 13:48:48   master2,3    new leader elected (192.168.37.104), id=5
> 13:49:25   master2      master2 killed by SIGTERM
> 13:50:44   master2,3    new leader elected (192.168.37.59), id=7
> 14:23:34   master1      master1 killed by SIGTERM
> 14:23:44   master2,3    new leader elected (192.168.37.58), id=8
>
> One interesting thing I cannot understand: why did master3 not commit
> suicide when it lost leadership?
>
> On Mon, Dec 7, 2015 at 4:08 PM, Frederic LE BRIS <fleb...@pagesjaunes.fr> wrote:
>
>> With the context .. sorry