I agree, you want to apply the changes gradually so as not to lose a quorum. 
The problem is automating this so that it happens in a lights-out environment, 
in the cloud, without some poor slob's pager going off in the middle of the 
night :)

While health checks can reliably detect and replace a dead server on any number 
of clouds, the new server comes up with a new IP address. That server can 
reliably join the ZooKeeper ensemble. However, it is tough to automate the 
rolling restart of the other mesos servers, both masters and slaves, that needs 
to occur to keep them happy. 

One thing I have not tried is to just ignore the change, and use something to 
detect the current ZooKeeper servers just prior to starting mesos. If the 
daemons truly fail fast when they lose a ZooKeeper connection, then maybe they 
don't care that they were started with an out-of-date list of ZooKeeper servers.
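One way to sketch that "detect just prior to starting" step: a small wrapper that probes candidate ZooKeeper servers (ZooKeeper answers the four-letter command "ruok" with "imok" when healthy) and builds the zk:// connection string from the survivors before exec'ing the daemon. The hostnames, port, and znode path below are placeholder assumptions, not anything mesos ships with:

```shell
#!/bin/sh
# Sketch: build a zk:// connection string from whichever candidate servers
# look healthy at startup. Hostnames and port are hypothetical examples.

# probe_ok HOST PORT -> succeeds if the server answers ZooKeeper's "ruok"
# health check with "imok".
probe_ok() {
    [ "$(printf 'ruok' | nc -w 2 "$1" "$2" 2>/dev/null)" = "imok" ]
}

# build_zk_url "HOST1 HOST2 ..." PORT ZNODE
#   -> zk://HOST1:PORT,HOST2:PORT,.../ZNODE (empty output if no hosts)
build_zk_url() {
    live=""
    for host in $1; do
        live="${live:+$live,}$host:$2"
    done
    [ -n "$live" ] && echo "zk://$live/$3"
}
```

A wrapper would then filter the candidates with probe_ok, pass the survivors to build_zk_url, and exec mesos-master (or mesos-slave) with the resulting --zk flag. Since supervision restarts the wrapper rather than the bare daemon, every fail-fast restart re-detects the ensemble.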

What do mesos-master and mesos-slave do with the list of zookeeper servers they 
are given? Just try them in order until one works, then use that one until it 
fails? If so, and they fail fast, then letting them continue to run with a 
stale list should have no ill effects. Or do they keep retrying the servers in 
the list when a connection fails? 

Don Laidlaw


> On Nov 10, 2015, at 4:42 AM, Erik Weathers <eweath...@groupon.com> wrote:
> 
> Keep in mind that mesos is designed to "fail fast".  So when there are 
> problems (such as losing connectivity to the resolved ZooKeeper IP) the 
> daemon(s) (master & slave) die.
> 
> Due to this design, we are all supposed to run the mesos daemons under 
> "supervision", which means auto-restart after they crash.  This can be done 
> with monit/god/runit/etc.
> 
> So, to perform maintenance on ZooKeeper, I would firstly ensure the 
> mesos-master processes are running under "supervision" so that they restart 
> quickly after a ZK connectivity failure occurs.  Then proceed with standard 
> ZooKeeper maintenance (exhibitor-based or manual), pausing between downing of 
> ZK servers to ensure you have "enough" mesos-master processes running.  (I 
> *would* say "pause until you have a quorum of mesos-masters up", but if 
> you only have 2 of 3 up and then take down the ZK server that the leader is 
> connected to, that would be temporarily bad.  So I'd make sure they're all 
> up.)
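> To make that concrete, the "supervision" could be e.g. a monit stanza roughly 
> like this (the pidfile location and init script paths are assumptions that 
> vary by packaging):
> 
> ```
> check process mesos-master with pidfile /var/run/mesos/mesos-master.pid
>   start program = "/etc/init.d/mesos-master start"
>   stop program  = "/etc/init.d/mesos-master stop"
> ```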
> 
> - Erik
> 
> On Mon, Nov 9, 2015 at 11:07 PM, Marco Massenzio <ma...@mesosphere.io> wrote:
> The way I would do it in a production cluster would be *not* to use IP 
> addresses directly for the ZK ensemble, but instead to rely on some form of 
> internal DNS and use internally-resolvable hostnames (eg, {zk1, zk2, 
> ...}.prod.example.com etc) and have the provisioning tooling (Chef, Puppet, 
> Ansible, what have you) handle the setting of the hostname when 
> restarting/replacing a failing/crashed ZK server.
> 
> This way your list of ZKs given to Mesos never changes, even though the 
> FQDNs will map to different IPs / VMs.
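> As a concrete (hypothetical) example, every master would then always be 
> started with the same flag, e.g.:
> 
> ```
> mesos-master --zk=zk://zk1.prod.example.com:2181,zk2.prod.example.com:2181,zk3.prod.example.com:2181/mesos \
>              --quorum=2 --work_dir=/var/lib/mesos
> ```
> 
> and replacing a crashed ZK server just means repointing, say, 
> zk2.prod.example.com at the new VM; the flag itself never changes.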
> 
> Obviously, this may not be always desirable / feasible (eg, if your prod 
> environment does not support DNS resolution).
> 
> You are correct in that Mesos does not currently support dynamically changing 
> the ZK's addresses, but I don't know whether that's a limitation of Mesos 
> code or of the ZK C++ client driver.
> I'll look into it and let you know what I find (if anything).
> 
> --
> Marco Massenzio
> Distributed Systems Engineer
> http://codetrips.com
> 
> On Mon, Nov 9, 2015 at 6:01 AM, Donald Laidlaw <donlaid...@me.com> wrote:
> How do mesos masters and slaves react to zookeeper cluster changes? When the 
> masters and slaves start they are given a set of addresses to connect to 
> zookeeper. But over time, one of those zookeepers fails, and is replaced by a 
> new server at a new address. How should this be handled in the mesos servers?
> 
> I am guessing that mesos does not automatically detect and react to that 
> change. But obviously we should do something to keep the mesos servers happy 
> as well. What should we do?
> 
> The obvious thing is to stop the mesos servers, one at a time, and restart 
> them with the new configuration. But it would be really nice to be able to do 
> this dynamically without restarting the server. After all, coordinating a 
> rolling restart is a fairly hard job.
> 
> Any suggestions or pointers?
> 
> Best regards,
> Don Laidlaw
> 