In a production cluster I would *not* use IP addresses directly for the ZK
ensemble. Instead, I would rely on some form of internal DNS, use
internally resolvable hostnames (e.g., zk1.prod.example.com,
zk2.prod.example.com, ...), and have the provisioning tooling (Chef, Puppet,
Ansible, what have you) take care of assigning the hostname when
restarting or replacing a failed ZK server.

This way the list of ZK servers you give to Mesos never changes, even
though the FQDNs will map to different IPs / VMs over time.
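
As a rough sketch of what I mean (not taken from any real deployment), the
snippet below builds the zk:// URL that you would pass to the masters via
--zk and to the slaves via --master. The three-node ensemble, the
prod.example.com domain, the default ZK client port 2181, the /mesos znode
path, and the zk_url helper name are all assumptions for illustration.

```python
def zk_url(hosts, port=2181, path="/mesos"):
    """Build a zk:// URL from stable DNS names (hypothetical helper).

    Assumes every ensemble member listens on the same client port
    (2181 is the ZooKeeper default) and that Mesos uses the given
    znode path.
    """
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    endpoints = ",".join(f"{host}:{port}" for host in hosts)
    return f"zk://{endpoints}{path}"


# Assumed three-node ensemble named zk1..zk3 under prod.example.com.
hosts = [f"zk{i}.prod.example.com" for i in range(1, 4)]
print(zk_url(hosts))
```

Because the URL contains only names, replacing a ZK server means updating
one DNS record and leaving the Mesos configuration alone.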

Obviously, this may not always be desirable or feasible (e.g., if your prod
environment does not support DNS resolution).

You are correct that Mesos does not currently support changing the ZK
addresses dynamically, but I don't know whether that is a limitation of the
Mesos code or of the ZK C++ client driver.
I'll look into it and let you know what I find (if anything).

--
*Marco Massenzio*
Distributed Systems Engineer
http://codetrips.com

On Mon, Nov 9, 2015 at 6:01 AM, Donald Laidlaw <donlaid...@me.com> wrote:

> How do mesos masters and slaves react to zookeeper cluster changes? When
> the masters and slaves start they are given a set of addresses to connect
> to zookeeper. But over time, one of those zookeepers fails, and is replaced
> by a new server at a new address. How should this be handled in the mesos
> servers?
>
> I am guessing that mesos does not automatically detect and react to that
> change. But obviously we should do something to keep the mesos servers
> happy as well. What should we do?
>
> The obvious thing is to stop the mesos servers, one at a time, and restart
> them with the new configuration. But it would be really nice to be able to
> do this dynamically without restarting the server. After all, coordinating
> a rolling restart is a fairly hard job.
>
> Any suggestions or pointers?
>
> Best regards,
> Don Laidlaw
