Hi,

Today I experimented with running Mesos masters and slaves in a
multi-master setup using ZooKeeper, by editing the /etc/mesos/zk file on
all nodes (masters and slaves) to something like:
zk://master1:2181,master2:2181,master3:2181/mesos

I noticed that if not all masters are up when the Mesos master or slave
service is started, the service fails with an error of the form:

F0729 05:45:55.244169  2019 zookeeper.cpp:103] Failed to create ZooKeeper, zookeeper_init: No such file or directory [2]

Googling the error, I found a previous related thread [1], in which Thomas
says that this happens when ZooKeeper is unable to resolve one of the
hostnames.
Indeed, when I changed the zk string to contain only masters that are up,
it worked fine.
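
To check which entries in the zk string fail to resolve from a given node,
something like the following could be used. This is only a rough sketch: the
helper names parse_zk_url and unresolvable_hosts are my own, it assumes the
string has no user:password@ part, and it only tests name resolution from
Python's resolver, which may not match exactly what zookeeper_init does.

```python
import socket


def parse_zk_url(url):
    """Split zk://host1:port1,host2:port2/path into ([(host, port), ...], path).

    Hypothetical helper; assumes no credentials are embedded in the URL.
    """
    prefix = "zk://"
    if not url.startswith(prefix):
        raise ValueError("expected a zk:// URL")
    hosts_part, _, path = url[len(prefix):].partition("/")
    servers = []
    for entry in hosts_part.split(","):
        host, _, port = entry.partition(":")
        # Fall back to the conventional ZooKeeper client port if none is given.
        servers.append((host, int(port) if port else 2181))
    return servers, "/" + path


def unresolvable_hosts(url, resolve=socket.getaddrinfo):
    """Return the hostnames from the zk URL that the resolver cannot resolve."""
    servers, _ = parse_zk_url(url)
    bad = []
    for host, port in servers:
        try:
            resolve(host, port)
        except socket.gaierror:
            bad.append(host)
    return bad
```

Calling unresolvable_hosts() with the contents of /etc/mesos/zk on each node
should list the hostnames that would have to be removed (or made resolvable,
e.g. via /etc/hosts) before starting the service.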

My question is: why is this a requirement?
The whole point of ZooKeeper is to provide high availability when some of
the masters are down, so naturally in such cases their hostnames may not
resolve...
Is this behavior caused by Mesos itself, or by ZooKeeper?

[1]
http://mail-archives.apache.org/mod_mbox/mesos-user/201404.mbox/%3ccajrb3thcjbhd1bqjb0oevkqpawmst9-yxaqwrqo9rgft45x...@mail.gmail.com%3E
