Hi,

Today I experimented with running Mesos masters and slaves in a multi-master setup using ZooKeeper. On all nodes (masters and slaves) I edited the /etc/mesos/zk file to contain something like:

zk://master1:2181,master2:2181,master3:2181/mesos

I noticed that if not all masters are up when a Mesos master or slave service is started, I get an error of the form:

F0729 05:45:55.244169 2019 zookeeper.cpp:103] Failed to create ZooKeeper, zookeeper_init: No such file or directory [2]

Googling the error, I found a previous related thread [1], in which Thomas says that this happens when ZooKeeper is unable to resolve one of the hostnames. Indeed, when I changed the zk string to contain only the masters that are up, it worked fine.

My question is: how can this be a requirement, and why? The whole point of ZooKeeper is to allow high availability when some of the masters are down, so naturally in such cases their hostnames will not be resolved. Is this something that occurs in Mesos itself, or in ZooKeeper?

[1] http://mail-archives.apache.org/mod_mbox/mesos-user/201404.mbox/%3ccajrb3thcjbhd1bqjb0oevkqpawmst9-yxaqwrqo9rgft45x...@mail.gmail.com%3E
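
For anyone wanting to reproduce this, a quick way to see which hosts in the zk string fail to resolve on a given node is a sketch like the one below. It is only a diagnostic under assumptions: the function names `parse_zk_hosts` and `unresolvable_hosts` are made up for illustration, port 2181 is assumed when an entry has no port, and the lookup uses Python's `socket.getaddrinfo`, which is not necessarily the same lookup the ZooKeeper client library performs internally.

```python
import socket


def parse_zk_hosts(zk_url):
    """Split a zk://host:port,host:port/path string into (host, port) pairs."""
    prefix = "zk://"
    if not zk_url.startswith(prefix):
        raise ValueError("expected a URL starting with zk://")
    # Keep only the comma-separated host list, dropping the /path suffix.
    host_list = zk_url[len(prefix):].split("/", 1)[0]
    # Drop optional "user:pass@" credentials if present.
    host_list = host_list.rsplit("@", 1)[-1]
    pairs = []
    for entry in host_list.split(","):
        host, _, port = entry.partition(":")
        # Assumption: fall back to 2181 when no port is given.
        pairs.append((host, int(port) if port else 2181))
    return pairs


def unresolvable_hosts(zk_url, resolve=socket.getaddrinfo):
    """Return the hostnames from zk_url that the resolver cannot look up."""
    failed = []
    for host, port in parse_zk_hosts(zk_url):
        try:
            resolve(host, port)
        except socket.gaierror:
            failed.append(host)
    return failed
```

Calling `unresolvable_hosts("zk://master1:2181,master2:2181,master3:2181/mesos")` on a node returns the subset of hostnames that do not resolve there, which should match the entries I had to remove from the zk string to get the service to start.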