Hello,

We have a service that runs on 3 hosts for high availability. However, at any given time, exactly one instance must be active, so we are thinking of using leader election with ZooKeeper. To this end, we also start a ZK server on each service host, which gives us a 3-node ZK cluster in which each service instance is a client of its own local ZK server. We then implement leader election on top of ZooKeeper using the basic recipe: https://zookeeper.apache.org/doc/r3.1.2/recipes.html#sc_leaderElection.
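For reference, the decision step of that recipe, as we understand it, can be sketched roughly as follows. This is only a minimal illustration with no ZooKeeper client involved: the function names `sequence_of` and `election_step` and the `n_` znode prefix are made up for the example, and it assumes each instance has already created its ephemeral sequential znode and fetched the list of children of the election znode.

```python
from typing import List, Optional, Tuple


def sequence_of(znode: str) -> int:
    """Return the sequence number ZooKeeper appends to a sequential znode.

    Sequential znodes end in a zero-padded 10-digit counter,
    e.g. "n_0000000007" -> 7.
    """
    return int(znode[-10:])


def election_step(children: List[str], my_znode: str) -> Tuple[bool, Optional[str]]:
    """One pass of the recipe's decision step (hypothetical helper).

    Returns (is_leader, znode_to_watch):
    - the instance holding the lowest sequence number is the leader;
    - every other instance watches only the znode immediately below
      its own, which avoids a herd effect when the leader goes away.
    """
    ordered = sorted(children, key=sequence_of)
    position = ordered.index(my_znode)
    if position == 0:
        return True, None
    return False, ordered[position - 1]


if __name__ == "__main__":
    children = ["n_0000000003", "n_0000000001", "n_0000000002"]
    for node in children:
        print(node, election_step(children, node))
```

In a real client, this step would be run again whenever the watched znode is deleted. Note that the sketch only decides who is leader from ZooKeeper's point of view; it says nothing about how quickly a partitioned instance notices that its session is gone.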
I have the following questions regarding this approach:

1. It seems that we can run into inconsistency issues when a network partition occurs. The ZooKeeper documentation says that the inconsistency period may last “tens of seconds”. Am I understanding correctly that during this time we may have 0 or 2 leaders?
2. Is it possible to reduce this inconsistency window (say, to 3 seconds) by tweaking the tickTime and syncLimit parameters?
3. Is there a way to guarantee exactly one leader at all times? Should we implement a more complex leader election algorithm than the one suggested in the recipe (which uses ephemeral sequential znodes)?

Thanks,
Michael.