Hello Kaushal,

> 1. What is the algorithm used to elect the new leader between the remaining 2 followers?

There is a very high-level description of ZooKeeper's internal leader election algorithm here:
https://zookeeper.apache.org/doc/current/zookeeperInternals.html#sc_leaderElection

I don't know if we have more detailed documentation. If you are interested in the code, the best place to start is:
https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/FastLeaderElection.java

We also have many unit tests around leader election that can help in understanding the behaviour.

> 2. During the leader elections process in place, does the client see a 503 service unavailable for all read or write requests?

"503 Service Unavailable" is an HTTP status code. The ZooKeeper client interface does not use HTTP; it uses a (jute-based) binary protocol.

In ZooKeeper, client sessions can be kept alive for some time even if the client cannot communicate with the server. For example, if you set the client session timeout to 30 seconds and a leader election on the server side takes 10 seconds, then (as far as I remember) the ZooKeeper client library should keep the session open, so the election should not be visible to applications using ZooKeeper. Of course, no change can be submitted (and no new session can be created) while the quorum has no active leader, so I assume these operations will be blocked until the leader election finishes. One can therefore expect temporarily longer response times during a leader election.

> 3. In an ensemble of 3 nodes with 1 leader and 2 followers. Is there a way to see which node is serving read operations and which node is serving write operations?

In ZooKeeper, the current leader is responsible for all modifications of the data, and all changes made by the leader are synchronized to the followers.
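
To check which role each server currently has, you can ask the servers themselves: the reply to the `srvr` four-letter-word command contains a `Mode:` line (`leader`, `follower` or `standalone`). The following is only a minimal sketch in Python, not an official tool. The function names are my own, and the host names and client port 2181 in the usage note are assumptions to be replaced with your own values. It also assumes `srvr` is allowed by `4lw.commands.whitelist`, which as far as I know is the default.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter-word command and return the server's raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_reply):
    """Return the value of the 'Mode:' line (e.g. 'leader'), or None if absent."""
    for line in srvr_reply.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Mode":
            return value.strip()
    return None


def ensemble_roles(servers):
    """Map 'host:port' to the reported mode, or to an error text if unreachable."""
    roles = {}
    for host, port in servers:
        name = f"{host}:{port}"
        try:
            roles[name] = parse_mode(four_letter_word(host, port, "srvr"))
        except OSError as exc:
            roles[name] = f"unreachable ({exc})"
    return roles
```

Calling `ensemble_roles([("zk1", 2181), ("zk2", 2181), ("zk3", 2181)])` with these hypothetical host names returns a dictionary with one entry per server. A server that is reachable but prints no `Mode:` line (I would expect this while an election is in progress) is reported as `None`.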
The four-letter-word diagnostic interface (https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_4lw) or the HTTP admin API (https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_adminserver) can be used to find the current leader in the cluster.

However, in ZooKeeper a client can connect to any server in the quorum (the only exception is the leader itself when the leaderServes option is set to "no"), and normally all servers accept both read and write operations. A client session is handled by one server, and if the client sends a write request, that server forwards it to the current leader before sending the answer back to the client. The client doesn't need to know which server is the current leader; it can communicate with any server. Usually we list all the ZooKeeper servers when we initiate a new client session, so that the client library can fail over and load-balance between them.

In general, you might find it useful to read our documentation:
https://zookeeper.apache.org/doc/current/zookeeperOver.html

Kind regards,
Máté

On Sat, Sep 17, 2022 at 6:27 PM Steph van Schalkwyk <svanschalk...@gmail.com> wrote:

> Just google leader election site:zookeeper.apache.org
>
> On Fri, Sep 16, 2022 at 7:39 PM Kaushal Shriyan <kaushalshri...@gmail.com> wrote:
>
> > Hi,
> >
> > I am running Zookeeper version: 3.7.0 ( 3 nodes -> 1 Leader and 2
> > Followers) on CentOS Linux release 7.9.2009 (Core). In an ensemble of 3
> > nodes with 1 leader and 2 followers, if the leader goes down then two
> > servers can elect a leader among themselves. I have the below questions.
> >
> > 1. What is the algorithm used to elect the new leader between the
> >    remaining 2 followers?
> > 2. During the leader elections process in place, does the client see a
> >    503 service unavailable for all read or write requests?
> > 3. In an ensemble of 3 nodes with 1 leader and 2 followers. Is there a
> >    way to see which node is serving read operations and which node is
> >    serving write operations?
> >
> > Please guide me. Any help will be highly appreciable. Thanks in advance.
> >
> > Best Regards,
> >
> > Kaushal