Hello Ram,

sorry, I don't really understand the question. The zxid is a 64-bit number. The upper 32 bits encode the election epoch (a logical counter for leader elections), while the lower 32 bits are an auto-incremented counter for the changes committed in ZooKeeper. As far as I understand, followers forward write requests to the leader, the leader turns each one into a proposal, and each committed proposal increments the zxid. The current (latest) zxid is the same across the whole cluster (followers can of course lag behind a little, but in theory not by much if they are in sync and part of the quorum).
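In case it helps, here is a minimal Python sketch of that split. It assumes the zxid is given as a hex string in the form the "srvr" command prints it (for example 0x2000afcbf). The helper names parse_zxid and remaining_before_rollover are made up for this illustration; they are not part of ZooKeeper.

```python
from typing import Tuple

# Largest value the lower 32 bits of a zxid can hold.
ZXID_COUNTER_MAX = 0xFFFFFFFF


def parse_zxid(zxid_hex: str) -> Tuple[int, int]:
    """Split a hex zxid string into (epoch, counter).

    Hypothetical helper: the upper 32 bits are the election epoch,
    the lower 32 bits are the per-epoch change counter.
    """
    zxid = int(zxid_hex, 16)  # int() accepts an optional "0x" prefix for base 16
    epoch = zxid >> 32
    counter = zxid & ZXID_COUNTER_MAX
    return epoch, counter


def remaining_before_rollover(zxid_hex: str) -> int:
    """Number of increments left before the lower 32 bits reach 0xffffffff."""
    _, counter = parse_zxid(zxid_hex)
    return ZXID_COUNTER_MAX - counter


if __name__ == "__main__":
    sample = "0x2000afcbf"  # zxid taken from the srvr output quoted in this thread
    epoch, counter = parse_zxid(sample)
    print(f"epoch={epoch} counter={counter} "
          f"remaining={remaining_before_rollover(sample)}")
```

Sampling the zxid at two points in time and dividing the remaining headroom by the observed increase per second would give a rough estimate of when the counter runs out, assuming the write rate stays about the same.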
My understanding is that what you want to catch is the moment when the lower 32 bits of the zxid approach 0xffffffff. When the lower 32 bits reach 0xffffffff, a new leader election is triggered automatically and ZooKeeper won't be able to serve requests for a short period of time. I guess you want to control this event, and maybe restart the leader manually at a time that suits you better? But maybe I misunderstood your question.

Máté

On Tue, Aug 23, 2022 at 11:00 PM rammohan ganapavarapu <[email protected]> wrote:

> Máté,
>
> Thanks for the quick reply. Yes, I did see that the srvr command gives the
> current zxid. I also see a metric in mntr, "proposal_count", which gives the
> total number of proposals, and when we hit the zxid limit it matched the
> proposal_count metric (2^32 = 4,294,967,296). So I am trying to understand
> how this zxid gets incremented. I don't see the zxid in the logs for normal
> events, only at leader election time.
>
> Ram
>
> On Tue, Aug 23, 2022 at 10:10 AM Szalay-Bekő Máté <[email protected]> wrote:
>
> > Hello!
> >
> > I think the "srvr" 4-letter-word diagnostic command should print you the
> > current zxid. A similar command is also available through the Admin REST
> > API (if it is enabled).
> >
> > See:
> > https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_zkCommands
> >
> > An example:
> >
> > echo srvr | nc localhost 2181
> >
> > Zookeeper version: 3.5.5-136-69648f116c849ccd757e97c26d3450022d4b1dae, built on 08/08/2022 11:04 GMT
> > Latency min/avg/max: 0/0/1808
> > Received: 9599434
> > Sent: 9673689
> > Connections: 41
> > Outstanding: 0
> > Zxid: 0x2000afcbf <------------- this line
> > Mode: leader
> > Node count: 1384
> > Proposal sizes last/min/max: 32/32/4226
> >
> > The zxid is also part of the names of the snapshot and transaction log
> > files that are flushed to the file system, like log.<zxid> or
> > snapshot.<zxid>
> >
> > e.g.:
> >
> > ls -la -R /var/lib/zookeeper/version-2/
> >
> > /var/lib/zookeeper/version-2/:
> > total 57328
> > drwxr-xr-x 2 zookeeper zookeeper     4096 Aug 23 10:42 .
> > drwxr-x--- 3 zookeeper zookeeper     4096 Aug  9 10:41 ..
> > -rw-r--r-- 1 zookeeper zookeeper        1 Aug 10 17:55 acceptedEpoch
> > -rw-r--r-- 1 zookeeper zookeeper        1 Aug 10 17:55 currentEpoch
> > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 17 10:09 log.20004c9fc
> > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 19 00:37 log.20005a541
> > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 20 18:43 log.20006fc19
> > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 21 21:40 log.200087550
> > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 23 06:30 log.200096ed6
> > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 23 17:05 log.2000a9c57
> > -rw-r--r-- 1 zookeeper zookeeper  1372956 Aug 17 10:09 snapshot.20005a540
> > -rw-r--r-- 1 zookeeper zookeeper  1370403 Aug 19 00:37 snapshot.20006fc18
> > -rw-r--r-- 1 zookeeper zookeeper  1369122 Aug 20 18:43 snapshot.20008754f
> > -rw-r--r-- 1 zookeeper zookeeper  1369034 Aug 21 21:40 snapshot.200096ed4
> > -rw-r--r-- 1 zookeeper zookeeper  1379613 Aug 23 06:30 snapshot.2000a9c56
> >
> > Best regards,
> > Máté
> >
> > On Tue, Aug 23, 2022 at 6:55 PM rammohan ganapavarapu <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > We recently had a leader election due to "zxid lower 32 bits have rolled
> > > over, forcing re-election". This is the first time we are seeing this, and
> > > we are trying to understand how to find out whether the ensemble is
> > > approaching that limit. Are there any metrics available in zk to track
> > > this? How can I estimate when my zk cluster will reach this limit?
> > >
> > > Thanks,
> > > Ram
