Yes, your understanding is correct. I would like to predict the rollover
and control the leader election with a manual restart.
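
One rough way to predict the rollover is to sample the zxid twice (for example from the `srvr` output) and extrapolate the lower 32 bits linearly. The sketch below assumes a roughly constant write rate between the two samples. The function name `estimate_rollover` and the sample values in the usage note are made up for illustration and are not part of ZooKeeper.

```python
from datetime import datetime, timedelta

COUNTER_MASK = 0xFFFFFFFF  # lower 32 bits of the zxid (the per-epoch counter)


def estimate_rollover(zxid_then, time_then, zxid_now, time_now):
    """Linearly extrapolate when the lower 32 bits would reach 0xffffffff.

    Returns a datetime, or None when no estimate is possible (the epoch
    changed between the samples, or the counter did not advance).
    """
    if (zxid_then >> 32) != (zxid_now >> 32):
        return None  # a leader election happened in between; the counter was reset
    elapsed = (time_now - time_then).total_seconds()
    advanced = (zxid_now & COUNTER_MASK) - (zxid_then & COUNTER_MASK)
    if elapsed <= 0 or advanced <= 0:
        return None
    rate = advanced / elapsed  # committed transactions per second (assumed constant)
    remaining = COUNTER_MASK - (zxid_now & COUNTER_MASK)
    return time_now + timedelta(seconds=remaining / rate)
```

With two hypothetical samples taken an hour apart, `estimate_rollover(0x200000000, t0, 0x200000E10, t0 + timedelta(hours=1))` extrapolates from 3600 transactions per hour. A manual leader restart could then be scheduled well before the returned time.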

Thanks

On Wed, Aug 24, 2022, 4:51 AM Szalay-Bekő Máté <[email protected]>
wrote:

> Hello Ram,
>
> sorry, I don't really understand the question. The zxid is a 64-bit long
> number. The upper 32 bits encode an election epoch number (a logical
> time / counter for leader elections), while the lower 32 bits are an
> auto-incremented counter of all the changes made (committed) in
> ZooKeeper. As far as I understand, followers forward write requests to
> the leader, the leader turns them into proposals, and each accepted
> (committed) proposal increases the zxid by one. The "current" / "latest"
> zxid is the same in the whole cluster (of course followers can lag behind
> a little, but in theory not by much if they are in sync and part of the
> quorum).
>
> My understanding is that what you want to catch is the moment when the
> lower 32 bits of the zxid approach 0xffffffff. When the lower 32 bits
> reach 0xffffffff, a new leader election is triggered automatically and
> ZooKeeper won't be able to serve requests for a short period of time. And
> I guess you want to control this event and maybe restart the leader
> manually at a time that suits you better?
>
> But maybe I misunderstood your question.
>
> Máté
>
> On Tue, Aug 23, 2022 at 11:00 PM rammohan ganapavarapu <
> [email protected]> wrote:
>
> > Máté,
> >
> > Thanks for the quick reply. Yes, I did see that the srvr command gives
> > the current zxid. I also see a metric in mntr, "proposal_count", which
> > gives the total number of proposals, and when we hit the zxid limit its
> > value matched 2^32 (4,294,967,296). So I am trying to understand how
> > this zxid gets incremented. I don't see the zxid in the logs for normal
> > events, only at leader election time.
> >
> > Ram
> >
> >
> >
> > On Tue, Aug 23, 2022 at 10:10 AM Szalay-Bekő Máté <
> > [email protected]> wrote:
> >
> > > Hello!
> > >
> > > I think the "srvr" 4-letter-word diagnostic command should print you the
> > > current zxid. A similar command is also available through the Admin REST
> > > API (if it is enabled).
> > >
> > > See:
> > > https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_zkCommands
> > >
> > > An example:
> > >
> > >
> > > echo srvr | nc localhost 2181
> > >
> > > Zookeeper version: 3.5.5-136-69648f116c849ccd757e97c26d3450022d4b1dae,
> > > built on 08/08/2022 11:04 GMT
> > > Latency min/avg/max: 0/0/1808
> > > Received: 9599434
> > > Sent: 9673689
> > > Connections: 41
> > > Outstanding: 0
> > > Zxid: 0x2000afcbf                             <------------- this line
> > > Mode: leader
> > > Node count: 1384
> > > Proposal sizes last/min/max: 32/32/4226
> > >
> > >
> > >
> > >
> > > Also the zxid is added to the name of the snapshots / transaction log
> > > files, which are flushed to the file system. Like:  log.<zxid>  or
> > > snapshot.<zxid>
> > >
> > > e.g.:
> > >
> > > ls -la -R /var/lib/zookeeper/version-2/
> > >
> > > /var/lib/zookeeper/version-2/:
> > > total 57328
> > > drwxr-xr-x 2 zookeeper zookeeper     4096 Aug 23 10:42 .
> > > drwxr-x--- 3 zookeeper zookeeper     4096 Aug  9 10:41 ..
> > > -rw-r--r-- 1 zookeeper zookeeper        1 Aug 10 17:55 acceptedEpoch
> > > -rw-r--r-- 1 zookeeper zookeeper        1 Aug 10 17:55 currentEpoch
> > > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 17 10:09 log.20004c9fc
> > > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 19 00:37 log.20005a541
> > > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 20 18:43 log.20006fc19
> > > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 21 21:40 log.200087550
> > > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 23 06:30 log.200096ed6
> > > -rw-r--r-- 1 zookeeper zookeeper 67108880 Aug 23 17:05 log.2000a9c57
> > > -rw-r--r-- 1 zookeeper zookeeper  1372956 Aug 17 10:09 snapshot.20005a540
> > > -rw-r--r-- 1 zookeeper zookeeper  1370403 Aug 19 00:37 snapshot.20006fc18
> > > -rw-r--r-- 1 zookeeper zookeeper  1369122 Aug 20 18:43 snapshot.20008754f
> > > -rw-r--r-- 1 zookeeper zookeeper  1369034 Aug 21 21:40 snapshot.200096ed4
> > > -rw-r--r-- 1 zookeeper zookeeper  1379613 Aug 23 06:30 snapshot.2000a9c56
> > >
> > >
> > >
> > > Best regards,
> > > Máté
> > >
> > > On Tue, Aug 23, 2022 at 6:55 PM rammohan ganapavarapu <
> > > [email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > We recently had a leader election due to "*zxid lower 32 bits have
> > > > rolled over, forcing re-election*". This is the first time we are
> > > > seeing this and trying to understand how to find if the ensemble is
> > > > reaching that limit. Are there any metrics available in zk to track
> > > > this? How can I estimate when my zk cluster will reach this limit?
> > > >
> > > > Thanks,
> > > > Ram
> > > >
> > >
> >
>
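
For reference, the zxid values quoted in this thread can be decoded mechanically. The minimal sketch below pulls the zxid out of `srvr` output or out of a `log.<zxid>` / `snapshot.<zxid>` file name and splits it into the epoch (upper 32 bits) and the counter (lower 32 bits). The helper names are hypothetical and only illustrate the bit layout described above.

```python
import re


def parse_zxid(srvr_output):
    """Extract the zxid from the 'Zxid: 0x...' line of srvr output."""
    match = re.search(r"^Zxid:\s*(0x[0-9a-fA-F]+)", srvr_output, re.MULTILINE)
    if match is None:
        raise ValueError("no Zxid line found in srvr output")
    return int(match.group(1), 16)


def zxid_from_filename(name):
    """Extract the zxid from a 'log.<zxid>' or 'snapshot.<zxid>' file name."""
    prefix, _, suffix = name.partition(".")
    if prefix not in ("log", "snapshot") or not suffix:
        raise ValueError("not a transaction log or snapshot file name: " + name)
    return int(suffix, 16)  # the suffix is hexadecimal without a 0x prefix


def split_zxid(zxid):
    """Return (epoch, counter): the upper and lower 32 bits of the zxid."""
    return zxid >> 32, zxid & 0xFFFFFFFF


sample = "Outstanding: 0\nZxid: 0x2000afcbf\nMode: leader\n"
epoch, counter = split_zxid(parse_zxid(sample))
print(epoch, counter, counter / 0xFFFFFFFF)
```

For the `Zxid: 0x2000afcbf` line quoted earlier this gives epoch 2 and counter 720063, so that ensemble had used only a tiny fraction of the 32-bit counter range in its current epoch.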
