Thanks for the response, Enrico.

Please see comments below.

On Thu, Jun 15, 2023 at 5:47 AM Enrico Olivelli <[email protected]> wrote:

> Li,
> thanks for reporting your problem.
>
> Most likely you have found a bug.
>
> I have one question, related to your use case,
> is the problem that the numbers are not "unique" or that the number is
> not monotonically increasing ?
>

Technically speaking, monotonically increasing means either always
increasing or *remaining constant. *With tha*t, * the problem is only
the numbers are not "unique'. In this case, the parent cversion
remains 2147483647
after reaching Integer.MAX_VALUE, not unique any more.

>
> Do you have 2147483647 concurrent sessions and you found that two
> sessions got the same sequenceId ?
> or are you storing the sequenceId somewhere and you use it as a
> globally unique id, not only among the connected sessions but also
> among all the sessions that are ever connected to the cluster ?
>

In our use case, we create *persistent* *sequential* nodes. We store the
sequence id in the client application and use it as a globally unique id.
Currently Zookeeper guarantees the following non-overflow case but not
after overflow.

1. Monotonically increasing
2. Uniqueness
3. Sequentially increase by 1

For customers who have an overflow use case and can handle the sequence
number cycling in a circular fashion, how about having a simple patch
to support it and handle the overflow case better?  The change is adding a
condition to allow the wraparound when it flows to negative. We can also
have a property to control whether to add the additional condition if
needed.

New
===
if (parentCVersion > currentParentCVersion
                *|| parentCVersion == Integer.MIN_VALUE &&
currentParentCVersion == Integer.MAX_VALUE) *{
                parent.stat.setCversion(parentCVersion);
                parent.stat.setPzxid(zxid);
          }

Current
=====
if (parentCVersion > parent.stat.getCversion()) {
                parent.stat.setCversion(parentCVersion);
                parent.stat.setPzxid(zxid);
            }

Please let me know what you think. Any input or feedback would be
appreciated.

Thanks,

Li


> Enrico
>
> Il giorno ven 9 giu 2023 alle ore 21:10 Li Wang <[email protected]> ha
> scritto:
> >
> > Hello,
> >
> > We are running 3.7.1 in production and running into an "issue" that the
> > names of sequence nodes are not unique after the counter hits the max int
> > (i.e 2147483647) and overflows.  I would like to start a thread to
> discuss
> > the following
> >
> > 1. Is this a bug or "expected" behavior?
> > 2. Is ZK supposed to support the overflow scenario and need to make sure
> > the name is unique when overflow happens?
> >
> > The name is not unique after hitting the max int value because of we
> > have the following in zk  code base:
> >
> > 1.  The cversion of parent znode is used to build the child name in
> > PrepRequestProcessor
> >
> >         int parentCVersion = parentRecord.stat.getCversion();
> >         if (createMode.isSequential()) {
> >             path = path + String.format(Locale.ENGLISH, "%010d",
> > parentCVersion);
> >         }
> >
> >
> >
> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/
> > java/org/apache/zookeeper/server/PrepRequestProcessor.java#L668-L671
> >
> >
> > 2. The parent znode is read from either zks.outstandingChangesForPath map
> > or zk database/datatree.
> >
> >            lastChange = zks.outstandingChangesForPath.get(path);
> >             if (lastChange == null) {
> >                 DataNode n = zks.getZKDatabase().getNode(path);
> >
> >
> >
> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/PrepRequestProcessor.java#L168-L170
> >
> >
> >
> > 3. The cversion of the parent node in outstandingChangesForPath map is
> > always updated  but not in zk database as we added the following code in
> 3.6
> >
> >             if (parentCVersion > parent.stat.getCversion()) {
> >                 parent.stat.setCversion(parentCVersion);
> >                 parent.stat.setPzxid(zxid);
> >             }
> >
> >
> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L477-L480
> >
> > https://issues.apache.org/jira/browse/ZOOKEEPER-3249
> >
> >
> > When overflow happens, the new parentCversion is changed to -2147483648.
> > It's updated in the outstandingChangesForPath map. It's not updated in
> > DataTree and the value stays as 2147483647  because -2147483648 is less
> > than 2147483647, so the cVerson is inconsistent in  ZK.
> >
> > Due to the inconsistent cVersion, when the next request comes in after
> > overflow, the sequence number is non-deterministic and not unique
> depending
> > on where the parent node is read from.  It can be 2147483647 if the
> > parent node is read from DataTree or -2147483648,  -2147483647 and so on
> if
> > it's from the outstandingChangesForPath map.
> >
> > We have the following doc about unique naming but no info on  "expected"
> > behavior after overflow.
> >
> > Sequence Nodes -- Unique Naming
> >
> >
> > When creating a znode you can also request that ZooKeeper append a
> > monotonically increasing counter to the end of path. This counter is
> unique
> > to the parent znode. The counter has a format of %010d -- that is 10
> digits
> > with 0 (zero) padding (the counter is formatted in this way to simplify
> > sorting), i.e. "0000000001". See Queue Recipe for an example use of this
> > feature. Note: the counter used to store the next sequence number is a
> > signed int (4bytes) maintained by the parent node, the counter will
> > overflow when incremented beyond 2147483647 (resulting in a name
> > "-2147483648").
> >
> >
> >
> > Please let me know if you have any comments or inputs.
> >
> >
> > Thanks,
> >
> >
> > Li
>

Reply via email to