Hello,

We are running 3.7.1 in production and have run into an "issue": the
names of sequence nodes are no longer unique after the counter hits the max
int value (i.e., 2147483647) and overflows. I would like to start a thread
to discuss the following:

1. Is this a bug or "expected" behavior?
2. Is ZK supposed to support the overflow scenario, i.e. does it need to
make sure the name stays unique when the overflow happens?

The name is not unique after hitting the max int value because of the
following in the ZK code base:

1. The cversion of the parent znode is used to build the child name in
PrepRequestProcessor:

        int parentCVersion = parentRecord.stat.getCversion();
        if (createMode.isSequential()) {
            path = path + String.format(Locale.ENGLISH, "%010d", parentCVersion);
        }


https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/PrepRequestProcessor.java#L668-L671


2. The parent znode is read from either the zks.outstandingChangesForPath
map or the ZK database/DataTree:

            lastChange = zks.outstandingChangesForPath.get(path);
            if (lastChange == null) {
                DataNode n = zks.getZKDatabase().getNode(path);


https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/PrepRequestProcessor.java#L168-L170



3. The cversion of the parent node is always updated in the
outstandingChangesForPath map, but in the ZK database it is only updated
when the new value is larger, because of the following check added in 3.6:

            if (parentCVersion > parent.stat.getCversion()) {
                parent.stat.setCversion(parentCVersion);
                parent.stat.setPzxid(zxid);
            }

https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L477-L480

https://issues.apache.org/jira/browse/ZOOKEEPER-3249


When the overflow happens, the new parentCVersion becomes -2147483648.
It is updated in the outstandingChangesForPath map, but not in the
DataTree, where the value stays at 2147483647 because -2147483648 is less
than 2147483647. As a result, the cversion is inconsistent within ZK.
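
To make the divergence concrete, here is a minimal stand-alone sketch. It is
not ZooKeeper code: the class and method names are made up for illustration,
and it only assumes that the counter is a Java int and that the DataTree
applies a "greater than" check like the one quoted above.

```java
// Hypothetical stand-alone sketch; none of these names exist in ZooKeeper.
class CversionOverflowSketch {

    // Mirrors the shape of the DataTree check quoted above: the stored
    // cversion only changes when the incoming value compares greater.
    static int applyGuard(int storedCversion, int incomingCversion) {
        if (incomingCversion > storedCversion) {
            return incomingCversion;
        }
        return storedCversion;
    }

    public static void main(String[] args) {
        int dataTreeCversion = Integer.MAX_VALUE;        // 2147483647
        // Java int arithmetic wraps silently, so this is -2147483648.
        int outstandingCversion = dataTreeCversion + 1;

        // The wrapped value is "smaller", so the guard keeps the old value.
        dataTreeCversion = applyGuard(dataTreeCversion, outstandingCversion);

        System.out.println("outstanding map: " + outstandingCversion);
        System.out.println("DataTree:        " + dataTreeCversion);
    }
}
```

With these assumptions the two copies end up at -2147483648 and 2147483647
respectively, which is the inconsistency described above.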

Due to the inconsistent cversion, the sequence number assigned to the next
request after the overflow is non-deterministic and not unique: it depends
on where the parent node is read from. It can be 2147483647 if the parent
node is read from the DataTree, or -2147483648, -2147483647 and so on if
it is read from the outstandingChangesForPath map.
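
The effect on the generated names can be sketched as follows. Again this is
only an illustration, not ZooKeeper code: the "/app/lock-" prefix and the
helper name are hypothetical, and the only thing taken from the code base is
the String.format call with "%010d" quoted earlier.

```java
import java.util.Locale;

// Hypothetical stand-alone sketch; only the format string comes from ZooKeeper.
class SequenceNameSketch {

    // Same formatting as the PrepRequestProcessor snippet quoted earlier.
    static String sequentialName(String path, int parentCVersion) {
        return path + String.format(Locale.ENGLISH, "%010d", parentCVersion);
    }

    public static void main(String[] args) {
        String prefix = "/app/lock-"; // hypothetical path prefix

        // Name given to the last create before the counter wrapped.
        String lastBeforeOverflow = sequentialName(prefix, Integer.MAX_VALUE);

        // Next create, parent read from the DataTree (cversion stuck at max).
        String nextFromDataTree = sequentialName(prefix, Integer.MAX_VALUE);

        // Next create, parent read from the outstanding-changes map.
        String nextFromOutstanding = sequentialName(prefix, Integer.MIN_VALUE);

        System.out.println(lastBeforeOverflow);
        System.out.println(nextFromDataTree);
        System.out.println(nextFromOutstanding);
        System.out.println("duplicate: "
                + lastBeforeOverflow.equals(nextFromDataTree));
    }
}
```

In this sketch the DataTree path reproduces the name that was already handed
out, while the map path yields a suffix of 11 characters including the minus
sign, since "%010d" does not truncate values wider than 10 characters.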

We have the following doc about unique naming, but it has no info on the
"expected" behavior after overflow.

Sequence Nodes -- Unique Naming


When creating a znode you can also request that ZooKeeper append a
monotonically increasing counter to the end of path. This counter is unique
to the parent znode. The counter has a format of %010d -- that is 10 digits
with 0 (zero) padding (the counter is formatted in this way to simplify
sorting), i.e. "0000000001". See Queue Recipe for an example use of this
feature. Note: the counter used to store the next sequence number is a
signed int (4bytes) maintained by the parent node, the counter will
overflow when incremented beyond 2147483647 (resulting in a name
"-2147483648").



Please let me know if you have any comments or input.


Thanks,


Li
