[jira] [Created] (ZOOKEEPER-4884) FastLeaderElection WorkerSender/WorkerReceiver don't need to be Thread
Kezhu Wang created ZOOKEEPER-4884: - Summary: FastLeaderElection WorkerSender/WorkerReceiver don't need to be Thread Key: ZOOKEEPER-4884 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4884 Project: ZooKeeper Issue Type: Improvement Components: server Reporter: Kezhu Wang ZOOKEEPER-1810 replaced them with dedicated threads. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4883) Rollover leader epoch when counter part of zxid reach limit
Kezhu Wang created ZOOKEEPER-4883: - Summary: Rollover leader epoch when counter part of zxid reach limit Key: ZOOKEEPER-4883 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4883 Project: ZooKeeper Issue Type: Improvement Components: server Reporter: Kezhu Wang

Currently, zxid rollover causes re-election (ZOOKEEPER-1277), which is time consuming. ZOOKEEPER-2789 proposes to use 24 bits for the epoch and 40 bits for the counter. I do think it is promising, as [it extends the rollover interval from 49.7 days to 34.9 years assuming 1k/s ops|https://github.com/apache/zookeeper/pull/2164#issuecomment-2368107479]. But I think it is a one-way ticket, and the change of data format may require a community-wide effort to upgrade third-party libraries/tools if they are ever tied to it. Inside ZooKeeper, `acceptedEpoch` and `currentEpoch` are tied to `zxid`. Given a snapshot and a txn log, we probably need to deduce those two epoch values to join the quorum.

So, I present an alternative solution: roll over the leader epoch when the counter part of the zxid reaches its limit.
# Treat the last proposal of an epoch as the rollover proposal.
# Propose requests from the next epoch normally.
# Fence the next epoch once the rollover proposal is persisted.
# Do not write proposals from the next epoch to disk before the rollover is committed.
# The leader commits the rollover proposal once it gets quorum ACKs.
# Blocked next-epoch proposals are logged once the rollover proposal is committed on the corresponding nodes.

This results in:
# No other leader could lead using the next epoch number once the rollover proposal is considered committed.
# No proposals from the next epoch will be written to disk before the rollover proposal is considered committed.

Here is the branch; I will draft a PR later. https://github.com/kezhuw/zookeeper/tree/zxid-rollover

-- This message was sent by Atlassian Jira (v8.20.10#820010)
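The arithmetic behind those rollover estimates can be sketched in a few lines; the 32/32 helpers mirror the split ZooKeeper uses today (high 32 bits epoch, low 32 bits counter, as in {{ZxidUtils}}), the 40-bit figure is the ZOOKEEPER-2789 proposal, and the constant 1k/s op rate is the assumption stated in the issue:

```java
// Sketch of the zxid layouts under discussion: the current 32/32 split
// versus the 24/40 split proposed in ZOOKEEPER-2789. Not ZooKeeper code.
public class ZxidLayout {
    // Current layout: high 32 bits epoch, low 32 bits counter.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    static long epochOf(long zxid) { return zxid >> 32; }

    static long counterOf(long zxid) { return zxid & 0xffffffffL; }

    // Days until the counter part overflows at the given op rate.
    static double rolloverDays(int counterBits, long opsPerSecond) {
        double seconds = Math.pow(2, counterBits) / opsPerSecond;
        return seconds / 86400.0;
    }

    public static void main(String[] args) {
        long zxid = makeZxid(5, 42);
        System.out.printf("epoch=%d counter=%d%n", epochOf(zxid), counterOf(zxid));
        System.out.printf("32-bit counter at 1k ops/s: %.1f days%n", rolloverDays(32, 1000));
        System.out.printf("40-bit counter at 1k ops/s: %.1f years%n", rolloverDays(40, 1000) / 365.0);
    }
}
```

Running this reproduces the 49.7-day and 34.9-year figures quoted above.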
[jira] [Created] (ZOOKEEPER-4882) Data loss after restarting an node experienced temporary disk error and rejoin
Kezhu Wang created ZOOKEEPER-4882: - Summary: Data loss after restarting an node experienced temporary disk error and rejoin Key: ZOOKEEPER-4882 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4882 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.9.3, 3.8.4 Reporter: Kezhu Wang

The cause is multifold:
1. The leader commits a proposal once a quorum has ACKed it.
2. A proposal can be committed in a node's memory even if it has not been written to that node's disk.
3. In case of a disk error, the txn log can lag behind the in-memory database.

The above applies to both leader and follower. I have not verified the leader branch; let's consider only the follower for now.

f4. A follower that experienced a temporary disk error will have a hole in its txn log after rejoining.
f5. A restarted follower will lose the data. Worse, it can win election and propagate the data loss to the whole cluster.

I have authored commits in my repo to expose this. https://github.com/kezhuw/zookeeper/commits/data-loss-temporary-sync-disk-error/

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4881) Support non blocking ZooKeeper::close
Kezhu Wang created ZOOKEEPER-4881: - Summary: Support non blocking ZooKeeper::close Key: ZOOKEEPER-4881 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4881 Project: ZooKeeper Issue Type: New Feature Reporter: Kezhu Wang

{{ZooKeeper::close}} is synchronous; it waits until {{OpCode.closeSession}} returns. I think it would be useful to support background closing to avoid blocking the caller, just like `close(2)` for sockets. We could probably use {{ZooKeeperBuilder}} from ZOOKEEPER-4697 to enable it explicitly. That way we would not surprise anyone, just like `SO_LINGER` for sockets.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
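A minimal sketch of what background closing could look like from the caller's side, wrapping a blocking close in a {{CompletableFuture}}; {{SessionClient}} and {{closeAsync}} are hypothetical names for illustration, not ZooKeeper API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;

// Hypothetical sketch: make close non-blocking by running the existing
// synchronous close on an executor, so the caller is not blocked while
// OpCode.closeSession is in flight.
public class NonBlockingClose {
    interface SessionClient {
        void close() throws InterruptedException; // stands in for ZooKeeper::close
    }

    static CompletableFuture<Void> closeAsync(SessionClient client, Executor executor) {
        return CompletableFuture.runAsync(() -> {
            try {
                client.close();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new CompletionException(e);
            }
        }, executor);
    }
}
```

The caller can then fire-and-forget or attach a completion callback, which is the socket-like behavior the issue asks for.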
[jira] [Created] (ZOOKEEPER-4875) Support pre-constructed ZKConfig in server side
Kezhu Wang created ZOOKEEPER-4875: - Summary: Support pre-constructed ZKConfig in server side Key: ZOOKEEPER-4875 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4875 Project: ZooKeeper Issue Type: Improvement Components: server Reporter: Kezhu Wang Assignee: Kezhu Wang

Currently, `ZKConfig` is only constructed right before its usage. This makes it hard to run multiple ZooKeeper servers in one JVM with different configurations. Though we don't officially claim support for that, I think it would be good not to have such a restriction on our side. Also, accepting a pre-constructed `ZKConfig` could benefit tests by not messing up properties between client and server. See also https://github.com/apache/zookeeper/pull/2200#discussion_r1800328858

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4870) Proactive leadership transfer
Kezhu Wang created ZOOKEEPER-4870: - Summary: Proactive leadership transfer Key: ZOOKEEPER-4870 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4870 Project: ZooKeeper Issue Type: New Feature Components: java client, server Reporter: Kezhu Wang

We do have leadership transfer, but it only happens when we are removing the leader in reconfiguration. It would be nice to support it with a dedicated API; that would be really useful to reduce unavailability during rolling upgrades or leader shutdown. Also, I think it could help zxid rollover: inheriting leadership during rollover should be similar to leadership transfer at the protocol level.

https://www.usenix.org/conference/atc12/technical-sessions/presentation/shraer
{quote}
we investigate the effect of reconfigurations removing the leader. Note that a server can never be added to a cluster as leader as we always prioritize the current leader. Figure 8 shows the advantage of designating a new leader when removing the current one, and thus avoiding leader election. It depicts the average time to recover from a leader crash versus the average time to regain system availability following the removal of the leader. The average is taken on 10 executions. We can see that designating a default leader saves up to 1sec, depending on the cluster size. As cluster size increases, leader election takes longer while using a default leader takes constant time regardless of the cluster size. Nevertheless, as the figure shows, cluster size always affects total leader recovery time, as it includes synchronizing state with a quorum of followers.
{quote}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4859) C client tests hang to be cancelled quite often
Kezhu Wang created ZOOKEEPER-4859: - Summary: C client tests hang to be cancelled quite often Key: ZOOKEEPER-4859 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4859 Project: ZooKeeper Issue Type: Test Components: c client, tests Reporter: Kezhu Wang Assignee: Kezhu Wang

CPPUNIT tests run sequentially. By comparing logs with a successful run, I think the hanging test could be {{Zookeeper_readOnly::testReadOnlyWithSSL}}.

Logs from the failed run.
{noformat}
 [exec] Zookeeper_multi::testWatch : elapsed 2005 : OK
 [exec] Zookeeper_multi::testSequentialNodeCreateInAsyncMulti : elapsed 2001 : OK
 [exec] Zookeeper_multi::testBigAsyncMulti : elapsed 3003 : OK
 [exec] Zookeeper_operations::testAsyncWatcher1 : elapsed 54 : OK
 [exec] Zookeeper_operations::testAsyncGetOperation : elapsed 54 : OK
 [exec] Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 382 : OK
 [exec] Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
 [exec] Zookeeper_operations::testConcurrentOperations1 : elapsed 21127 : OK
 [exec] Zookeeper_readOnly::testReadOnly ZooKeeper server started ZooKeeper server started : elapsed 12214 : OK
{noformat}

Logs from the successful run.
{noformat}
 [exec] Zookeeper_multi::testCheck : elapsed 1007 : OK
 [exec] Zookeeper_multi::testWatch : elapsed 2006 : OK
 [exec] Zookeeper_multi::testSequentialNodeCreateInAsyncMulti : elapsed 2001 : OK
 [exec] Zookeeper_multi::testBigAsyncMulti : elapsed 3003 : OK
 [exec] Zookeeper_operations::testAsyncWatcher1 : elapsed 54 : OK
 [exec] Zookeeper_operations::testAsyncGetOperation : elapsed 54 : OK
 [exec] Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 387 : OK
 [exec] Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
 [exec] Zookeeper_operations::testConcurrentOperations1 : elapsed 22459 : OK
 [exec] Zookeeper_readOnly::testReadOnly ZooKeeper server started ZooKeeper server started : elapsed 15515 : OK
 [exec] Zookeeper_readOnly::testReadOnlyWithSSL ZooKeeper server started ZooKeeper server started : elapsed 14865 : OK
{noformat}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4853) erroneous assertion in ZooKeeperQuotaTest#testQuota
Kezhu Wang created ZOOKEEPER-4853: - Summary: erroneous assertion in ZooKeeperQuotaTest#testQuota Key: ZOOKEEPER-4853 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4853 Project: ZooKeeper Issue Type: Bug Components: tests Affects Versions: 3.9.2 Reporter: Kezhu Wang Fix For: 3.10.0

Created for https://github.com/apache/zookeeper/pull/2169.

{code:java}
assertNotNull(server.getZKDatabase().getDataTree().getMaxPrefixWithQuota(path) != null, "Quota is still set");
{code}

The {{!= null}} comparison evaluates to a non-null boxed {{Boolean}}, so the assertion always passes.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
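A minimal demonstration of why the assertion can never fail: autoboxing turns the comparison into a {{Boolean}} object, which is non-null regardless of whether a quota prefix was found. The helper below is illustrative, not the test's code:

```java
// Why assertNotNull(x != null, ...) always passes: the boolean result of
// the comparison is autoboxed into a Boolean object, and Boolean.FALSE is
// just as non-null as Boolean.TRUE.
public class AlwaysNonNull {
    static Boolean check(Object maxPrefixWithQuota) {
        // Mirrors the shape of `getMaxPrefixWithQuota(path) != null`.
        return maxPrefixWithQuota != null;
    }
}
```

An {{assertNull(...)}} or {{assertNotNull(...)}} applied directly to the quota-prefix value (without the comparison) would express the intended check.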
[jira] [Created] (ZOOKEEPER-4848) Possible stack overflow in setup_random
Kezhu Wang created ZOOKEEPER-4848: - Summary: Possible stack overflow in setup_random Key: ZOOKEEPER-4848 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4848 Project: ZooKeeper Issue Type: Bug Components: c client Affects Versions: 3.9.2, 3.8.4 Reporter: Kezhu Wang

Created for https://github.com/apache/zookeeper/pull/2097.

{code:c}
int seed_len = 0;

/* Enter a loop to fill in seed with random data from /dev/urandom.
 * This is done in a loop so that we can safely handle short reads
 * which can happen due to signal interruptions.
 */
while (seed_len < sizeof(seed)) {
    /* Assert we either read something or we were interrupted due to a
     * signal (errno == EINTR) in which case we need to retry.
     */
    int rc = read(fd, &seed + seed_len, sizeof(seed) - seed_len);
    assert(rc > 0 || errno == EINTR);
    if (rc > 0) {
        seed_len += rc;
    }
}
{code}

The above code will overflow {{seed}} in case of a short read: {{&seed + seed_len}} advances the pointer in units of {{sizeof(seed)}} rather than bytes, so a retried read writes past the end of {{seed}}.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4833) Bring back ARM64 to Github Actions using macos-latest
Kezhu Wang created ZOOKEEPER-4833: - Summary: Bring back ARM64 to Github Actions using macos-latest Key: ZOOKEEPER-4833 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4833 Project: ZooKeeper Issue Type: Improvement Components: tests Reporter: Kezhu Wang

ARM64 CI was added in ZOOKEEPER-3919 but lost in ZOOKEEPER-4642 as a consequence of the migration to GitHub Actions. Now GitHub Actions has the runner [{{macos-latest}} on {{M1}}|https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories]. We can bring it back. I think it is valuable for ZooKeeper as we have {{zookeeper-client-c}}.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4821) ConnectRequest got NOTREADONLY ReplyHeader
Kezhu Wang created ZOOKEEPER-4821: - Summary: ConnectRequest got NOTREADONLY ReplyHeader Key: ZOOKEEPER-4821 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4821 Project: ZooKeeper Issue Type: Bug Components: java client, server Affects Versions: 3.9.2, 3.8.4 Reporter: Kezhu Wang

I would expect {{ConnectRequest}} to have two kinds of response under normal conditions: {{ConnectResponse}} and socket close. But if the server is configured with {{readonlymode.enabled}} but not {{localSessionsEnabled}}, the client can get {{NOTREADONLY}} in reply to a {{ConnectRequest}}. I see, at least, no handling of this in the Java client, and I encountered it while writing tests for a Rust client. I guess it is not by design, and we probably could close the socket in that early phase. But it could also be solved on the client side, since {{sizeof(ConnectResponse)}} is larger than {{sizeof(ReplyHeader)}}; then we gain the ability to carry an error for {{ConnectRequest}}, which {{ConnectResponse}} cannot.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4819) Can't seek for writable tls server if connected to readonly server
Kezhu Wang created ZOOKEEPER-4819: - Summary: Can't seek for writable tls server if connected to readonly server Key: ZOOKEEPER-4819 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4819 Project: ZooKeeper Issue Type: Bug Components: java client Affects Versions: 3.9.2, 3.8.4 Reporter: Kezhu Wang

{{[ClientCnxn::pingRwServer|https://github.com/apache/zookeeper/blob/d12aba599233b0fcba0b9b945ed3d2f45d4016f0/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L1280]}} uses a raw socket to issue the "isro" 4lw command. This results in an unsuccessful handshake with a TLS server.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4750) RequestPathMetricsCollector does not align with FinalRequestProcessor
Kezhu Wang created ZOOKEEPER-4750: - Summary: RequestPathMetricsCollector does not align with FinalRequestProcessor Key: ZOOKEEPER-4750 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4750 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.9.0 Reporter: Kezhu Wang For example, it does not handle {{createTTL}}. {noformat} 2023-09-30 17:46:59,212 [myid:] - ERROR [SyncThread:0:o.a.z.s.u.RequestPathMetricsCollector@216] - We should not handle 21 {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4749) Request timeout is not respected for asynchronous api
Kezhu Wang created ZOOKEEPER-4749: - Summary: Request timeout is not respected for asynchronous api Key: ZOOKEEPER-4749 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4749 Project: ZooKeeper Issue Type: Bug Components: java client Affects Versions: 3.9.0 Reporter: Kezhu Wang

"zookeeper.request.timeout" is only consulted in the synchronous code path.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4747) Java api lacks synchronous version of sync() call
Kezhu Wang created ZOOKEEPER-4747: - Summary: Java api lacks synchronous version of sync() call Key: ZOOKEEPER-4747 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4747 Project: ZooKeeper Issue Type: New Feature Components: java client Reporter: Kezhu Wang Assignee: Kezhu Wang Fix For: 3.10.0

Ideally, it should be redundant, just as [~breed] says in ZOOKEEPER-1167.
{quote}
it wasn't an oversight. there is no reason for a synchronous version. because of the ordering guarantees, if you issue an asynchronous sync, the next call, whether synchronous or asynchronous will see the updated state.
{quote}
But in case of connection loss, and absent ZOOKEEPER-22, the client has to check the result of the asynchronous sync before the next call. So, currently, we can't simply issue a fire-and-forget asynchronous sync plus a read to gain strong consistency. In a synchronous call chain, the client then has to convert the asynchronous {{sync}} into a synchronous one to gain strong consistency. This is what I do in [EagerACLFilterTest::syncClient|https://github.com/apache/zookeeper/blob/f42c01de73867ffbc12707b3e9f9cd7f847fe462/zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/EagerACLFilterTest.java#L98]; it is apparently unfriendly to end users.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
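The conversion described above can be sketched generically with a {{CountDownLatch}}; the {{AsyncCall}} interface below is illustrative (it merely mirrors the shape of ZooKeeper's void callbacks), not the test helper or ZooKeeper API:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

// Generic sketch of turning a callback-style asynchronous call into a
// synchronous one: block on a latch that the completion callback releases,
// then inspect the result code, as a client must do after sync() under
// connection loss.
public class SyncAdapter {
    interface AsyncCall {
        // cb receives (rc, path), mirroring VoidCallback#processResult.
        void invoke(BiConsumer<Integer, String> cb);
    }

    static int callSynchronously(AsyncCall call, long timeoutMs) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        int[] rc = {-1};
        call.invoke((code, path) -> {
            rc[0] = code;
            latch.countDown();
        });
        if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("no reply within " + timeoutMs + "ms");
        }
        return rc[0];
    }
}
```

Requiring every caller to write this boilerplate is exactly the unfriendliness the issue describes; a built-in synchronous {{sync}} would hide it.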
[jira] [Created] (ZOOKEEPER-4746) cppunit tests hang and cancelled
Kezhu Wang created ZOOKEEPER-4746: - Summary: cppunit tests hang and cancelled Key: ZOOKEEPER-4746 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4746 Project: ZooKeeper Issue Type: Test Components: tests Affects Versions: 3.10.0 Reporter: Kezhu Wang

* https://github.com/apache/zookeeper/actions/runs/6007712384/job/16337953123
* https://github.com/apache/zookeeper/actions/runs/6047057349/job/16409786315
* https://github.com/apache/zookeeper/actions/runs/6195151365/job/16819317479
* https://github.com/apache/zookeeper/actions/runs/6196548582/job/16823409398

The tests hang until they are cancelled by the runner.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4745) End to End tests fail occasionally
Kezhu Wang created ZOOKEEPER-4745: - Summary: End to End tests fail occasionally Key: ZOOKEEPER-4745 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4745 Project: ZooKeeper Issue Type: Test Components: tests Affects Versions: 3.10.0 Reporter: Kezhu Wang

I saw:
* https://github.com/apache/zookeeper/actions/runs/5587157838/job/15131211778
* https://github.com/kezhuw/zookeeper/actions/runs/5251205631/job/14209201285
* https://github.com/kezhuw/zookeeper/actions/runs/6198985701/job/16830576384#step:9:38
* https://github.com/apache/zookeeper/actions/runs/6244974218/job/16952757583#step:11:44

{noformat}
2023-07-18 12:08:34,046 [myid:] - ERROR [main:o.a.z.u.ServiceUtils@48] - Exiting JVM with code 1
ZooKeeper JMX enabled by default
Using config: /home/runner/work/zookeeper/zookeeper/apache-zookeeper-3.7.0-bin/bin/../conf/zoo_sample.cfg
Stopping zookeeper ... STOPPED
Traceback (most recent call last):
  File "/home/runner/work/zookeeper/zookeeper/tools/ci/test-connectivity.py", line 48, in <module>
    subprocess.run([f'{client_binpath}', 'sync', '/'], check=True)
  File "/usr/lib/python3.10/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/home/runner/work/zookeeper/zookeeper/bin/zkCli.sh', 'sync', '/']' returned non-zero exit status 1.
Error: Process completed with exit code 1.
{noformat}

I guess it could be caused by asynchronous server start.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4742) Config watch path get truncated abnormally for chroot "/zoo" or alikes
Kezhu Wang created ZOOKEEPER-4742: - Summary: Config watch path get truncated abnormally for chroot "/zoo" or alikes Key: ZOOKEEPER-4742 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4742 Project: ZooKeeper Issue Type: Bug Components: java client Affects Versions: 3.8.2, 3.9.0, 3.7.1 Reporter: Kezhu Wang Assignee: Kezhu Wang

This is a leftover of ZOOKEEPER-4565, split from [pr#1996|https://github.com/apache/zookeeper/pull/1996] to keep ZOOKEEPER-4601 focused.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4704) Flaky test: ReconfigRollingRestartCompatibilityTest.testRollingRestartWithExtendedMembershipConfig
Kezhu Wang created ZOOKEEPER-4704: - Summary: Flaky test: ReconfigRollingRestartCompatibilityTest.testRollingRestartWithExtendedMembershipConfig Key: ZOOKEEPER-4704 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4704 Project: ZooKeeper Issue Type: Bug Components: tests Affects Versions: 3.8.1, 3.7.1 Reporter: Kezhu Wang

https://github.com/apache/zookeeper/actions/runs/5229706434/jobs/9442845028#step:7:619
{quote}
org.opentest4j.AssertionFailedError: waiting for server 2 being up ==> expected: but was:
at org.apache.zookeeper.server.quorum.ReconfigRollingRestartCompatibilityTest.testRollingRestartWithExtendedMembershipConfig(ReconfigRollingRestartCompatibilityTest.java:263)
{quote}
The same commit passes in my fork: https://github.com/kezhuw/zookeeper/actions/runs/5229684461. An independent report appears in [ZOOKEEPER-4628|https://issues.apache.org/jira/browse/ZOOKEEPER-4628?focusedCommentId=17632044&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17632044]; unfortunately, that run's log has been cleared.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4702) State change events during close is indeterminate
Kezhu Wang created ZOOKEEPER-4702: - Summary: State change events during close is indeterminate Key: ZOOKEEPER-4702 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4702 Project: ZooKeeper Issue Type: Improvement Reporter: Kezhu Wang

Placing {{sleep(100)}} before {{disconnect}} in {{ClientCnxn::close}} easily reproduces this: {{KeeperState.Disconnected}} is not always issued.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4698) Persistent watch events lost after reconnection
Kezhu Wang created ZOOKEEPER-4698: - Summary: Persistent watch events lost after reconnection Key: ZOOKEEPER-4698 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4698 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.8.1, 3.7.1 Reporter: Kezhu Wang

I found this in reply to [apache#1950 (comment)|https://github.com/apache/zookeeper/pull/1950#issuecomment-1553742525]. But it turns out to be a known issue: [apache#1106 (comment)|https://github.com/apache/zookeeper/pull/1106#issuecomment-543860329]. I think it is worth noting separately in Jira for potential future discussion and a fix.

I have pushed a [test case|https://github.com/kezhuw/zookeeper/commit/31d89e9829380559066fc2b83e3d38462380c5d4] for this. It fails as expected.
{noformat}
[ERROR] Failures:
[ERROR] WatchEventWhenAutoResetTest.testPersistentRecursiveWatch:237 do not receive a NodeDataChanged ==> expected: not
[ERROR] WatchEventWhenAutoResetTest.testPersistentWatch:211 do not receive a NodeDataChanged ==> expected: not
{noformat}

It is hard to fix this with the {{DataTree}} alone. Two independent comments [pointed|https://github.com/apache/zookeeper/pull/1106#issuecomment-1366449561] [out|https://github.com/kezhuw/zookeeper/commit/31d89e9829380559066fc2b83e3d38462380c5d4#diff-cfd09b7021c88da6631872e8a4a271f830162f7c5a63a140839ba029048493fdR227-R230] this. I guess we have to walk through the txn log to deliver a correct fix.

{quote}
Watches will not be received while disconnected from a server. When a client reconnects, any previously registered watches will be reregistered and triggered if needed. In general this all occurs transparently. There is one case where a watch may be missed: a watch for the existence of a znode not yet created will be missed if the znode is created and deleted while disconnected.
{quote}
This is [what our programmer's guide says|https://zookeeper.apache.org/doc/r3.8.1/zookeeperProgrammers.html#ch_zkWatches].
It is well known, at least to me, that we can lose some transient intermediate events during reconnection. But in the case of persistent watches, we can lose more, which forces clients to rebuild their knowledge on reconnection. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4697) Add Builder to construct ZooKeeper and possible descendants
Kezhu Wang created ZOOKEEPER-4697: - Summary: Add Builder to construct ZooKeeper and possible descendants Key: ZOOKEEPER-4697 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4697 Project: ZooKeeper Issue Type: New Feature Components: java client Reporter: Kezhu Wang

We now have 10 constructor variants for {{ZooKeeper}} and 4 for {{ZooKeeperAdmin}}. That is enough to justify resorting to a builder or something similar.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
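For illustration only, the general shape such a builder could take; the names, setters, and defaults below are hypothetical, not the API proposed in the issue:

```java
// Sketch of the builder pattern applied to client construction: each
// optional argument becomes an explicit, defaulted setter instead of
// another constructor overload. All names here are illustrative.
public class ZooKeeperClientBuilder {
    private String connectString;
    private int sessionTimeoutMs = 30_000;
    private boolean canBeReadOnly = false;

    public ZooKeeperClientBuilder connectString(String cs) {
        this.connectString = cs;
        return this;
    }

    public ZooKeeperClientBuilder sessionTimeoutMs(int ms) {
        this.sessionTimeoutMs = ms;
        return this;
    }

    public ZooKeeperClientBuilder canBeReadOnly(boolean ro) {
        this.canBeReadOnly = ro;
        return this;
    }

    // A real builder would return a client here; this just reports the
    // accumulated configuration.
    public String describe() {
        return connectString + " timeout=" + sessionTimeoutMs + " readOnly=" + canBeReadOnly;
    }
}
```

One builder with n optional settings replaces up to 2^n constructor overloads, which is why 10 + 4 variants is a strong signal to switch.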
[jira] [Created] (ZOOKEEPER-4695) Forbid multiple mutations of one key in multi
Kezhu Wang created ZOOKEEPER-4695: - Summary: Forbid multiple mutations of one key in multi Key: ZOOKEEPER-4695 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4695 Project: ZooKeeper Issue Type: Wish Components: c client, java client, server Reporter: Kezhu Wang

I guess this might be part of ZOOKEEPER-1289, but let me file it as a separate issue for now, until we get a solution for ZOOKEEPER-1289 or other committers close this as "Duplicate".

Currently, when there are multiple mutations for one key in a multi, multiple watch events are delivered for that key. Assume a {{delete}} and a {{create}} for key "/foo" in one {{multi}} operation. The client will receive {{NodeDeleted}} and {{NodeCreated}} events, so between them it exposes a state in which "/foo" does not exist. But in normal reads we should never observe such a state, as {{multi}} should behave atomically.

Forbidding this is absolutely a breaking change, as it introduces a new failure path in the client. I think there are alternatives; one would be merging multiple mutations into one on the server side, maybe solely for watch events. I guess it might be rare for clients to depend on the concrete watch event types of changes, but this approach might be relatively hard.

References:
* Etcd rejects multiple mutations for one key in its txns. See [etcd-io/etcd#4363|https://github.com/etcd-io/etcd/pull/4363] and [etcd-io/etcd#4376|https://github.com/etcd-io/etcd/pull/4376].
* ZOOKEEPER-4655 ([#1950|https://github.com/apache/zookeeper/pull/1950]) proposed {{WatchedEvent.zxid}} to carry the {{zxid}} triggering the delivered event. I think this issue undermines that proposal.
* The discovery process and the reason I opened this issue: [#1950|https://github.com/apache/zookeeper/pull/1950#issuecomment-1544436457]

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4680) OpCode.check is treated as a quorum write operation while it is not
Kezhu Wang created ZOOKEEPER-4680: - Summary: OpCode.check is treated as a quorum write operation while it is not Key: ZOOKEEPER-4680 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4680 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.8.1, 3.7.2 Reporter: Kezhu Wang Assignee: Kezhu Wang -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4670) Export ZooKeeper server from QuorumPeerMain, ZooKeeperServerMain and ZooKeeperServerEmbedded
Kezhu Wang created ZOOKEEPER-4670: - Summary: Export ZooKeeper server from QuorumPeerMain, ZooKeeperServerMain and ZooKeeperServerEmbedded Key: ZOOKEEPER-4670 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4670 Project: ZooKeeper Issue Type: Improvement Reporter: Kezhu Wang

This allows clients to do inspections without reflection. See also CURATOR-535 and https://github.com/apache/curator/pull/421.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4667) DataTree.processTxn ignores `OpCode.create2` in `OpCode.multi`
Kezhu Wang created ZOOKEEPER-4667: - Summary: DataTree.processTxn ignores `OpCode.create2` in `OpCode.multi` Key: ZOOKEEPER-4667 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4667 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.7.1, 3.8.0, 3.9.0 Reporter: Kezhu Wang

The official Java client does not use {{OpCode.create2}}, but I think ZooKeeper intends to support {{OpCode.create2}} nested in {{OpCode.multi}} on the server side, as {{FinalRequestProcessor}} handles it. See also ZOOKEEPER-1297.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4625) No reliable way to remove watch without interfering others on same paths
Kezhu Wang created ZOOKEEPER-4625: - Summary: No reliable way to remove watch without interfering others on same paths Key: ZOOKEEPER-4625 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4625 Project: ZooKeeper Issue Type: Bug Components: java client Affects Versions: 3.8.0 Reporter: Kezhu Wang

It is possible for one node path to be watched more than once by different watchers reusing the same ZooKeeper session. ZOOKEEPER-1910 reported this but resorted to "checkWatches" to circumvent it. I think it might be possible to do some tracking work in the client to support "removeWatches" without breaking other client usages.

Here are some links that led to this issue:
* CURATOR-654: DistributedBarrier watcher leak and its [pr|https://github.com/apache/curator/pull/435]
* [Why removeWatches sends OpCode.checkWatches to the server?|https://lists.apache.org/thread/0kcnklcxs0s5656c1sbh3crgdodbb0qg] from the mailing list.
* [Drop for watcher|https://github.com/kezhuw/zookeeper-client-rust/issues/2] from a Rust implementation.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
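The client-side tracking idea can be sketched as a per-path watcher registry that only asks the server to remove a watch once the last local watcher on that path is gone; all names here are illustrative, not an actual ZooKeeper client design:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of client-side watcher bookkeeping: removing one watcher must not
// interfere with other watchers on the same path in the same session, so the
// server-side removal is deferred until the local list for a path empties.
public class WatcherRegistry {
    private final Map<String, List<Object>> watchers = new HashMap<>();

    void add(String path, Object watcher) {
        watchers.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    /** Returns true only when the server-side watch should really be removed. */
    boolean remove(String path, Object watcher) {
        List<Object> list = watchers.get(path);
        if (list == null || !list.remove(watcher)) {
            return false; // unknown watcher: nothing to do, no server call
        }
        if (list.isEmpty()) {
            watchers.remove(path);
            return true; // last local watcher gone: safe to remove on server
        }
        return false; // others still watch this path: keep the server watch
    }
}
```

A real implementation would also have to track watch modes (data, children, persistent) separately, since the server keeps them in distinct tables.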
[jira] [Created] (ZOOKEEPER-4601) Define getConfig Watcher behavior in chroot ZooKeeper
Kezhu Wang created ZOOKEEPER-4601: - Summary: Define getConfig Watcher behavior in chroot ZooKeeper Key: ZOOKEEPER-4601 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4601 Project: ZooKeeper Issue Type: Improvement Components: java client Reporter: Kezhu Wang

After ZOOKEEPER-4565, the {{getConfig}} watcher will receive the path of the ZooKeeper config node "/zookeeper/config". But the path the {{getConfig}} watcher receives is sensitive to the chroot path:
* With chroot path "/zookeeper", {{getConfig}} will receive path "/config".
* With other chroot paths, {{getConfig}} will receive path "/zookeeper/config".

I think we should formally define what path {{getConfig}} watchers get, to avoid unnoticed behavior changes.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4569) Xid out of order caused by early return from eager acl check
Kezhu Wang created ZOOKEEPER-4569: - Summary: Xid out of order caused by early return from eager acl check Key: ZOOKEEPER-4569 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4569 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.7.1, 3.8.0, 3.6.3 Reporter: Kezhu Wang

{{ClientCnxn}} [enforces|https://github.com/apache/zookeeper/blob/de7c5869d372e46af43979134d0e30b49d2319b1/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L917] ordered replies from the server. This assumes that requests on the server go through the same processing pipeline without early return. ZOOKEEPER-3418 breaks this assumption.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ZOOKEEPER-4548) Negative ZooKeeperServer#getInProcess
Kezhu Wang created ZOOKEEPER-4548: - Summary: Negative ZooKeeperServer#getInProcess Key: ZOOKEEPER-4548 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4548 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Kezhu Wang

{{ZooKeeperServer#submitRequestNow}} passes the request to {{firstProcessor.processRequest}} before calling {{incInProcess}}. This can make {{getInProcess}} negative when the asynchronous {{decInProcess}} runs in the gap between {{firstProcessor.processRequest}} and {{incInProcess}}, e.g. after a long pause under heavy load.

-- This message was sent by Atlassian Jira (v8.20.7#820007)
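A deterministic simplification of the reported ordering: if the decrement that normally happens asynchronously runs before the caller's increment, the gauge dips below zero. The names below only mirror the description; this is not the server's code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrates the inc-after-submit bug: the processor chain finishes (and
// decrements) before the submitter increments, so the in-process gauge is
// transiently negative even though it settles back at zero.
public class InProcessGauge {
    static final AtomicInteger inProcess = new AtomicInteger();
    static int observedMinimum = 0;

    static void process(Runnable request) {
        request.run(); // request completes here...
        // ...and its completion decrements, possibly before the increment ran.
        observedMinimum = Math.min(observedMinimum, inProcess.decrementAndGet());
    }

    static void submitRequestNow(Runnable request) {
        process(request);            // stands in for firstProcessor.processRequest(...)
        inProcess.incrementAndGet(); // stands in for incInProcess(), which runs too late
    }
}
```

Incrementing before handing the request to the first processor would keep the gauge non-negative.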
[jira] [Created] (ZOOKEEPER-4512) Flaky test: QuorumPeerMainTest#testLeaderOutOfView
Kezhu Wang created ZOOKEEPER-4512: - Summary: Flaky test: QuorumPeerMainTest#testLeaderOutOfView Key: ZOOKEEPER-4512 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4512 Project: ZooKeeper Issue Type: Bug Components: tests Affects Versions: 3.8.0, 3.7.1 Reporter: Kezhu Wang

I saw two assertion failures.
{code:none}
org.opentest4j.AssertionFailedError: Corrupt peer should join quorum with servers having same server configuration ==> expected: but was:
at org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderOutOfView(QuorumPeerMainTest.java:904)
{code}
* https://github.com/apache/zookeeper/runs/5770639335?check_suite_focus=true
* https://github.com/apache/zookeeper/runs/5622644945?check_suite_focus=true

{code:none}
org.opentest4j.AssertionFailedError: expected: but was:
at org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderOutOfView(QuorumPeerMainTest.java:881)
{code}
* https://github.com/apache/zookeeper/runs/5759448665?check_suite_focus=true
* https://ci-hadoop.apache.org/blue/organizations/jenkins/zookeeper-precommit-github-pr/detail/PR-1852/1/pipeline/#step-35-log-741

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4511) Flaky test: FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Kezhu Wang created ZOOKEEPER-4511: - Summary: Flaky test: FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics Key: ZOOKEEPER-4511 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4511 Project: ZooKeeper Issue Type: Bug Components: tests Reporter: Kezhu Wang * https://github.com/kezhuw/zookeeper/runs/5830250287?check_suite_focus=true * https://github.com/apache/zookeeper/runs/5759834147?check_suite_focus=true {code:none} org.opentest4j.AssertionFailedError: expected: <1> but was: <0> at org.apache.zookeeper.server.persistence.FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics(FileTxnSnapLogMetricsTest.java:86) {code} This test writes some txns to trigger a snapshot while leaving remaining txns in the txn log. But snapshot taking is asynchronous, so all txns could end up written to the snapshot. On restart, it is then possible that there are no txns to load after the snapshot is restored, which fails the assertion. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4508) ZooKeeper client runs into an endless loop in ClientCnxn.SendThread.run if all servers are down
Kezhu Wang created ZOOKEEPER-4508: - Summary: ZooKeeper client runs into an endless loop in ClientCnxn.SendThread.run if all servers are down Key: ZOOKEEPER-4508 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4508 Project: ZooKeeper Issue Type: Bug Components: java client Affects Versions: 3.8.0, 3.7.0, 3.6.3 Reporter: Kezhu Wang The observable behavior is that the client will not get an expired event from its watcher. The cause is twofold: 1. {{updateLastSendAndHeard}} is called on reconnection, so the remaining session timeout doesn't decrease. 2. There is no break out of {{ClientCnxn.SendThread.run}} after the session times out. -- This message was sent by Atlassian Jira (v8.20.1#820001)
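The first cause can be sketched with a small self-contained model (the {{SessionCountdown}} class and the millisecond values are illustrative): if every failed reconnect refreshes the last-heard timestamp, the remaining session time never shrinks, so the expiry branch is unreachable while all servers are down.

```java
// Hypothetical model of SendThread's timeout bookkeeping. Remaining session
// time is sessionTimeout minus the time since the last heard packet;
// resetting lastHeard on every reconnect attempt keeps it from ever
// reaching zero.
public class SessionCountdown {
    final int sessionTimeout = 100; // ms, illustrative value
    long lastHeard;

    int remaining(long now) {
        return sessionTimeout - (int) (now - lastHeard);
    }

    public static void main(String[] args) {
        SessionCountdown c = new SessionCountdown();
        long now = 0;
        c.lastHeard = now;

        // Buggy loop: each failed reconnect resets lastHeard
        // (updateLastSendAndHeard), so remaining() never drops and the
        // session never expires locally.
        for (int attempt = 0; attempt < 5; attempt++) {
            now += 60;         // time passes, connect attempt fails
            c.lastHeard = now; // reset on reconnection
        }
        System.out.println("buggy remaining = " + c.remaining(now)); // 100

        // Without the reset, remaining() goes negative, at which point the
        // client should surface Expired and break out of the run loop.
        c.lastHeard = 0;
        now = 300;
        System.out.println("fixed remaining = " + c.remaining(now)); // -200
    }
}
```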
[jira] [Created] (ZOOKEEPER-4475) Persistent recursive watcher got NodeChildrenChanged event
Kezhu Wang created ZOOKEEPER-4475: - Summary: Persistent recursive watcher got NodeChildrenChanged event Key: ZOOKEEPER-4475 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4475 Project: ZooKeeper Issue Type: Bug Reporter: Kezhu Wang Currently, {{NodeChildrenChanged}} events are sent to all persistent watchers unconditionally in the client. This requires the server to never deliver {{NodeChildrenChanged}} for a node's descendants, which cannot be guaranteed. So we need to filter {{NodeChildrenChanged}} for persistent recursive watchers. Split from ZOOKEEPER-4466. -- This message was sent by Atlassian Jira (v8.20.1#820001)
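A minimal sketch of the proposed client-side filtering, under the assumption that the client knows each watcher's mode (the {{ChildEventFilter}} class and method names are hypothetical, not the actual {{ZKWatchManager}} code): a {{NodeChildrenChanged}} event reaches persistent watchers registered exactly on the event path, but never persistent recursive watchers, since recursive mode only deals in node events.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of filtering NodeChildrenChanged on the client.
public class ChildEventFilter {
    enum Mode { PERSISTENT, PERSISTENT_RECURSIVE }

    static List<String> materializeChildrenChanged(
            Map<String, Mode> watchers, String eventPath) {
        List<String> toNotify = new ArrayList<>();
        for (Map.Entry<String, Mode> e : watchers.entrySet()) {
            // Only non-recursive persistent watchers on exactly this path
            // should see a child event.
            if (e.getValue() == Mode.PERSISTENT && e.getKey().equals(eventPath)) {
                toNotify.add(e.getKey());
            }
            // PERSISTENT_RECURSIVE watchers are skipped: even if eventPath
            // is a descendant of their path, they must not get child events.
        }
        return toNotify;
    }

    public static void main(String[] args) {
        Map<String, Mode> watchers = new LinkedHashMap<>();
        watchers.put("/a", Mode.PERSISTENT_RECURSIVE);
        watchers.put("/a/b", Mode.PERSISTENT);
        System.out.println(materializeChildrenChanged(watchers, "/a/b")); // [/a/b]
        System.out.println(materializeChildrenChanged(watchers, "/a"));   // []
    }
}
```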
[jira] [Created] (ZOOKEEPER-4474) ZooDefs.opNames is unused
Kezhu Wang created ZOOKEEPER-4474: - Summary: ZooDefs.opNames is unused Key: ZOOKEEPER-4474 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4474 Project: ZooKeeper Issue Type: Improvement Reporter: Kezhu Wang It is public, but what is the supposed use case? Is {{Request.op2String}} sufficient? It is not in sync with {{ZooDefs.OpCode}}, and you can't index it by {{OpCode}}. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4472) Support persistent watchers removing individually
Kezhu Wang created ZOOKEEPER-4472: - Summary: Support persistent watchers removing individually Key: ZOOKEEPER-4472 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4472 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.7.0, 3.6.3 Reporter: Kezhu Wang Persistent watchers can currently only be removed with {{WatcherType.Any}}. I think it is meaningful to remove them individually, as they are by name persistent and will not be auto-removed on the server side. Together with the proposed solution from [ZOOKEEPER-4466], it becomes clear that ZooKeeper has four kinds of watchers: # Standard data watcher (which includes the data and exists watchers on the client side). # Standard child watcher. # Persistent node watcher (i.e. a data and child watcher for a node). # Persistent recursive watcher (i.e. a data watcher for a node and its descendants). See also [ZOOKEEPER-4471] -- This message was sent by Atlassian Jira (v8.20.1#820001)
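The proposed per-kind removal can be sketched with a small self-contained model (the {{WatcherKinds}} class, its registry, and its methods are hypothetical, not the server's watch manager): each path carries a set of registered kinds, and removal targets a single kind instead of only the equivalent of {{WatcherType.Any}}.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of per-kind watcher removal.
public class WatcherKinds {
    enum Kind { STANDARD_DATA, STANDARD_CHILD, PERSISTENT, PERSISTENT_RECURSIVE }

    static final Map<String, EnumSet<Kind>> registry = new HashMap<>();

    static void add(String path, Kind kind) {
        registry.computeIfAbsent(path, p -> EnumSet.noneOf(Kind.class)).add(kind);
    }

    // Proposed behavior: remove exactly one kind, leaving the others intact.
    static void remove(String path, Kind kind) {
        EnumSet<Kind> kinds = registry.get(path);
        if (kinds != null) {
            kinds.remove(kind);
            if (kinds.isEmpty()) {
                registry.remove(path);
            }
        }
    }

    public static void main(String[] args) {
        add("/a", Kind.PERSISTENT);
        add("/a", Kind.PERSISTENT_RECURSIVE);
        remove("/a", Kind.PERSISTENT); // the recursive watch survives
        System.out.println(registry.get("/a")); // [PERSISTENT_RECURSIVE]
    }
}
```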
[jira] [Created] (ZOOKEEPER-4471) Removing WatcherType.Children breaks persistent watchers' child events
Kezhu Wang created ZOOKEEPER-4471: - Summary: Removing WatcherType.Children breaks persistent watchers' child events Key: ZOOKEEPER-4471 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4471 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.7.0, 3.6.3 Reporter: Kezhu Wang {{AddWatchMode.PERSISTENT}} is divided into data and child watches on the server side. When removing {{WatcherType.Children}}, the child part of {{AddWatchMode.PERSISTENT}} is removed but not its data part. This could allow tricky usage of a persistent data watch even though there is no official API for one. It is better to forbid this by dedicating {{WatcherType.Children}} to standard child watches only. I [committed|https://github.com/kezhuw/zookeeper/commit/f7a996646074114830bdc2361e8ff679d08c00bc] a modified {{RemoveWatchesTest.testRemoveAllChildWatchesOnAPath}} in my local repo to reproduce this. I think it is better to support {{removeWatches}} for the two persistent watcher kinds too, but that might be a separate issue. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4467) Missing op code (addWatch) in Request.op2String
Kezhu Wang created ZOOKEEPER-4467: - Summary: Missing op code (addWatch) in Request.op2String Key: ZOOKEEPER-4467 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4467 Project: ZooKeeper Issue Type: Improvement Components: server Reporter: Kezhu Wang {noformat} Processing request:: sessionid:0x100095a44b2 type:unknown 106 cxid:0x7 zxid:0xfffe txntype:unknown reqpath:/abc {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
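A sketch of the fix (the {{OpNames}} class is illustrative; only a subset of opcodes is shown, with the real values {{create}}=1, {{delete}}=2, {{exists}}=3 and {{addWatch}}=106 from {{ZooDefs.OpCode}}): {{Request.op2String}} is a switch over opcode values, so the missing {{addWatch}} case simply needs to be added, and the sparse opcode numbering also shows why an array like {{ZooDefs.opNames}} cannot be indexed by opcode.

```java
// Illustrative op2String with the addWatch case added. Opcode values are
// sparse protocol numbers, so a switch is the safe mapping; an array of
// names cannot be indexed by opcode directly.
public class OpNames {
    static String op2String(int op) {
        switch (op) {
            case 1:   return "create";
            case 2:   return "delete";
            case 3:   return "exists";
            case 106: return "addWatch"; // previously missing, logged as "unknown 106"
            default:  return "unknown " + op;
        }
    }

    public static void main(String[] args) {
        System.out.println(op2String(106)); // addWatch
        System.out.println(op2String(99));  // unknown 99
    }
}
```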
[jira] [Created] (ZOOKEEPER-4466) Watchers of different modes interfere on overlapping paths
Kezhu Wang created ZOOKEEPER-4466: - Summary: Watchers of different modes interfere on overlapping paths Key: ZOOKEEPER-4466 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4466 Project: ZooKeeper Issue Type: Bug Components: java client, server Affects Versions: 3.6.3, 3.7, 3.6.4 Reporter: Kezhu Wang I used to think watchers of different modes were orthogonal. I found they are not, when I wrote tests for an unfinished rust client. I then wrote [test cases|https://github.com/kezhuw/zookeeper/commit/79b05a95d2669a4acd16a4d544f24e2083a264f2#diff-8d31d27ea951fbc1f4fbda48d45748318f7124502839d825b77ad3fb8551bf43L152] in java and confirmed it. I copied the test cases here for evaluation. You can also clone from [my fork|https://github.com/kezhuw/zookeeper/tree/watch-overlapping-path-with-different-modes-test-case].
{code:java}
// zookeeper-server/src/test/java/org/apache/zookeeper/test/PersistentRecursiveWatcherTest.java
@Test
public void testPathOverlapWithStandardWatcher() throws Exception {
    try (ZooKeeper zk = createClient(new CountdownWatcher(), hostPort)) {
        CountDownLatch nodeCreated = new CountDownLatch(1);
        zk.addWatch("/a", persistentWatcher, PERSISTENT_RECURSIVE);
        zk.exists("/a", event -> nodeCreated.countDown());
        zk.create("/a", new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.create("/a/b", new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.delete("/a/b", -1);
        zk.delete("/a", -1);
        assertEvent(events, Watcher.Event.EventType.NodeCreated, "/a");
        assertEvent(events, Watcher.Event.EventType.NodeCreated, "/a/b");
        assertEvent(events, Watcher.Event.EventType.NodeDeleted, "/a/b");
        assertEvent(events, Watcher.Event.EventType.NodeDeleted, "/a");
        assertTrue(nodeCreated.await(5, TimeUnit.SECONDS));
    }
}

@Test
public void testPathOverlapWithPersistentWatcher() throws Exception {
    try (ZooKeeper zk = createClient(new CountdownWatcher(), hostPort)) {
        zk.addWatch("/a", persistentWatcher, PERSISTENT_RECURSIVE);
        zk.addWatch("/a/b", event -> {}, PERSISTENT);
        zk.create("/a", new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.create("/a/b", new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.create("/a/b/c", new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.delete("/a/b/c", -1);
        zk.delete("/a/b", -1);
        zk.delete("/a", -1);
        assertEvent(events, Watcher.Event.EventType.NodeCreated, "/a");
        assertEvent(events, Watcher.Event.EventType.NodeCreated, "/a/b");
        assertEvent(events, Watcher.Event.EventType.NodeCreated, "/a/b/c");
        assertEvent(events, Watcher.Event.EventType.NodeDeleted, "/a/b/c");
        assertEvent(events, Watcher.Event.EventType.NodeDeleted, "/a/b");
        assertEvent(events, Watcher.Event.EventType.NodeDeleted, "/a");
    }
}
{code}
I skimmed the code and found two possible causes: # {{ZKWatchManager.materialize}} materializes all persistent watchers (including recursive ones) for {{NodeChildrenChanged}} events. # {{WatcherModeManager}} tracks only one watcher mode. -- This message was sent by Atlassian Jira (v8.20.1#820001)
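The second cause can be sketched with a small self-contained model (the {{WatcherModes}} class and its shape are hypothetical simplifications, not the real {{WatcherModeManager}}): if only the most recent mode is kept per path, a later registration silently drops the earlier one, whereas tracking the set of modes preserves both.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of mode tracking: one-mode-per-path loses information
// when a watcher is registered on the same path in two modes.
public class WatcherModes {
    enum Mode { PERSISTENT, PERSISTENT_RECURSIVE }

    // Buggy shape: one mode per path; later registrations overwrite.
    final Map<String, Mode> singleMode = new HashMap<>();
    // Fixed shape: the full set of registered modes per path.
    final Map<String, EnumSet<Mode>> allModes = new HashMap<>();

    void register(String path, Mode mode) {
        singleMode.put(path, mode);
        allModes.computeIfAbsent(path, p -> EnumSet.noneOf(Mode.class)).add(mode);
    }

    public static void main(String[] args) {
        WatcherModes m = new WatcherModes();
        m.register("/a", Mode.PERSISTENT_RECURSIVE);
        m.register("/a", Mode.PERSISTENT);
        System.out.println(m.singleMode.get("/a")); // only the last mode survives
        System.out.println(m.allModes.get("/a"));   // both modes retained
    }
}
```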