[GitHub] zookeeper issue #563: ZOOKEEPER-3072: Throttle race condition fix
Github user bothejjms commented on the issue: https://github.com/apache/zookeeper/pull/563 I have refactored the branches as suggested. ---
[GitHub] zookeeper pull request #563: ZOOKEEPER-3072: Throttle race condition fix
Github user bothejjms commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/563#discussion_r205778339 --- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java --- @@ -1128,9 +1128,9 @@ public void processPacket(ServerCnxn cnxn, ByteBuffer incomingBuffer) throws IOE Record rsp = processSasl(incomingBuffer,cnxn); ReplyHeader rh = new ReplyHeader(h.getXid(), 0, KeeperException.Code.OK.intValue()); cnxn.sendResponse(rh,rsp, "response"); // not sure about 3rd arg..what is it? -return; --- End diff -- I have refactored like that. Returns are actually unnecessary but I have consistently added them now. ---
[GitHub] zookeeper issue #563: ZOOKEEPER-3072: Throttle race condition fix
Github user bothejjms commented on the issue: https://github.com/apache/zookeeper/pull/563 I have removed the test for now as I don't have a good way to test this race condition. I can be reproduced easily by starting a server where the globalOutstandingLimit is 1 and sending a lot exists requests. There is a good chance that one session will stuck in a throttled state despite it has no active requests. ---
[GitHub] zookeeper issue #563: Fix for ZOOKEEPER-3072
Github user bothejjms commented on the issue: https://github.com/apache/zookeeper/pull/563 I have tweaked the test to use significantly less threads and be faster. Unfortunately it still fails on jenkins. :( I am not sure how ThreadPoolExecutor would help with this. It will spin up the same amount of threads in background, isn't it? ---
[GitHub] zookeeper issue #563: Fix for ZOOKEEPER-3072
Github user bothejjms commented on the issue: https://github.com/apache/zookeeper/pull/563 On "pretty reliably" I mean the test has failed for me like 90% of the time with the original code but the result can differ on different machines since it is a race condition. Reproducing race condition in a test is not simple. I am open to suggestions how to do it reliably. Do you recall any other tests for race conditions in the test suite? After the fix the test passed on my machine always. I am not sure yet why it fails on jenkins. For me the test takes 40 sec on my VM which is not particularly strong. I am also not satisfied with this test. I just wanted to prove that the race condition is there. Instead of the test I could add a description on how to reproduce and skip permanent testing for it. ---
[GitHub] zookeeper pull request #563: Fix for ZOOKEEPER-3072
Github user bothejjms commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/563#discussion_r200961680 --- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java --- @@ -1124,6 +1124,7 @@ public void processPacket(ServerCnxn cnxn, ByteBuffer incomingBuffer) throws IOE } return; } else { +cnxn.incrOutstandingRequests(h); --- End diff -- hmm right. That return was not there in 3.5.3 where I have spotted the issue. I have missed it when I have moved my change to master. I see ZOOKEEPER-2785 introduced it. I will update my pr and move incr to the else branch to avoid sasl throttling. ---
[GitHub] zookeeper pull request #563: Fix for ZOOKEEPER-3072
GitHub user bothejjms opened a pull request: https://github.com/apache/zookeeper/pull/563 Fix for ZOOKEEPER-3072 Making the throttle check before passing over the request to the next thread will prevent the possibility of throttling code running after unthrottle Added an additional async hammer thread which is pretty reliably reproduces the race condition. The globalOutstandingLimit is decreased so throttling code is executed. You can merge this pull request into a Git repository by running: $ git pull https://github.com/bothejjms/zookeeper ZOOKEEPER-3072 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/563.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #563 commit 0f19de8f8b95992d4e95e3ffa46208c1b12b8cb0 Author: Botond Hejj Date: 2018-07-06T14:43:18Z Fix for ZOOKEEPER-3072 Making the throttle check before passing over the request to the next thread will prevent the possibility of throttling code running after unthrottle ---