[jira] [Comment Edited] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474893#comment-13474893 ]

Flavio Junqueira edited comment on ZOOKEEPER-1549 at 10/12/12 8:44 AM:
---
I don't think major changes are needed, at least for the leader case. We simply shouldn't be taking snapshots over uncommitted state. Check ZOOKEEPER-1558 and ZOOKEEPER-1559, subtasks of this jira.

was (Author: fpj):
I don't think major changes are needed, at least for the leader case. We simply shouldn't be taking snapshots over uncommitted state. Check ZOOKEEPER-1558 and ZOOKEEPER-1559, a subtask of this jira.

Data inconsistency when follower is receiving a DIFF with a dirty snapshot
--
Key: ZOOKEEPER-1549
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
Project: ZooKeeper
Issue Type: Bug
Components: quorum
Affects Versions: 3.4.3
Reporter: Jacky007
Priority: Critical
Attachments: case.patch

The trunc code (from ZOOKEEPER-1154?) cannot work correctly if the snapshot is not correct. Here is the scenario (similar to ZOOKEEPER-1154):

Initial Condition
1. Let's say there are three nodes in the ensemble, A, B, C, with A being the leader.
2. The current epoch is 7.
3. For simplicity of the example, let's say a zxid is a two-digit number, with the epoch being the first digit.
4. The zxid is 73.
5. All the nodes have seen the change 73 and have persistently logged it.

Step 1
A request with zxid 74 is issued. The leader A writes it to its log, but the entire ensemble crashes and B and C never write the change 74 to their logs.

Step 2
A and B restart. A is elected as the new leader; it loads its data and takes a clean snapshot (change 74 is in it), then sends a DIFF to B, but B dies before syncing with A. A dies later.

Step 3
B and C restart while A is still down. B and C form the quorum and B is the new leader. Let's say B's minCommitLog is 71 and maxCommitLog is 73. The epoch is now 8 and the zxid is 80. A request with zxid 81 succeeds.
On B, minCommitLog is now 71 and maxCommitLog is 81.

Step 4
A starts up. It applies the change from the request with zxid 74 to its in-memory data tree. A contacts B to registerAsFollower and provides 74 as its zxid. Since 71 <= 74 <= 81, B decides to send A a DIFF.

Problem: after truncating its log, A will load the snapshot again, and that snapshot is not correct (it contains the uncommitted change 74). In the 3.3 branch, FileTxnSnapLog.restore does not call the listener (ZOOKEEPER-874), so the leader sends a full snapshot to the follower and the problem does not arise there.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
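The window check B applies in Step 4 can be sketched roughly as follows. This is a simplified illustration under assumed names, not the actual LearnerHandler/ZKDatabase logic, which is more involved:

```java
// A rough sketch of the leader's sync-mode decision for a joining follower,
// under assumed names. It only illustrates the committed-log window check
// from the scenario above.
public class SyncDecision {
    static String choose(long peerLastZxid, long minCommitLog, long maxCommitLog) {
        if (peerLastZxid > maxCommitLog) {
            return "TRUNC";   // follower is ahead of the quorum; roll its log back
        } else if (peerLastZxid >= minCommitLog) {
            return "DIFF";    // replay committed proposals after peerLastZxid
        } else {
            return "SNAP";    // too far behind; ship a full snapshot
        }
    }
}
```

In Step 4, A reports 74 and B's window is [71, 81], so a DIFF is chosen; but A's snapshot already contains the uncommitted change 74, and truncating A's log alone cannot remove it.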
[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474893#comment-13474893 ]

Flavio Junqueira commented on ZOOKEEPER-1549:
-
I don't think major changes are needed, at least for the leader case. We simply shouldn't be taking snapshots over uncommitted state. Check ZOOKEEPER-1558 and ZOOKEEPER-1559, a subtask of this jira.
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474981#comment-13474981 ]

Jacky007 commented on ZOOKEEPER-1560:
-
I think this would work for both 1560 and 1561.
{noformat}
if (p != null) {
    updateLastSend();
    if ((p.requestHeader != null)
            && (p.requestHeader.getType() != OpCode.ping)
            && (p.requestHeader.getType() != OpCode.auth)) {
        p.requestHeader.setXid(cnxn.getXid());
    }
    p.createBB();
    ByteBuffer pbb = p.bb;
    ---
    while (pbb.hasRemaining())
        sock.write(pbb);
    ---
    outgoingQueue.removeFirstOccurrence(p);
    sentCount++;
    if (p.requestHeader != null
            && p.requestHeader.getType() != OpCode.ping
            && p.requestHeader.getType() != OpCode.auth) {
        pending.add(p);
    }
}
{noformat}

Zookeeper client hangs on creation of large nodes
-
Key: ZOOKEEPER-1560
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
Project: ZooKeeper
Issue Type: Bug
Components: java client
Affects Versions: 3.4.4, 3.5.0
Reporter: Igor Motov
Assignee: Ted Yu
Fix For: 3.5.0, 3.4.5
Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, zookeeper-1560-v2.txt, zookeeper-1560-v3.txt

To reproduce, try creating a node with 0.5 MB of data using the Java client. The test will hang waiting for a response from the server. See the attached patch for the test that reproduces the issue.

It seems that ZOOKEEPER-1437 introduced a few issues to {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from sending large packets that require several invocations of {{SocketChannel.write}} to complete. The first issue is that the call to {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue even if the packet wasn't completely sent yet; it looks to me like this call should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that {{p.createBB()}} reinitializes the {{ByteBuffer}} on every iteration, which confuses {{SocketChannel.write}}.
And the third issue is caused by extra calls to {{cnxn.getXid()}} that increment the xid on every iteration and confuse the server.
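The send path implied by these three fixes can be sketched as follows. This is an illustration only: the real fix belongs in ClientCnxnSocketNIO.doIO, and the Channel/Packet types here are simplified stand-ins, not ZooKeeper's actual classes:

```java
import java.nio.ByteBuffer;
import java.util.Deque;

// Sketch of the corrected send path implied by the three issues above:
//  1. serialize the packet (and assign its xid) exactly once;
//  2. accept whatever the channel writes in this iteration;
//  3. dequeue the packet only after its buffer is fully drained.
public class SendSketch {
    interface Channel { int write(ByteBuffer b); }   // stand-in for SocketChannel

    static class Packet {
        final byte[] payload;
        ByteBuffer bb;                               // null until serialized
        Packet(byte[] payload) { this.payload = payload; }
        void createBB() { bb = ByteBuffer.wrap(payload); }  // xid would be set here, once
    }

    // One doIO-style iteration; returns true when the head packet finished.
    static boolean doWrite(Deque<Packet> outgoingQueue, Channel sock) {
        Packet p = outgoingQueue.peekFirst();
        if (p == null) return false;
        if (p.bb == null) p.createBB();              // fixes issues 2 and 3: serialize once
        sock.write(p.bb);                            // partial writes are fine
        if (!p.bb.hasRemaining()) {                  // fixes issue 1: remove only when drained
            outgoingQueue.removeFirstOccurrence(p);
            return true;
        }
        return false;
    }
}
```

Because the ByteBuffer keeps its position between iterations, a large packet simply stays at the head of the queue until repeated calls drain it.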
Failed: ZOOKEEPER-1560 PreCommit Build #1215
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1560 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 172814 lines...] [exec] [exec] [exec] [exec] -1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12548889/zookeeper-1560-v4.txt [exec] against trunk revision 1391526. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] -1 core tests. The patch failed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] b22x2sL361 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568: exec returned: 1 Total time: 24 minutes 12 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Description set: ZOOKEEPER-1560 Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## 2 tests failed. FAILED: org.apache.zookeeper.test.ChrootClientTest.testLargeNodeData Error Message: KeeperErrorCode = ConnectionLoss for /large Stack Trace: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52) FAILED: org.apache.zookeeper.test.ClientTest.testLargeNodeData Error Message: KeeperErrorCode = ConnectionLoss for /large Stack Trace: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475014#comment-13475014 ]

Hadoop QA commented on ZOOKEEPER-1560:
--
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548889/zookeeper-1560-v4.txt against trunk revision 1391526.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//console

This message is automatically generated.
Jenkins build is back to stable : bookkeeper-trunk » bookkeeper-server #750
See https://builds.apache.org/job/bookkeeper-trunk/org.apache.bookkeeper$bookkeeper-server/750/
Jenkins build is still unstable: bookkeeper-trunk » hedwig-server #750
See https://builds.apache.org/job/bookkeeper-trunk/org.apache.bookkeeper$hedwig-server/750/
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated ZOOKEEPER-1560:
--
Attachment: zookeeper-1560-v5.txt

From https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/org.apache.zookeeper.test/ClientTest/testLargeNodeData/ :
{code}
2012-10-12 14:10:50,042 [myid:] - WARN [main-SendThread(localhost:11221):ClientCnxn$SendThread@1089] - Session 0x13a555031cf for server localhost/127.0.0.1:11221, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Couldn't write 2000 bytes, 1152 bytes written
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:142)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2012-10-12 14:10:50,044 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x13a555031cf, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:662)
{code}
Patch v5 adds more information to the exception message.
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated ZOOKEEPER-1560:
--
Attachment: zookeeper-1560-v6.txt

Patch v6 changes the condition for raising the IOException: it is raised only if there is no progress between successive sock.write() calls. I guess the socket's output buffer might be a limiting factor in the number of bytes written by a particular sock.write() call.
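The v6 condition can be sketched as follows. This is illustrative only (the actual patch changes ClientCnxnSocketNIO.doIO), and the 'sock' function here is a stand-in for SocketChannel.write:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.function.ToIntFunction;

// Sketch of the v6 idea: tolerate partial writes, but give up only when a
// write call makes no progress at all while bytes remain to be sent.
public class ProgressCheck {
    // 'sock' stands in for SocketChannel::write; returns bytes written.
    static int writeOnce(ByteBuffer bb, ToIntFunction<ByteBuffer> sock) {
        int before = bb.remaining();
        sock.applyAsInt(bb);
        int written = before - bb.remaining();
        if (written == 0 && bb.hasRemaining()) {
            throw new UncheckedIOException(new IOException(
                    "Couldn't write " + bb.remaining() + " bytes, no progress made"));
        }
        return written;
    }
}
```

Note that on a non-blocking socket a zero-byte write can simply mean the kernel send buffer is momentarily full, so treating a single no-progress call as fatal is itself a debatable choice.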
Failed: ZOOKEEPER-1560 PreCommit Build #1216
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1560 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 169973 lines...] [exec] [exec] [exec] [exec] -1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12548893/zookeeper-1560-v5.txt [exec] against trunk revision 1391526. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] -1 core tests. The patch failed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] vuJG8poe1s logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568: exec returned: 1 Total time: 23 minutes 57 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Description set: ZOOKEEPER-1560 Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## 2 tests failed. FAILED: org.apache.zookeeper.test.ChrootClientTest.testLargeNodeData Error Message: KeeperErrorCode = ConnectionLoss for /large Stack Trace: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52) FAILED: org.apache.zookeeper.test.ClientTest.testLargeNodeData Error Message: KeeperErrorCode = ConnectionLoss for /large Stack Trace: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475062#comment-13475062 ]

Hadoop QA commented on ZOOKEEPER-1560:
--
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548893/zookeeper-1560-v5.txt against trunk revision 1391526.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//console

This message is automatically generated.
[jira] [Commented] (BOOKKEEPER-422) Simplify AbstractSubscriptionManager
[ https://issues.apache.org/jira/browse/BOOKKEEPER-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475068#comment-13475068 ]

Flavio Junqueira commented on BOOKKEEPER-422:
-
It sounds like a good idea to me to use a SortedMap, Sijie. Do you see a problem with doing it, Stu? It also sounds like a good idea to validate the subscriber id as you point out, Sijie; it should be a separate jira, as you suggest.

Simplify AbstractSubscriptionManager

Key: BOOKKEEPER-422
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-422
Project: Bookkeeper
Issue Type: Improvement
Components: hedwig-server
Reporter: Stu Hood
Assignee: Stu Hood
Priority: Minor
Attachments: bk-422.diff, bk-422.diff, bk-422.diff

It's difficult to maintain a duplicated/cached count of local subscribers, and we've experienced a few issues due to it getting out of sync with the actual set of subscribers. Since the count of local subscribers can be calculated from the top2sub2seq map, let's do that instead.
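The approach from the issue description can be sketched as follows. The map shape here is hypothetical: Hedwig's actual top2sub2seq maps topics to richer per-subscriber state, which this simplifies to sets of subscriber ids:

```java
import java.util.Map;
import java.util.Set;

// Sketch: derive the local-subscriber count from the subscription map itself
// rather than caching it in a separate counter that can drift out of sync.
public class SubscriberCount {
    static int countLocalSubscribers(Map<String, Set<String>> top2sub) {
        return top2sub.values().stream().mapToInt(Set::size).sum();
    }
}
```

Switching the backing map to a SortedMap, as suggested in the comment, leaves this derivation unchanged, since only values() is consulted.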
[jira] [Created] (BOOKKEEPER-430) Remove manual bookie registration from overview
Flavio Junqueira created BOOKKEEPER-430:
---
Summary: Remove manual bookie registration from overview
Key: BOOKKEEPER-430
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-430
Project: Bookkeeper
Issue Type: Improvement
Affects Versions: 4.1.0
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira

The documentation suggests that a user needs to manually register a bookie, which is not right.
Failed: ZOOKEEPER-1560 PreCommit Build #1217
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1560 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 169234 lines...] [exec] [exec] [exec] [exec] -1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12548898/zookeeper-1560-v6.txt [exec] against trunk revision 1391526. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] -1 core tests. The patch failed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] l38K6LEVny logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568: exec returned: 1 Total time: 24 minutes 48 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Description set: ZOOKEEPER-1560 Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## 3 tests failed. REGRESSION: org.apache.zookeeper.test.LETest.testLE Error Message: Thread 3 got 27 expected 28 Stack Trace: junit.framework.AssertionFailedError: Thread 3 got 27 expected 28 at org.apache.zookeeper.test.LETest.testLE(LETest.java:135) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52) FAILED: org.apache.zookeeper.test.ChrootClientTest.testLargeNodeData Error Message: KeeperErrorCode = ConnectionLoss for /large Stack Trace: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52) FAILED: org.apache.zookeeper.test.ClientTest.testLargeNodeData Error Message: KeeperErrorCode = ConnectionLoss for /large Stack Trace: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61) at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475079#comment-13475079 ] Hadoop QA commented on ZOOKEEPER-1560: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548898/zookeeper-1560-v6.txt against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//console This message is automatically generated. Zookeeper client hangs on creation of large nodes - Key: ZOOKEEPER-1560 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560 Project: ZooKeeper Issue Type: Bug Components: java client Affects Versions: 3.4.4, 3.5.0 Reporter: Igor Motov Assignee: Ted Yu Fix For: 3.5.0, 3.4.5 Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, zookeeper-1560-v5.txt, zookeeper-1560-v6.txt To reproduce, try creating a node with 0.5M of data using java client. The test will hang waiting for a response from the server. See the attached patch for the test that reproduces the issue. 
It seems that ZOOKEEPER-1437 introduced a few issues into {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from sending large packets that require several invocations of {{SocketChannel.write}} to complete. The first issue is that the call to {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue even if the packet wasn't completely sent yet. It looks to me that this call should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that {{p.createBB()}} reinitializes the {{ByteBuffer}} on every iteration, which confuses {{SocketChannel.write}}. The third issue is caused by extra calls to {{cnxn.getXid()}}, which increment the xid on every iteration and confuse the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
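To make the first two fixes concrete, here is a minimal, hypothetical model of the corrected write path (the real code lives in {{ClientCnxnSocketNIO.doIO}}; the {{Packet}}, queue, and channel below are simplified stand-ins, not ZooKeeper's actual classes): the {{ByteBuffer}} is built at most once per packet, and the packet leaves the outgoing queue only after {{hasRemaining()}} turns false.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified, hypothetical model of the corrected client write loop: the
// packet's ByteBuffer is created exactly once, and the packet is dequeued
// only when no bytes remain to be written.
public class PartialWriteSketch {
    public static class Packet {
        private ByteBuffer bb;
        private final byte[] payload;
        public Packet(byte[] payload) { this.payload = payload; }
        ByteBuffer createBB() {
            if (bb == null) {            // guard: never rebuild mid-send
                bb = ByteBuffer.wrap(payload);
            }
            return bb;
        }
    }

    // Stand-in for SocketChannel.write: moves at most 'chunk' bytes per call.
    static int write(ByteBuffer src, int chunk) {
        int n = Math.min(chunk, src.remaining());
        src.position(src.position() + n);
        return n;
    }

    /** Drains the queue; returns how many write() calls were needed. */
    public static int sendAll(Deque<Packet> outgoingQueue, int chunk) {
        int calls = 0;
        while (!outgoingQueue.isEmpty()) {
            Packet p = outgoingQueue.peekFirst();
            ByteBuffer pbb = p.createBB();
            write(pbb, chunk);
            calls++;
            if (!pbb.hasRemaining()) {   // only a fully sent packet is removed
                outgoingQueue.removeFirst();
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        Deque<Packet> q = new ArrayDeque<>();
        q.add(new Packet(new byte[10]));
        System.out.println(sendAll(q, 4)); // 10 bytes in 4-byte chunks -> 3 calls
    }
}
```

With the buggy ordering (packet removed before the write completes, buffer rebuilt each pass), a payload larger than one write's worth would either be dropped mid-send or resent from position zero forever, which matches the observed hang.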
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475085#comment-13475085 ] Eugene Koontz commented on ZOOKEEPER-1560: -- It seems like in a particular iteration, 0 bytes are written: {code}
localhost/127.0.0.1:11222, unexpected error, closing socket connection and attempting reconnect
[exec] [junit] java.io.IOException: Couldn't write 2000 bytes, 0 bytes written in this iteration and 77152 bytes written in total. Original limit: 500074
[exec] [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:145)
[exec] [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:375)
[exec] [junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
[exec] [junit] 2012-10-12 15:20:42,629 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11222:NIOServerCnxn@349] - caught end of stream exception
[exec] [junit] EndOfStreamException: Unable to read additional data from client sessionid 0x13a55902b650001, likely client has closed socket
[exec] [junit] at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
[exec] [junit] at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
[exec] [junit] at java.lang.Thread.run(Thread.java:662)
[exec] [junit] 2012-10-12 15:20:42,630 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11222:NIOServerCnxn@1001] - Closed socket connection for client /127.0.0.1:57126 which had sessionid 0x13a55902b650001
{code} There seems to be a strange resemblance among all the test failures thus far: they always fail after 77152 bytes are written. 
[jira] [Created] (BOOKKEEPER-431) Duplicate definition of COOKIES_NODE
Flavio Junqueira created BOOKKEEPER-431: --- Summary: Duplicate definition of COOKIES_NODE Key: BOOKKEEPER-431 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-431 Project: Bookkeeper Issue Type: Improvement Affects Versions: 4.1.0 Reporter: Flavio Junqueira Priority: Minor Fix For: 4.2.0 Are two definitions of COOKIES_NODE necessary, one in Cookie.java and one in AbstractZkLedgerManager?
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated ZOOKEEPER-1560: -- Attachment: zookeeper-1560-v7.txt Patch v7 changes the IOE to a warning. Let's see if the test is able to make further progress. I wonder whether 77152 bytes would be big enough for most use cases.
[jira] [Commented] (BOOKKEEPER-431) Duplicate definition of COOKIES_NODE
[ https://issues.apache.org/jira/browse/BOOKKEEPER-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475099#comment-13475099 ] Flavio Junqueira commented on BOOKKEEPER-431: - Actually, Cookie.java defines COOKIE_NODE while AbstractZkLedgerManager defines COOKIES_NODE. I also noticed that AVAILABLE_NODE is duplicated. Is it for readability reasons? Shouldn't we have that in a single place?
[jira] [Assigned] (BOOKKEEPER-431) Duplicate definition of COOKIES_NODE
[ https://issues.apache.org/jira/browse/BOOKKEEPER-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G reassigned BOOKKEEPER-431: -- Assignee: Uma Maheswara Rao G
[jira] [Commented] (BOOKKEEPER-431) Duplicate definition of COOKIES_NODE
[ https://issues.apache.org/jira/browse/BOOKKEEPER-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475101#comment-13475101 ] Uma Maheswara Rao G commented on BOOKKEEPER-431: How about having a constants file and maintaining all such constants in one place? If we maintain the constants inside specific files, it is very easy to duplicate them.
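A hypothetical sketch of that suggestion: gather the znode-name constants that are currently duplicated (COOKIE_NODE in Cookie.java, COOKIES_NODE in AbstractZkLedgerManager, plus AVAILABLE_NODE) into one non-instantiable holder. The class name and string values below are illustrative, not the actual BookKeeper paths.

```java
// Hypothetical constants holder consolidating znode names that are today
// defined in several classes. Values are illustrative placeholders.
public final class BookKeeperConstantsSketch {
    private BookKeeperConstantsSketch() {}  // constants only, no instances

    public static final String COOKIE_NODE = "cookies";
    public static final String AVAILABLE_NODE = "available";
}
```

Every class that needs one of these paths would then reference the single definition, so a rename happens in exactly one place.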
Success: ZOOKEEPER-1560 PreCommit Build #1218
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1560 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 262995 lines...] [exec] BUILD SUCCESSFUL [exec] Total time: 0 seconds [exec] [exec] [exec] [exec] [exec] +1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12548908/zookeeper-1560-v7.txt [exec] against trunk revision 1391526. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] b2727V26Mo logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD SUCCESSFUL Total time: 27 minutes 20 seconds Archiving artifacts Recording test results Description set: ZOOKEEPER-1560 Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475134#comment-13475134 ] Hadoop QA commented on ZOOKEEPER-1560: --

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548908/zookeeper-1560-v7.txt against trunk revision 1391526.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//console

This message is automatically generated.
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475146#comment-13475146 ] Ted Yu commented on ZOOKEEPER-1560: --- Good news was that patch v7 passed. Not so good news was that I didn't find any occurrence of the warning message I added in v7. Essentially patch v7 is the same as patch v2 - we shouldn't bail if a single sock.write() call didn't make progress.
[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Shrauner updated ZOOKEEPER-1504: Attachment: ZOOKEEPER-1504.patch Rebase Multi-thread NIOServerCnxn -- Key: ZOOKEEPER-1504 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.4.3, 3.4.4, 3.5.0 Reporter: Jay Shrauner Assignee: Jay Shrauner Labels: performance, scaling Fix For: 3.5.0 Attachments: ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch NIOServerCnxnFactory is single threaded, which doesn't scale well to large numbers of clients. This is particularly noticeable when thousands of clients connect. I propose multi-threading this code as follows: - 1 acceptor thread, for accepting new connections - 1-N selector threads - 0-M I/O worker threads Numbers of threads are configurable, with defaults scaling according to number of cores. Communication with the selector threads is handled via LinkedBlockingQueues, and connections are permanently assigned to a particular selector thread so that all potentially blocking SelectionKey operations can be performed solely by the selector thread. An ExecutorService is used for the worker threads. On a 32 core machine running Linux 2.6.38, achieved best performance with 4 selector threads and 64 worker threads for a 70% +/- 5% improvement in throughput. 
This patch incorporates and supersedes the patches for https://issues.apache.org/jira/browse/ZOOKEEPER-517 https://issues.apache.org/jira/browse/ZOOKEEPER-1444 New classes introduced in this patch are: - ExpiryQueue (from ZOOKEEPER-1444): factors out the session-expiry logic from SessionTrackerImpl so that the same logic can be used to expire connections - RateLogger (from ZOOKEEPER-517): rate-limits error message logging, currently only used to throttle the rate of logging out-of-file-descriptors errors - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that makes worker threads daemon threads and names them in an easily debuggable manner. Supports assignable threads (as used by CommitProcessor) and non-assignable threads (as used here).
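As a rough illustration of the acceptor-to-selector hand-off described above, here is a hypothetical sketch (class and method names are illustrative, not the ones in the patch). Each selector thread owns a LinkedBlockingQueue of newly accepted connections; the acceptor assigns every connection to one selector permanently (round-robin here), so all potentially blocking SelectionKey operations stay on that one selector thread. The connection type is left generic, standing in for the accepted SocketChannel.

```java
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of the 1-acceptor / N-selector layout: permanent
// connection-to-selector assignment via per-selector queues.
public class SelectorAssignmentSketch<C> {
    public static final class SelectorQueue<C> {
        public final LinkedBlockingQueue<C> accepted = new LinkedBlockingQueue<>();
    }

    private final List<SelectorQueue<C>> selectors;
    private int next;  // round-robin cursor, touched only by the acceptor

    public SelectorAssignmentSketch(List<SelectorQueue<C>> selectors) {
        this.selectors = selectors;
    }

    /** Permanently assigns the connection; returns the selector index chosen. */
    public int assign(C connection) {
        int idx = next;
        next = (next + 1) % selectors.size();
        selectors.get(idx).accepted.add(connection);
        return idx;
    }
}
```

Because a connection never migrates between selectors, no locking is needed around its SelectionKey beyond the hand-off queue itself.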
Re: [VOTE] Release ZooKeeper 3.4.5 (candidate 0)
Patch v7 for ZOOKEEPER-1560 passes test suite. Please take a look. On Thu, Oct 11, 2012 at 2:45 PM, Mahadev Konar maha...@hortonworks.comwrote: Thanks Alex for bringing it up. Ill hold the release for now. I see a patch on 1560. Ill take a look and we'll see how to roll this into 3.4.5. thanks mahadev On Thu, Oct 11, 2012 at 2:42 PM, Alexander Shraer shra...@gmail.com wrote: Hi Mahadev, ZOOKEEPER-1560 and ZOOKEEPER-1561 indicate a potentially serious issue, introduced recently in ZOOKEEPER-1437. Please consider this w.r.t. the 3.4.5 release. Best Regards, Alex On Wed, Oct 10, 2012 at 10:38 PM, Mahadev Konar maha...@hortonworks.com wrote: I think we have waited enough. Closing the vote now. With 5 +1's (3 binding) the vote passes. I will do the needful for getting the release out. Thanks for voting folks. mahadev On Wed, Oct 10, 2012 at 9:04 AM, Flavio Junqueira f...@yahoo-inc.com wrote: +1 -Flavio On Oct 8, 2012, at 7:05 AM, Mahadev Konar wrote: Given Eugene's findings on ZOOKEEPER-1557, I think we can continue rolling the current RC out. Others please vote on the thread if you see any issues with that. Folks who have already voted, please re vote in case you have a change of opinion. As for myself, I ran a couple of tests with the RC using open jdk 7 and things seem to work. +1 from my side. Pat/Ben/Flavio/others what do you guys think? thanks mahadev On Sun, Oct 7, 2012 at 8:34 AM, Ted Yu yuzhih...@gmail.com wrote: Currently ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7 are using lock ZooKeeper-solaris. I think ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7 should use a separate lock since they wouldn't run on a Solaris machine. I didn't seem to find how a new lock name can be added. Recent builds for ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7 have been green. Cheers On Sun, Oct 7, 2012 at 6:56 AM, Patrick Hunt ph...@apache.org wrote: I've seen that before, it's a flakey test that's unrelated to the sasl stuff. 
Patrick On Sat, Oct 6, 2012 at 2:25 PM, Ted Yu yuzhih...@gmail.com wrote: I saw one test failure: https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_openjdk7/9/testReport/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testHighestZxidJoinLate/ FYI On Sat, Oct 6, 2012 at 7:16 AM, Ted Yu yuzhih...@gmail.com wrote: Up in ZOOKEEPER-1557, Eugene separated one test out and test failure seems to be gone. For ZooKeeper_branch34_jdk7, the two failed builds: #10 corresponded to ZooKeeper_branch34_openjdk7 build #7, #8 corresponded to ZooKeeper_branch34_openjdk7 build #5 where tests failed due to BindException Cheers On Sat, Oct 6, 2012 at 7:06 AM, Patrick Hunt ph...@apache.org wrote: Yes. Those ubuntu machines have two slots each. If both tests run at the same time... bam. I just added exclusion locks to the configuration of these two jobs, that should help. Patrick On Fri, Oct 5, 2012 at 8:58 PM, Ted Yu yuzhih...@gmail.com wrote: I think that was due to the following running on the same machine at the same time: Building remotely on ubuntu4 https://builds.apache.org/computer/ubuntu4 in workspace /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7 We should introduce randomized port so that test suite can execute in parallel. 
Cheers On Fri, Oct 5, 2012 at 8:55 PM, Ted Yu yuzhih...@gmail.com wrote: Some tests failed in build 8 due to (See https://builds.apache.org//view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_jdk7/8/testReport/org.apache.zookeeper.server/ZxidRolloverTest/testRolloverThenRestart/ ): java.lang.RuntimeException: java.net.BindException: Address already in use at org.apache.zookeeper.test.QuorumUtil.init(QuorumUtil.java:118) at org.apache.zookeeper.server.ZxidRolloverTest.setUp(ZxidRolloverTest.java:63) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:344) at sun.nio.ch.Net.bind(Net.java:336) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:199) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95) at org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:125) at org.apache.zookeeper.server.quorum.QuorumPeer.init(QuorumPeer.java:517) at org.apache.zookeeper.test.QuorumUtil.init(QuorumUtil.java:113) On Fri, Oct 5, 2012 at 9:56 AM, Patrick Hunt ph...@apache.org wrote: fwiw: I setup jdk7 and openjdk7 jobs last night for branch34 on
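The "randomized port" suggestion earlier in this thread could look something like the following hypothetical helper (not code from the ZooKeeper test suite): binding to port 0 asks the OS for a free ephemeral port, so two test suites sharing a build slave no longer collide on a hard-coded port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical helper for tests: let the OS pick a free ephemeral port.
public final class FreePortSketch {
    private FreePortSketch() {}

    public static int pickFreePort() {
        // Port 0 means "any free port"; the probe socket is closed before
        // the port number is handed to the test server.
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note the small race: the port is free when probed but could in principle be grabbed before the test server binds it; in practice this is far rarer than two jobs sharing one fixed port.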
[jira] [Updated] (ZOOKEEPER-1505) Multi-thread CommitProcessor
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Shrauner updated ZOOKEEPER-1505: Attachment: ZOOKEEPER-1505.patch Address feedback from review--shut down CommitProcessor if a downstream processor throws an exception (preserves previous behavior) Multi-thread CommitProcessor Key: ZOOKEEPER-1505 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.4.3, 3.4.4, 3.5.0 Reporter: Jay Shrauner Assignee: Jay Shrauner Labels: performance, scaling Fix For: 3.5.0 Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch CommitProcessor has a single thread that both pulls requests off its queues and runs all downstream processors. This is noticeably inefficient for read-intensive workloads, which could be run concurrently. The trick is handling write transactions. I propose multi-threading this code according to the following two constraints - each session must see its requests responded to in order - all committed transactions must be handled in zxid order, across all sessions I believe these cover the only constraints we need to honor. In particular, I believe we can relax the following: - it does not matter if a read request in one session happens before or after a write request in another session With these constraints, I propose the following threads - 1 primary queue servicing/work dispatching thread - 0-N assignable worker threads, where a given session is always assigned to the same worker thread By assigning sessions always to the same worker thread (using a simple sessionId mod number of worker threads), we guarantee the first constraint: requests we push onto the thread queue are processed in order. 
The way we guarantee the second constraint is we only allow a single commit transaction to be in flight at a time--the queue servicing thread blocks while a commit transaction is in flight, and when the transaction completes it clears the flag. On a 32 core machine running Linux 2.6.38, achieved best performance with 32 worker threads for a 56% +/- 5% improvement in throughput (this improvement was measured on top of that for ZOOKEEPER-1504, not in isolation). New classes introduced in this patch are: WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that makes worker threads daemon threads and names them in an easily debuggable manner. Supports assignable threads (as used here) and non-assignable threads (as used by NIOServerCnxnFactory).
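The session-to-worker mapping described above ("sessionId mod number of worker threads") can be sketched as a one-line hypothetical helper; pinning a session to one worker means its requests are consumed from that worker's queue in submission order, which is the first constraint.

```java
// Hypothetical helper for the per-session ordering rule: the same session
// always maps to the same worker index.
public final class SessionWorkerMapSketch {
    private SessionWorkerMapSketch() {}

    public static int workerFor(long sessionId, int numWorkers) {
        // floorMod keeps the index non-negative even for negative ids.
        return (int) Math.floorMod(sessionId, (long) numWorkers);
    }
}
```

A plain `%` would also work for non-negative session ids; floorMod just makes the helper total over the whole long range.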
[jira] [Assigned] (ZOOKEEPER-1147) Add support for local sessions
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Shrauner reassigned ZOOKEEPER-1147: --- Assignee: Thawan Kooburat (was: Jay Shrauner) Add support for local sessions -- Key: ZOOKEEPER-1147 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.3.3 Reporter: Vishal Kathuria Assignee: Thawan Kooburat Labels: api-change, scaling Fix For: 3.5.0 Original Estimate: 840h Remaining Estimate: 840h This improvement is in the bucket of making ZooKeeper work at a large scale. We are planning on having about 1 million clients connect to a ZooKeeper ensemble through a set of 50-100 observers. The majority of these clients are read-only, i.e. they do not do any updates or create ephemeral nodes. In ZooKeeper today, the client creates a session and the session creation is handled like any other update. In the above use case, the session create/drop workload can easily overwhelm an ensemble. The following is a proposal for a local session, to support a larger number of connections. 1. The idea is to introduce a new type of session - local session. A local session doesn't have the full functionality of a normal session. 2. Local sessions cannot create ephemeral nodes. 3. Once a local session is lost, you cannot re-establish it using the session-id/password. The session and its watches are gone for good. 4. When a local session connects, the session info is only maintained on the zookeeper server (in this case, an observer) that it is connected to. The leader is not aware of the creation of such a session and there is no state written to disk. 5. The pings and expiration are handled by the server that the session is connected to. With the above changes, we can make ZooKeeper scale to a much larger number of clients without making the core ensemble a bottleneck. In terms of API, there are two options that are being considered 1. 
Let the client specify at connect time which kind of session they want. 2. All sessions connect as local sessions and automatically get promoted to global sessions when they do an operation that requires a global session (e.g. creating an ephemeral node) Chubby took the approach of lazily promoting all sessions to global, but I don't think that would work in our case, where we want to keep sessions which never create ephemeral nodes always local. Option 2 would make it more broadly usable but option 1 would be easier to implement. We are thinking of implementing option 1 as the first cut. There would be a client flag, IsLocalSession (much like the current readOnly flag) that would be used to determine whether to create a local session or a global session.
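Option 1 could be sketched roughly as follows. Everything here is illustrative; the proposal above only specifies an IsLocalSession flag akin to readOnly, so the class, enum, and method names are hypothetical. The sketch captures the one hard rule from the proposal: a local session may not create ephemeral nodes.

```java
// Hypothetical sketch of option 1: the client declares the session kind at
// connect time, and the server rejects operations that need a global session.
public class SessionKindSketch {
    public enum Op { READ, WRITE, CREATE_EPHEMERAL }

    private final boolean isLocalSession;  // the proposed connect-time flag

    public SessionKindSketch(boolean isLocalSession) {
        this.isLocalSession = isLocalSession;
    }

    /** Local sessions may not create ephemeral nodes (point 2 of the proposal). */
    public boolean permits(Op op) {
        return !(isLocalSession && op == Op.CREATE_EPHEMERAL);
    }
}
```

Under option 2, `permits` would instead trigger a promotion to a global session rather than a rejection.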
Re: [VOTE] Release ZooKeeper 3.4.5 (candidate 0)
Thanks Ted. Will review the changes over the weekend. Thanks again mahadev On Fri, Oct 12, 2012 at 1:12 PM, Ted Yu yuzhih...@gmail.com wrote: Patch v7 for ZOOKEEPER-1560 passes test suite. Please take a look. On Thu, Oct 11, 2012 at 2:45 PM, Mahadev Konar maha...@hortonworks.comwrote: Thanks Alex for bringing it up. Ill hold the release for now. I see a patch on 1560. Ill take a look and we'll see how to roll this into 3.4.5. thanks mahadev On Thu, Oct 11, 2012 at 2:42 PM, Alexander Shraer shra...@gmail.com wrote: Hi Mahadev, ZOOKEEPER-1560 and ZOOKEEPER-1561 indicate a potentially serious issue, introduced recently in ZOOKEEPER-1437. Please consider this w.r.t. the 3.4.5 release. Best Regards, Alex On Wed, Oct 10, 2012 at 10:38 PM, Mahadev Konar maha...@hortonworks.com wrote: I think we have waited enough. Closing the vote now. With 5 +1's (3 binding) the vote passes. I will do the needful for getting the release out. Thanks for voting folks. mahadev On Wed, Oct 10, 2012 at 9:04 AM, Flavio Junqueira f...@yahoo-inc.com wrote: +1 -Flavio On Oct 8, 2012, at 7:05 AM, Mahadev Konar wrote: Given Eugene's findings on ZOOKEEPER-1557, I think we can continue rolling the current RC out. Others please vote on the thread if you see any issues with that. Folks who have already voted, please re vote in case you have a change of opinion. As for myself, I ran a couple of tests with the RC using open jdk 7 and things seem to work. +1 from my side. Pat/Ben/Flavio/others what do you guys think? thanks mahadev On Sun, Oct 7, 2012 at 8:34 AM, Ted Yu yuzhih...@gmail.com wrote: Currently ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7 are using lock ZooKeeper-solaris. I think ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7 should use a separate lock since they wouldn't run on a Solaris machine. I didn't seem to find how a new lock name can be added. Recent builds for ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7 have been green. 
Cheers On Sun, Oct 7, 2012 at 6:56 AM, Patrick Hunt ph...@apache.org wrote: I've seen that before; it's a flaky test that's unrelated to the SASL stuff. Patrick On Sat, Oct 6, 2012 at 2:25 PM, Ted Yu yuzhih...@gmail.com wrote: I saw one test failure: https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_openjdk7/9/testReport/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testHighestZxidJoinLate/ FYI On Sat, Oct 6, 2012 at 7:16 AM, Ted Yu yuzhih...@gmail.com wrote: Up in ZOOKEEPER-1557, Eugene separated one test out and the test failure seems to be gone. For ZooKeeper_branch34_jdk7, the two failed builds: #10 corresponded to ZooKeeper_branch34_openjdk7 build #7, #8 corresponded to ZooKeeper_branch34_openjdk7 build #5, where tests failed due to BindException. Cheers On Sat, Oct 6, 2012 at 7:06 AM, Patrick Hunt ph...@apache.org wrote: Yes. Those Ubuntu machines have two slots each. If both tests run at the same time... bam. I just added exclusion locks to the configuration of these two jobs; that should help. Patrick On Fri, Oct 5, 2012 at 8:58 PM, Ted Yu yuzhih...@gmail.com wrote: I think that was due to the following running on the same machine at the same time: Building remotely on ubuntu4 https://builds.apache.org/computer/ubuntu4 in workspace /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7 We should introduce randomized ports so that test suites can execute in parallel. 
Cheers On Fri, Oct 5, 2012 at 8:55 PM, Ted Yu yuzhih...@gmail.com wrote: Some tests failed in build 8 due to (See https://builds.apache.org//view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_jdk7/8/testReport/org.apache.zookeeper.server/ZxidRolloverTest/testRolloverThenRestart/ ): java.lang.RuntimeException: java.net.BindException: Address already in use at org.apache.zookeeper.test.QuorumUtil.init(QuorumUtil.java:118) at org.apache.zookeeper.server.ZxidRolloverTest.setUp(ZxidRolloverTest.java:63) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:344) at sun.nio.ch.Net.bind(Net.java:336) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:199) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95) at org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:125) at org.apache.zookeeper.server.quorum.QuorumPeer.init(QuorumPeer.java:517) at org.apache.zookeeper.test.QuorumUtil.init(QuorumUtil.java:113)
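The BindException in the thread above comes from two test suites racing for the same fixed port on one build machine. A minimal sketch of the randomized-port idea suggested above (illustrative only, not ZooKeeper test code; the class name is hypothetical): binding to port 0 lets the kernel pick a free ephemeral port, which can then be read back and passed to the server under test.

```java
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

public class EphemeralPortExample {
    // Bind to port 0 so the OS assigns a free ephemeral port, then
    // return the port actually chosen.
    static int bindEphemeral() throws Exception {
        try (ServerSocketChannel ch = ServerSocketChannel.open()) {
            ch.socket().bind(new InetSocketAddress(0)); // port 0 = any free port
            return ch.socket().getLocalPort();
        }
    }

    public static void main(String[] args) throws Exception {
        // Two concurrent test runs would each get their own port here,
        // so neither hits "Address already in use".
        System.out.println("bound to port " + bindEphemeral());
    }
}
```

A test harness would record the returned port and hand it to the quorum configuration instead of a hard-coded value.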
Failed: ZOOKEEPER-1505 PreCommit Build #1219
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1505 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 262152 lines...] [exec] [exec] [exec] [exec] -1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12548952/ZOOKEEPER-1505.patch [exec] against trunk revision 1391526. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] vrs878qBhU logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568: exec returned: 1 Total time: 27 minutes 44 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Description set: ZOOKEEPER-1505 Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## All tests passed
Success: ZOOKEEPER-1504 PreCommit Build #1220
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1504 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 266158 lines...] [exec] BUILD SUCCESSFUL [exec] Total time: 0 seconds [exec] [exec] [exec] [exec] [exec] +1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12548950/ZOOKEEPER-1504.patch [exec] against trunk revision 1391526. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 61jfuJgRdC logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD SUCCESSFUL Total time: 27 minutes 38 seconds Archiving artifacts Recording test results Description set: ZOOKEEPER-1504 Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-1505) Multi-thread CommitProcessor
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475444#comment-13475444 ] Jay Shrauner commented on ZOOKEEPER-1505: - The Findbugs warning (naked notify) is bogus; this is a helper routine to wake up the main thread, with the state change happening in the routines that call it. From the Findbugs description: This bug does not necessarily indicate an error, since the change to mutable object state may have taken place in a method which then called the method containing the notification. Multi-thread CommitProcessor Key: ZOOKEEPER-1505 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.4.3, 3.4.4, 3.5.0 Reporter: Jay Shrauner Assignee: Jay Shrauner Labels: performance, scaling Fix For: 3.5.0 Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch CommitProcessor has a single thread that both pulls requests off its queues and runs all downstream processors. This is noticeably inefficient for read-intensive workloads, which could be run concurrently. The trick is handling write transactions. I propose multi-threading this code according to the following two constraints: - each session must see its requests responded to in order - all committed transactions must be handled in zxid order, across all sessions I believe these cover the only constraints we need to honor. 
In particular, I believe we can relax the following: - it does not matter if a read request in one session happens before or after a write request in another session With these constraints, I propose the following threads: - 1 primary queue servicing/work dispatching thread - 0-N assignable worker threads, where a given session is always assigned to the same worker thread By always assigning sessions to the same worker thread (using a simple sessionId mod number of worker threads), we guarantee the first constraint: requests we push onto the thread queue are processed in order. We guarantee the second constraint by allowing only a single commit transaction to be in flight at a time: the queue servicing thread blocks while a commit transaction is in flight, and when the transaction completes it clears the flag. On a 32-core machine running Linux 2.6.38, we achieved the best performance with 32 worker threads, for a 56% +/- 5% improvement in throughput (this improvement was measured on top of that for ZOOKEEPER-1504, not in isolation). New classes introduced in this patch are: WorkerService (also in ZOOKEEPER-1504): an ExecutorService wrapper that makes worker threads daemon threads and names them in an easily debuggable manner. Supports assignable threads (as used here) and non-assignable threads (as used by NIOServerCnxnFactory). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
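The two constraints above can be sketched as follows. This is an illustrative sketch under the stated assumptions, not the actual ZOOKEEPER-1505 patch; the class and method names are hypothetical. Constraint 1 is satisfied because a session always lands on the same single-threaded executor; constraint 2 is satisfied by letting at most one committed transaction run at a time.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class CommitDispatchSketch {
    private final ExecutorService[] workers;
    // At most one permit: only one committed transaction in flight.
    private final Semaphore commitInFlight = new Semaphore(1);

    public CommitDispatchSketch(int nThreads) {
        workers = new ExecutorService[nThreads];
        for (int i = 0; i < nThreads; i++)
            workers[i] = Executors.newSingleThreadExecutor();
    }

    // Constraint 1: sessionId mod N pins a session to one worker, so
    // that session's requests are processed (and answered) in order.
    static int workerFor(long sessionId, int nThreads) {
        return (int) Math.floorMod(sessionId, (long) nThreads);
    }

    public void submitRead(long sessionId, Runnable request) {
        workers[workerFor(sessionId, workers.length)].execute(request);
    }

    // Constraint 2: the dispatching thread blocks here while a commit
    // is in flight, so commits are applied one at a time, in zxid order.
    public void submitCommit(Runnable commit) throws InterruptedException {
        commitInFlight.acquire();
        try {
            commit.run();
        } finally {
            commitInFlight.release();
        }
    }
}
```

Reads for different sessions run concurrently across the worker pool, which is where the throughput gain for read-heavy workloads comes from.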
Re: Review Request: Multi-thread NIOServerCnxn
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6256/ --- (Updated Oct. 12, 2012, 11:45 p.m.) Review request for zookeeper and Patrick Hunt. Changes --- Rebase Description --- See https://issues.apache.org/jira/browse/ZOOKEEPER-1504 This addresses bug ZOOKEEPER-1504. https://issues.apache.org/jira/browse/ZOOKEEPER-1504 Diffs (updated) - /src/java/main/org/apache/zookeeper/server/ExpiryQueue.java PRE-CREATION /src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java 1391526 /src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java 1391526 /src/java/main/org/apache/zookeeper/server/RateLogger.java PRE-CREATION /src/java/main/org/apache/zookeeper/server/ServerCnxn.java 1391526 /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1391526 /src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 1391526 /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION /src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java 1391526 /src/java/test/org/apache/zookeeper/test/ServerCnxnTest.java PRE-CREATION Diff: https://reviews.apache.org/r/6256/diff/ Testing --- Thanks, Jay Shrauner
Re: Review Request: Multi-thread CommitProcessor
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6260/ --- (Updated Oct. 12, 2012, 11:47 p.m.) Review request for zookeeper and Patrick Hunt. Changes --- Address feedback from review--shutdown CommitProcessor if downstream processor throws an exception (preserves previous behavior) Description --- See https://issues.apache.org/jira/browse/ZOOKEEPER-1505 This addresses bug ZOOKEEPER-1505. https://issues.apache.org/jira/browse/ZOOKEEPER-1505 Diffs (updated) - /src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java 1391526 /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1391526 /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION /src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java 1391526 /src/java/main/org/apache/zookeeper/server/quorum/Leader.java 1391526 /src/java/test/org/apache/zookeeper/server/quorum/CommitProcessorTest.java PRE-CREATION Diff: https://reviews.apache.org/r/6260/diff/ Testing --- Thanks, Jay Shrauner
[jira] [Created] (ZOOKEEPER-1562) Memory leaks in zoo_multi API
Deepak Jagtap created ZOOKEEPER-1562: Summary: Memory leaks in zoo_multi API Key: ZOOKEEPER-1562 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1562 Project: ZooKeeper Issue Type: Bug Components: c client Affects Versions: 3.4.3, 3.4.4 Environment: The ZooKeeper client and server are both running on CentOS 6.3 Reporter: Deepak Jagtap Priority: Trivial Valgrind is reporting a memory leak for zoo_multi operations. ==4056== 2,240 (160 direct, 2,080 indirect) bytes in 1 blocks are definitely lost in loss record 18 of 24 ==4056==at 0x4A04A28: calloc (vg_replace_malloc.c:467) ==4056==by 0x504D822: create_completion_entry (zookeeper.c:2322) ==4056==by 0x5052833: zoo_amulti (zookeeper.c:3141) ==4056==by 0x5052A8B: zoo_multi (zookeeper.c:3240) It looks like completion entries for individual operations in a multi-update transaction are not getting freed. My observation is that the memory leak size depends on the number of operations in a single multi-update transaction. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall McMullen updated ZOOKEEPER-1355: - Attachment: ZOOKEEPER-1355-12-Oct.patch This is an updated version of patch that applies cleanly to the latest tip of trunk. Add zk.updateServerList(newServerList) --- Key: ZOOKEEPER-1355 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355 Project: ZooKeeper Issue Type: New Feature Components: c client, java client Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: loadbalancing-more-details.pdf, loadbalancing.pdf, ZOOKEEPER-1355-10-Oct.patch, ZOOKEEPER-1355-12-Oct.patch, ZOOKEEPER-1355-ver10-1.patch, ZOOKEEPER-1355-ver10-2.patch, ZOOKEEPER-1355-ver10-3.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10.patch, ZOOKEEPER-1355-ver11-1.patch, ZOOKEEPER-1355-ver11.patch, ZOOKEEPER-1355-ver12-1.patch, ZOOKEEPER-1355-ver12-2.patch, ZOOKEEPER-1355-ver12-4.patch, ZOOKEEPER-1355-ver12.patch, ZOOKEEPER-1355-ver13.patch, ZOOKEEPER-1355-ver14.patch, ZOOKEEPER-1355-ver2.patch, ZOOKEEPER=1355-ver3.patch, ZOOKEEPER-1355-ver4.patch, ZOOKEEPER-1355-ver5.patch, ZOOKEEPER-1355-ver6.patch, ZOOKEEPER-1355-ver7.patch, ZOOKEEPER-1355-ver8.patch, ZOOKEEPER-1355-ver9-1.patch, ZOOKEEPER-1355-ver9.patch, ZOOOKEEPER-1355.patch, ZOOOKEEPER-1355-test.patch, ZOOOKEEPER-1355-ver1.patch When the set of servers changes, we would like to update the server list stored by clients without restarting the clients. Moreover, assuming that the number of clients per server is the same (in expectation) in the old configuration (as guaranteed by the current list shuffling for example), we would like to re-balance client connections across the new set of servers in a way that a) the number of clients per server is the same for all servers (in expectation) and b) there is no excessive/unnecessary client migration. 
It is simple to achieve (a) without (b) - just re-shuffle the new list of servers at every client. But this would create unnecessary migration, which we'd like to avoid. We propose a simple probabilistic migration scheme that achieves (a) and (b) - each client locally decides whether and where to migrate when the list of servers changes. The attached document describes the scheme and shows an evaluation of it in Zookeeper. We also implemented re-balancing through a consistent-hashing scheme and show a comparison. We derived the probabilistic migration rules from a simple formula that we can also provide, if someone's interested in the proof. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
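One way to picture the migration rule for the case where servers are added: if each client moves to one of the new servers with probability equal to the fraction of capacity that is new, the expected per-server load stays uniform while only the minimum number of clients migrate. This is a toy illustration of that idea under stated assumptions, not the actual ZOOKEEPER-1355 patch (the full rules, including the shrinking case, are in the attached documents):

```java
import java.util.Random;

public class RebalanceSketch {
    // Probability that a client should migrate to one of the newly
    // added servers when the ensemble grows from oldCount to newCount.
    // (Grow-only case; shrinking requires a different rule.)
    static double migrationProbability(int oldCount, int newCount) {
        if (newCount <= oldCount) return 0.0;
        return (newCount - oldCount) / (double) newCount;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        int clients = 100_000, migrated = 0;
        double p = migrationProbability(3, 5); // ensemble grows 3 -> 5
        for (int i = 0; i < clients; i++)
            if (rnd.nextDouble() < p) migrated++;
        // Roughly 2/5 of clients move, so each of the 5 servers ends up
        // with about 1/5 of the clients in expectation; the other 3/5
        // keep their existing connections (no excessive migration).
        System.out.println(migrated + " of " + clients + " clients migrate");
    }
}
```

Each client makes this decision locally, which is why no coordination is needed when the server list changes.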
Failed: ZOOKEEPER-1355 PreCommit Build #1221
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1355 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 607 lines...] [exec] [exec] [exec] [exec] -1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12548996/ZOOKEEPER-1355-12-Oct.patch [exec] against trunk revision 1391526. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 34 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] -1 javac. The patch appears to cause tar ant target to fail. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] -1 core tests. The patch failed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] Wzr34544w1 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568: exec returned: 2 Total time: 2 minutes 1 second Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Description set: ZOOKEEPER-1355 Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## No tests ran.
[jira] [Commented] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475510#comment-13475510 ] Hadoop QA commented on ZOOKEEPER-1355: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548996/ZOOKEEPER-1355-12-Oct.patch against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 34 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause tar ant target to fail. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//console This message is automatically generated. 
Add zk.updateServerList(newServerList) --- Key: ZOOKEEPER-1355 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355 Project: ZooKeeper Issue Type: New Feature Components: c client, java client Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: loadbalancing-more-details.pdf, loadbalancing.pdf, ZOOKEEPER-1355-10-Oct.patch, ZOOKEEPER-1355-12-Oct.patch, ZOOKEEPER-1355-ver10-1.patch, ZOOKEEPER-1355-ver10-2.patch, ZOOKEEPER-1355-ver10-3.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10.patch, ZOOKEEPER-1355-ver11-1.patch, ZOOKEEPER-1355-ver11.patch, ZOOKEEPER-1355-ver12-1.patch, ZOOKEEPER-1355-ver12-2.patch, ZOOKEEPER-1355-ver12-4.patch, ZOOKEEPER-1355-ver12.patch, ZOOKEEPER-1355-ver13.patch, ZOOKEEPER-1355-ver14.patch, ZOOKEEPER-1355-ver2.patch, ZOOKEEPER=1355-ver3.patch, ZOOKEEPER-1355-ver4.patch, ZOOKEEPER-1355-ver5.patch, ZOOKEEPER-1355-ver6.patch, ZOOKEEPER-1355-ver7.patch, ZOOKEEPER-1355-ver8.patch, ZOOKEEPER-1355-ver9-1.patch, ZOOKEEPER-1355-ver9.patch, ZOOOKEEPER-1355.patch, ZOOOKEEPER-1355-test.patch, ZOOOKEEPER-1355-ver1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall McMullen updated ZOOKEEPER-1355: - Attachment: ZOOKEEPER-1355-13-Oct.patch Had meant to remove the Zab Test part from this patch as Alex tells me that was already committed to trunk under another Jira. Add zk.updateServerList(newServerList) --- Key: ZOOKEEPER-1355 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355 Project: ZooKeeper Issue Type: New Feature Components: c client, java client Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: loadbalancing-more-details.pdf, loadbalancing.pdf, ZOOKEEPER-1355-10-Oct.patch, ZOOKEEPER-1355-12-Oct.patch, ZOOKEEPER-1355-13-Oct.patch, ZOOKEEPER-1355-ver10-1.patch, ZOOKEEPER-1355-ver10-2.patch, ZOOKEEPER-1355-ver10-3.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10.patch, ZOOKEEPER-1355-ver11-1.patch, ZOOKEEPER-1355-ver11.patch, ZOOKEEPER-1355-ver12-1.patch, ZOOKEEPER-1355-ver12-2.patch, ZOOKEEPER-1355-ver12-4.patch, ZOOKEEPER-1355-ver12.patch, ZOOKEEPER-1355-ver13.patch, ZOOKEEPER-1355-ver14.patch, ZOOKEEPER-1355-ver2.patch, ZOOKEEPER=1355-ver3.patch, ZOOKEEPER-1355-ver4.patch, ZOOKEEPER-1355-ver5.patch, ZOOKEEPER-1355-ver6.patch, ZOOKEEPER-1355-ver7.patch, ZOOKEEPER-1355-ver8.patch, ZOOKEEPER-1355-ver9-1.patch, ZOOKEEPER-1355-ver9.patch, ZOOOKEEPER-1355.patch, ZOOOKEEPER-1355-test.patch, ZOOOKEEPER-1355-ver1.patch When the set of servers changes, we would like to update the server list stored by clients without restarting the clients. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Jenkins build is still unstable: bookkeeper-trunk #750
See https://builds.apache.org/job/bookkeeper-trunk/750/
[jira] [Updated] (BOOKKEEPER-430) Remove manual bookie registration from overview
[ https://issues.apache.org/jira/browse/BOOKKEEPER-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Junqueira updated BOOKKEEPER-430: Attachment: BOOKKEEPER-430.patch Remove manual bookie registration from overview --- Key: BOOKKEEPER-430 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-430 Project: Bookkeeper Issue Type: Improvement Affects Versions: 4.1.0 Reporter: Flavio Junqueira Assignee: Flavio Junqueira Attachments: BOOKKEEPER-430.patch The documentation suggests that a user needs to manually register a bookie, which is not right. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (BOOKKEEPER-430) Remove manual bookie registration from overview
[ https://issues.apache.org/jira/browse/BOOKKEEPER-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Junqueira updated BOOKKEEPER-430: Component/s: Documentation Remove manual bookie registration from overview --- Key: BOOKKEEPER-430 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-430 Project: Bookkeeper Issue Type: Improvement Components: Documentation Affects Versions: 4.1.0 Reporter: Flavio Junqueira Assignee: Flavio Junqueira Attachments: BOOKKEEPER-430.patch The documentation suggests that a user needs to manually register a bookie, which is not right. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-431) Duplicate definition of COOKIES_NODE
[ https://issues.apache.org/jira/browse/BOOKKEEPER-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475113#comment-13475113 ] Flavio Junqueira commented on BOOKKEEPER-431: - Yeah, I'd rather have the constants in one single place. COOKIE_NODE was introduced in BOOKKEEPER-263, but before that we had BOOKIE_COOKIE_PATH, so I'm not entirely sure what the history of duplication is. Ivan, Sijie, do you guys have any other insight to add here? Duplicate definition of COOKIES_NODE Key: BOOKKEEPER-431 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-431 Project: Bookkeeper Issue Type: Improvement Affects Versions: 4.1.0 Reporter: Flavio Junqueira Assignee: Uma Maheswara Rao G Priority: Minor Fix For: 4.2.0 Is it necessary two definitions of COOKIES_NODE, one in cookie.java and one in AbstractZkLedgerManager? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (BOOKKEEPER-432) Improve performance of entry log range read per ledger entries
Yixue (Andrew) Zhu created BOOKKEEPER-432: - Summary: Improve performance of entry log range read per ledger entries Key: BOOKKEEPER-432 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-432 Project: Bookkeeper Issue Type: Improvement Components: bookkeeper-server Affects Versions: 4.1.0 Environment: Linux Reporter: Yixue (Andrew) Zhu We observed random I/O reads when some subscribers fall behind (on some topics), as delivery needs to scan the entry logs (through the ledger index), which are interleaved with ledger entries across all ledgers being served. Essentially, the ledger index is a non-clustered index. It is not effective when a large number of ledger entries need to be served, which tend to be scattered around due to interleaving. Some possible improvements: 1. Change the ledger entries buffer to use a SkipList (or another suitable structure), sorted on (ledger, entry sequence). When the buffer is flushed, the entry log is written out in the already-sorted order. The active ledger index can point to the entries buffer (SkipList), and be fixed up with the entry-log position once the latter is persisted. Or, the ledger index can simply be rebuilt on demand. The entry log file tail can have an index attached (a light-weight B-tree, similar to Bigtable). We need to track, per ledger, which log files contribute entries to it, so that the in-memory index can be rebuilt from the tails of the corresponding log files. 2. Use an affinity concept to make the ensembles of ledgers belonging to the same topic as identical as possible. This will make improvement 1 above more effective. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
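Improvement 1 above can be sketched with a skip list keyed on (ledger, entry): a flush then naturally emits each ledger's entries contiguously instead of interleaved by arrival order. A hypothetical sketch, not BookKeeper code (class names and the flush interface are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

public class SortedEntryBuffer {
    // Composite key ordered by ledger first, then entry sequence.
    static final class Key implements Comparable<Key> {
        final long ledgerId, entryId;
        Key(long ledgerId, long entryId) { this.ledgerId = ledgerId; this.entryId = entryId; }
        public int compareTo(Key o) {
            int c = Long.compare(ledgerId, o.ledgerId);
            return c != 0 ? c : Long.compare(entryId, o.entryId);
        }
    }

    // Skip list keeps entries sorted as they are buffered, so no sort
    // pass is needed at flush time.
    private final ConcurrentSkipListMap<Key, byte[]> buffer = new ConcurrentSkipListMap<>();

    public void add(long ledgerId, long entryId, byte[] data) {
        buffer.put(new Key(ledgerId, entryId), data);
    }

    // The flush order is ascending by (ledger, entry): all of ledger 1's
    // buffered entries, then ledger 2's, and so on - a sequential write
    // pattern for the entry log and a sequential read pattern later.
    public List<Key> flushOrder() {
        List<Key> order = new ArrayList<>(buffer.keySet());
        buffer.clear();
        return order;
    }
}
```

A real implementation would also write the payloads and record the resulting entry-log offsets back into the ledger index, as the issue describes.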
[jira] [Updated] (BOOKKEEPER-432) Improve performance of entry log range read per ledger entries
[ https://issues.apache.org/jira/browse/BOOKKEEPER-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yixue (Andrew) Zhu updated BOOKKEEPER-432: -- Affects Version/s: (was: 4.1.0) 4.2.0 Improve performance of entry log range read per ledger entries --- Key: BOOKKEEPER-432 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-432 Project: Bookkeeper Issue Type: Improvement Components: bookkeeper-server Affects Versions: 4.2.0 Environment: Linux Reporter: Yixue (Andrew) Zhu Labels: patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira