[jira] [Commented] (ZOOKEEPER-2680) Correct DataNode.getChildren() inconsistent behavior.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842327#comment-15842327 ] Hadoop QA commented on ZOOKEEPER-2680:
--

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12849650/ZOOKEEPER-2680-01.patch
against trunk revision 8771ffdaacb87126a485ae740558f6a288ab980b.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 2 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3571//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3571//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3571//console

This message is automatically generated.

> Correct DataNode.getChildren() inconsistent behavior.
> -
>
> Key: ZOOKEEPER-2680
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2680
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.9, 3.5.1
> Reporter: Mohammad Arshad
> Assignee: Mohammad Arshad
> Fix For: 3.4.10, 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2680-01.patch
>
>
> DataNode.getChildren() returns either null or an empty set for a node with
> no children, depending on when the API is called. The behavior should be
> changed so that it always returns an empty set when the node has no children.
> *DataNode.getChildren() API Current Behavior:*
> # returns null initially:
> when a DataNode is created and no children have been added yet,
> DataNode.getChildren() returns null
> # returns an empty set after all the children are deleted:
> create a node, add a child, then delete the child;
> DataNode.getChildren() returns an empty set.
> After the fix, DataNode.getChildren() should return an empty set in both
> cases.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
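The contract described above can be sketched with a minimal stand-in class. This is a hypothetical simplification for illustration, not the actual DataNode from the patch: the children field may still be lazily allocated internally, but getChildren() always hands callers a set, never null.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical simplified DataNode illustrating the proposed contract:
// getChildren() never returns null, only a (possibly empty) set.
class DataNodeSketch {
    private Set<String> children; // stays null until the first child is added

    public synchronized void addChild(String child) {
        if (children == null) {
            children = new HashSet<>();
        }
        children.add(child);
    }

    public synchronized boolean removeChild(String child) {
        return children != null && children.remove(child);
    }

    public synchronized Set<String> getChildren() {
        // Before the fix this returned the raw field (null for a fresh node);
        // after the fix callers always get a set. The shared empty set is
        // immutable, so callers must not mutate the result.
        return children == null ? Collections.emptySet() : children;
    }
}
```

With this contract, callers such as ContainerManager can call `getChildren().isEmpty()` without a null check in either of the two states described above.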
Success: ZOOKEEPER-2680 PreCommit Build #3571
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2680 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3571/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 489638 lines...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 2 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3571//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3571//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3571//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 5c27f3deaa1f8201df04bad536b334004f3d69d3 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 21 minutes 1 second Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-2680 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842312#comment-15842312 ] Mohammad Arshad commented on ZOOKEEPER-2464:
---
[~randgalt], I created ZOOKEEPER-2680. Once ZOOKEEPER-2680 is fixed, this issue will be resolved automatically.

> NullPointerException on ContainerManager
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.1
> Reporter: Stefano Salmaso
> Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to describe a problem that we are experiencing.
> We are using a cluster of 7 ZooKeeper servers, and we use them to implement
> a distributed lock using Curator
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html).
> We played with the servers to see if everything worked properly: we stopped
> and started servers (like stop 03, stop 05, stop 06, start 05, start 06,
> start 03) and saw a strange behavior.
> The number of znodes grew without stopping (normally we had 4,000 or 5,000;
> we got to 60,000 and then we stopped our application).
> In the ZooKeeper logs I saw this (on the leader only, once every minute):
> 2016-07-04 14:53:50,302 [myid:7] - ERROR
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
> at org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
> at org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
> at org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data, so the problem can be reproduced on our
> servers.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
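To see how a null child set bites the container scan, here is a hedged sketch with hypothetical names (not the real ContainerManager code, which walks the DataTree): a candidate scan over a path-to-children map. Without the null guard, the isEmpty() call reproduces the NullPointerException from the log above; once DataNode.getChildren() never returns null (ZOOKEEPER-2680), the guard becomes redundant.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ContainerScanSketch {
    // Hypothetical simplification of a getCandidates()-style scan:
    // collect container paths whose nodes currently have no children.
    static List<String> emptyContainers(Map<String, Set<String>> childrenByPath) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : childrenByPath.entrySet()) {
            Set<String> kids = e.getValue();
            // Without the null check, kids.isEmpty() throws the
            // NullPointerException seen in the leader's log for a
            // freshly created container that never had children.
            if (kids == null || kids.isEmpty()) {
                candidates.add(e.getKey());
            }
        }
        return candidates;
    }
}
```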
[jira] [Updated] (ZOOKEEPER-2680) Correct DataNode.getChildren() inconsistent behavior.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2680:
---
Fix Version/s: 3.6.0
3.5.3
3.4.10
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ZOOKEEPER-2680) Correct DataNode.getChildren() inconsistent behavior.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2680:
---
Attachment: ZOOKEEPER-2680-01.patch
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (ZOOKEEPER-2680) Correct DataNode.getChildren() inconsistent behavior.
Mohammad Arshad created ZOOKEEPER-2680:
--
Summary: Correct DataNode.getChildren() inconsistent behavior.
Key: ZOOKEEPER-2680
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2680
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.5.1, 3.4.9
Reporter: Mohammad Arshad
Assignee: Mohammad Arshad
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ZOOKEEPER-2044) CancelledKeyException in zookeeper 3.4.5
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15841038#comment-15841038 ] ASF GitHub Bot commented on ZOOKEEPER-2044:
---
Github user rakeshadr commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/156#discussion_r98146418

--- Diff: src/java/test/org/apache/zookeeper/server/NIOServerCnxnTest.java ---
@@ -68,4 +69,41 @@ public void testOperationsAfterCnxnClose() throws IOException,
            zk.close();
        }
    }

    /**
     * Mock extension of NIOServerCnxn to test for
     * CancelledKeyException (ZOOKEEPER-2044).
     */
    private static class MockNIOServerCnxn extends NIOServerCnxn {
        public MockNIOServerCnxn(NIOServerCnxn cnxn) throws IOException {
            super(cnxn.zkServer, cnxn.sock, cnxn.sk, cnxn.factory);
        }

        public void mockSendBuffer(ByteBuffer bb) throws Exception {
            super.internalSendBuffer(bb);
        }
    }

    @Test(timeout = 3)
    public void testValidSelectionKey() throws Exception {
        int oldTimeout = ClientBase.CONNECTION_TIMEOUT;
        ClientBase.CONNECTION_TIMEOUT = 3000;
        final ZooKeeper zk = createClient();
--- End diff --

Thanks @hanm for the analysis and fixing it. Instead of directly changing the static value, how about simplifying the ZooKeeper client creation like below:
``final ZooKeeper zk = createZKClient(hostPort, 3000);``

> CancelledKeyException in zookeeper 3.4.5
>
> Key: ZOOKEEPER-2044
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2044
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.6
> Environment: Red Hat Enterprise Linux Server release 6.2
> Reporter: shamjith antholi
> Assignee: Flavio Junqueira
> Priority: Minor
> Fix For: 3.4.10
>
> Attachments: ZOOKEEPER-2044.patch, ZOOKEEPER-2044.patch
>
>
> I am getting cancelled key exception in zookeeper (version 3.4.5). Please see
> the log below.
When this error is thrown, the connected solr shard is going > down by giving the error "Failed to index metadata in > Solr,StackTrace=SolrError: HTTP status 503.Reason: > {"responseHeader":{"status":503,"QTime":204},"error":{"msg":"ClusterState > says we are the leader, but locally we don't think so","code":503" and > ultimately the current activity is going down. Could you please give a > solution for this ? > Zookeper log > -- > 2014-09-16 02:58:47,799 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client > attempting to renew session 0x24868e7ca980003 at /172.22.0.5:58587 > 2014-09-16 02:58:47,800 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@107] - Revalidating > client: 0x24868e7ca980003 > 2014-09-16 02:58:47,802 [myid:1] - INFO > [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@588] - Invalid > session 0x24868e7ca980003 for client /172.22.0.5:58587, probably expired > 2014-09-16 02:58:47,803 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed > socket connection for client /172.22.0.5:58587 which had sessionid > 0x24868e7ca980003 > 2014-09-16 02:58:47,810 [myid:1] - ERROR > [CommitProcessor:1:NIOServerCnxn@180] - Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076) > at > org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113) > at org.apache.zookeeper.server.DataTree.setWatches(DataTree.java:1327) > at > org.apache.zookeeper.server.ZKDatabase.setWatches(ZKDatabase.java:384) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:304) > at > 
org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74) > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
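The stack trace shows interestOps() throwing because the selection key was cancelled when the client connection closed. A defensive pattern for this race is sketched below; this is an illustration with a hypothetical helper name, not the actual patch: check SelectionKey.isValid() first and still catch CancelledKeyException, since the key can be cancelled between the check and the interestOps() call.

```java
import java.nio.channels.CancelledKeyException;
import java.nio.channels.SelectionKey;

class SendBufferGuardSketch {
    // Guard around an interestOps() update like the one in
    // NIOServerCnxn.sendBuffer(): treat a cancelled key as a closed
    // connection instead of letting CancelledKeyException escape into
    // the CommitProcessor thread.
    static boolean enableWriteInterest(SelectionKey sk) {
        if (sk == null || !sk.isValid()) {
            return false; // connection already closed; nothing to send
        }
        try {
            sk.interestOps(sk.interestOps() | SelectionKey.OP_WRITE);
            return true;
        } catch (CancelledKeyException e) {
            // The key was cancelled between isValid() and interestOps().
            return false;
        }
    }
}
```

The caller can then close the connection cleanly when the helper returns false, rather than logging "Unexpected Exception" as in the log above.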
[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15841002#comment-15841002 ] Jordan Zimmerman commented on ZOOKEEPER-2464:
---
[~arshad.mohammad] IMO it should be a separate issue.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15841000#comment-15841000 ] Jordan Zimmerman commented on ZOOKEEPER-2464:
---
[~eribeiro] - I think a 1 line change is too much for a test
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ZOOKEEPER-2125) SSL on Netty client-server communication
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840986#comment-15840986 ] Shivam commented on ZOOKEEPER-2125:
---
Can this fix be backported to the latest stable release, 3.4.9?

> SSL on Netty client-server communication
>
> Key: ZOOKEEPER-2125
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2125
> Project: ZooKeeper
> Issue Type: Sub-task
> Reporter: Hongchao Deng
> Assignee: Hongchao Deng
> Fix For: 3.5.1, 3.6.0
>
> Attachments: testKeyStore.jks, testTrustStore.jks,
> ZOOKEEPER-2125-build.patch,
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch,
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch,
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch,
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch,
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch,
> ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch, ZOOKEEPER-2125.patch
>
>
> Supporting SSL on Netty client-server communication.
> 1. It supports keystore and truststore usage.
> 2. It adds an additional ZK server port which supports SSL. This would be
> useful for rolling upgrade.
> RB: https://reviews.apache.org/r/31277/
> The patch includes three files:
> * testing-purpose keystore and truststore under
> "$(ZK_REPO_HOME)/src/java/test/data/ssl". Might need to create "ssl/".
> * latest ZOOKEEPER-2125.patch
> h2. How to use it
> You need to set some parameters on both the ZK server and client.
> h3. Server
> You need to specify a listening SSL port in "zoo.cfg":
> {code}
> secureClientPort=2281
> {code}
> Just like what you did with "clientPort".
> And then set some JVM flags:
> {code}
> export SERVER_JVMFLAGS="-Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory
> -Dzookeeper.ssl.keyStore.location=/root/zookeeper/ssl/testKeyStore.jks
> -Dzookeeper.ssl.keyStore.password=testpass
> -Dzookeeper.ssl.trustStore.location=/root/zookeeper/ssl/testTrustStore.jks
> -Dzookeeper.ssl.trustStore.password=testpass"
> {code}
> Please change keystore and truststore parameters accordingly.
> h3. Client
> You need to set JVM flags:
> {code}
> export CLIENT_JVMFLAGS="-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
> -Dzookeeper.client.secure=true
> -Dzookeeper.ssl.keyStore.location=/root/zookeeper/ssl/testKeyStore.jks
> -Dzookeeper.ssl.keyStore.password=testpass
> -Dzookeeper.ssl.trustStore.location=/root/zookeeper/ssl/testTrustStore.jks
> -Dzookeeper.ssl.trustStore.password=testpass"
> {code}
> Change keystore and truststore parameters accordingly.
> And then connect to the server's SSL port, in this case:
> {code}
> bin/zkCli.sh -server 127.0.0.1:2281
> {code}
> If you have any feedback, you are more than welcome to discuss it here!
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
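The same client flags can also be set programmatically instead of via CLIENT_JVMFLAGS. This is a hedged sketch with a hypothetical helper name; it assumes, as in the example above, that the keystore and truststore share one password, and the store paths are placeholders for your own files.

```java
// Hypothetical helper: sets the documented client-side SSL system
// properties in code. After this runs, a ZooKeeper client created
// against the secureClientPort (2281 in the example) negotiates SSL.
class SecureClientConfigSketch {
    static void apply(String keyStore, String trustStore, String password) {
        System.setProperty("zookeeper.clientCnxnSocket",
                "org.apache.zookeeper.ClientCnxnSocketNetty");
        System.setProperty("zookeeper.client.secure", "true");
        System.setProperty("zookeeper.ssl.keyStore.location", keyStore);
        System.setProperty("zookeeper.ssl.keyStore.password", password);
        System.setProperty("zookeeper.ssl.trustStore.location", trustStore);
        System.setProperty("zookeeper.ssl.trustStore.password", password);
        // A client such as: new ZooKeeper("127.0.0.1:2281", 30000, watcher)
        // would then connect over SSL (constructor call omitted here so the
        // sketch has no ZooKeeper dependency).
    }
}
```

Because these are plain system properties, they must be set before the ZooKeeper client object is constructed.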
[jira] [Commented] (ZOOKEEPER-2659) Use log4j2 as a logging framework as log4j 1.X is now deprecated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840802#comment-15840802 ] ASF GitHub Bot commented on ZOOKEEPER-2659:
---
Github user praste commented on the issue:
https://github.com/apache/zookeeper/pull/148

Are you sure you removed the reference to `slf4j-log4j12` from all the `ivy.xml` files? I am not an ivy expert, but you can take a look at http://stackoverflow.com/questions/5405310/find-hidden-dependencies-in-ivy and http://ant.apache.org/ivy/history/latest-milestone/use/dependencytree.html

> Use log4j2 as a logging framework as log4j 1.X is now deprecated
>
> Key: ZOOKEEPER-2659
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2659
> Project: ZooKeeper
> Issue Type: Wish
> Reporter: Pushkar Raste
> Assignee: Pushkar Raste
> Priority: Minor
> Attachments: zk_log4j2_migration.patch
>
>
> ZooKeeper currently uses {{log4j 1.X}} as the default logging framework.
> {{log4j 1.X}} is now deprecated: http://logging.apache.org/log4j/1.2/
> This ticket tracks the effort to move ZooKeeper to {{log4j2}}.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: ZOOKEEPER- PreCommit Build #256
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/256/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 25 lines...] Getting sizes Done: 9 Compressing objects Done: 0 Writing objects Done: 10 remote: Updating references Merging refs/tags/changes/256 > git rev-parse refs/tags/changes/256^{commit} # timeout=10 > git merge 831e560a9396f021b9d77f2127b4a294d7cc8638 # timeout=10 > git rev-parse branch-3.4^{commit} # timeout=10 Checking out Revision 74d5f228bc28391195e242b99f5c63f77ac12080 (branch-3.4) > git config core.sparsecheckout # timeout=10 > git checkout -f 74d5f228bc28391195e242b99f5c63f77ac12080 > git rev-parse origin/branch-3.4^{commit} # timeout=10 > git rev-list d6bbfd76d24c044073764c5d074a9198c69fafab # timeout=10 No emails were triggered. Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [PreCommit-ZOOKEEPER-github-pr-build] $ /bin/bash /tmp/hudson9036623794741036736.sh /home/jenkins/tools/java/latest1.7/bin/java java version "1.7.0_80" Java(TM) SE Runtime Environment (build 1.7.0_80-b15) Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode) core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 386177 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 10240 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited Buildfile: /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml BUILD FAILED Target "qa-test-pullrequest" does not exist in the project "ZooKeeper". 
Total time: 0 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ERROR: Step ‘Publish JUnit test result report’ failed: No test report files were found. Configuration error? [description-setter] Could not determine description. Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Failure - Any Sending email for trigger: Failure - Any Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## No tests ran.
[jira] [Commented] (ZOOKEEPER-2044) CancelledKeyException in zookeeper 3.4.5
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840707#comment-15840707 ] ASF GitHub Bot commented on ZOOKEEPER-2044:
---
Github user hanm commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/156#discussion_r98120457

--- Diff: src/java/test/org/apache/zookeeper/server/NIOServerCnxnTest.java ---
@@ -68,4 +69,38 @@ public void testOperationsAfterCnxnClose() throws IOException,
            zk.close();
        }
    }

    /**
     * Mock extension of NIOServerCnxn to test for
     * CancelledKeyException (ZOOKEEPER-2044).
     */
    private static class MockNIOServerCnxn extends NIOServerCnxn {
        public MockNIOServerCnxn(NIOServerCnxn cnxn) throws IOException {
            super(cnxn.zkServer, cnxn.sock, cnxn.sk, cnxn.factory);
        }

        public void mockSendBuffer(ByteBuffer bb) throws Exception {
            super.internalSendBuffer(bb);
        }
    }

    @Test(timeout = 3)
    public void testValidSelectionKey() throws Exception {
        final ZooKeeper zk = createClient();
        try {
            Iterable<ServerCnxn> connections = serverFactory.getConnections();
            for (ServerCnxn serverCnxn : connections) {
                MockNIOServerCnxn mock = new MockNIOServerCnxn((NIOServerCnxn) serverCnxn);
                // Cancel key
                ((NIOServerCnxn) serverCnxn).sock.keyFor(((NIOServerCnxnFactory) serverFactory).selector).cancel();
                mock.mockSendBuffer(ByteBuffer.allocate(8));
            }
        } catch (CancelledKeyException e) {
            LOG.error("Exception while sending bytes!", e);
            Assert.fail(e.toString());
        } finally {
            zk.close();
--- End diff --

@rakeshadr Good observation on the long running of the test. This is definitely something we should fix.
The actual delay indeed happens at client close, and the root cause is the session timeout: when the client closes itself it sends a request to the server, and in our case this request packet is stuck forever because the server has cancelled the selector, so the client session eventually expires. By default the client/server timeout is 30 seconds, and 2/3 of it (20 seconds) is exactly how long it takes for a heartbeat to fail. I fixed this by adjusting the timeout value to 3 seconds just for this single test. PTAL.
[GitHub] zookeeper pull request #156: ZOOKEEPER-2044:CancelledKeyException in zookeep...
Github user hanm commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/156#discussion_r98120457 --- Diff: src/java/test/org/apache/zookeeper/server/NIOServerCnxnTest.java ---
@@ -68,4 +69,38 @@ public void testOperationsAfterCnxnClose() throws IOException,
             zk.close();
         }
     }
+
+    /**
+     * Mock extension of NIOServerCnxn to test for
+     * CancelledKeyException (ZOOKEEPER-2044).
+     */
+    private static class MockNIOServerCnxn extends NIOServerCnxn {
+        public MockNIOServerCnxn(NIOServerCnxn cnxn) throws IOException {
+            super(cnxn.zkServer, cnxn.sock, cnxn.sk, cnxn.factory);
+        }
+
+        public void mockSendBuffer(ByteBuffer bb) throws Exception {
+            super.internalSendBuffer(bb);
+        }
+    }
+
+    @Test(timeout = 3)
+    public void testValidSelectionKey() throws Exception {
+        final ZooKeeper zk = createClient();
+        try {
+            Iterable<ServerCnxn> connections = serverFactory.getConnections();
+            for (ServerCnxn serverCnxn : connections) {
+                MockNIOServerCnxn mock = new MockNIOServerCnxn((NIOServerCnxn) serverCnxn);
+                // Cancel key
+                ((NIOServerCnxn) serverCnxn).sock.keyFor(((NIOServerCnxnFactory) serverFactory).selector).cancel();
+                mock.mockSendBuffer(ByteBuffer.allocate(8));
+            }
+        } catch (CancelledKeyException e) {
+            LOG.error("Exception while sending bytes!", e);
+            Assert.fail(e.toString());
+        } finally {
+            zk.close();
--- End diff -- @rakeshadr Good observation on the long running time of the test. This is definitely something we should fix. The actual delay indeed happens at client close, and the root cause is session timeout: when a client closes itself it sends a request to the server, and in our case that request packet gets stuck forever because the server has cancelled the selector, so the client session eventually expires. By default the client/server session timeout is set to 30 sec, and 2/3 of it - 20 sec - is exactly how long it takes for a heartbeat to fail. I fixed this by adjusting the timeout value to 3 sec just for this single test. PTAL.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
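The stack trace quoted above (SelectionKeyImpl.ensureValid → CancelledKeyException) comes from touching a SelectionKey after it has been cancelled. The failure mode, and the defensive isValid() check that guards against it, can be reproduced with plain java.nio; this is a standalone sketch, not the actual NIOServerCnxn code:

```java
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class CancelledKeyDemo {
    // Returns what happens when interestOps() is called on a cancelled key.
    static String touchCancelledKey() throws Exception {
        Selector selector = Selector.open();
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);

        key.cancel(); // simulates the server cancelling the connection's key

        String outcome;
        try {
            key.interestOps(SelectionKey.OP_READ); // unguarded access
            outcome = "no exception";
        } catch (CancelledKeyException e) {
            outcome = "CancelledKeyException";
        }

        // Defensive pattern: check validity before touching interest ops.
        if (key.isValid()) {
            key.interestOps(SelectionKey.OP_READ);
        }

        pipe.sink().close();
        pipe.source().close();
        selector.close();
        return outcome;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(touchCancelledKey());
    }
}
```

The same pattern (catch or pre-check around key access) is what keeps a concurrently closed connection from surfacing as an unhandled exception in a server's send path.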
Re: Paper
On Thu, Jan 26, 2017 at 1:14 PM, Bobby Evans wrote: > We did think about ram disks a little, but the plan still is to have the > source code, in one form or another, morph into a small distributed time > series database for the metrics. With that in mind we thought it would be > better to take a step in that direction. Yes a ram disk would likely have > provided similar performance. Although we would have still wanted to > separate out just the metrics to the ram disk backed ZK, because we store > other more critical data in ZK too, that need more durability guarantees. > > Indeed, that was my thinking as well - two ZKs, one with "ephemeral" data and one with precious. Your response makes sense and was what I expected; I only ask because I didn't see it mentioned in the document and for many folks it's a good, if perhaps short term, solution. Regards, Patrick > - Bobby > > > On Thursday, January 26, 2017, 10:36:01 AM CST, Patrick Hunt < > ph...@apache.org> wrote: > Very interesting results and real world insights. Thanks for > creating/sharing. > > One thing I noticed is that you mentioned considering SSDs, had you also > considered using ram disks? I've seen some scenarios where that has been > very successful. > > Patrick > > On Thu, Jan 26, 2017 at 6:28 AM, Bobby Evans > wrote: > > As one of the authors of pacemaker in Apache Storm (and the paper), I am > > happy to answer any questions about why we did it or how it works. The > > reality of it is storm was, and still is by default, abusing zookeeper by > > trying to store a massive amount of metrics in it, instead of the > > configuration/coordination it was designed for. And since storm metrics > > don't really need strong consistency or even that much in terms of > > reliability guarantees we stood up a netty server in front of a > > ConcurrentHashMap (quite literally) and then wrote a client that could > > handle fail-over. 
> > It really is meant as a scalability stepping stone until we can get to > the > > point that all the metrics go to a TSDB that is actually designed for > > metrics. But like I said if you have any questions I am happy to answer > > them. > > Sadly because of the way IEEE works neither I nor my employer own the > copy > > right to that paper any more so I can't even put a copy of it up for you > to > > read. > > > > > > - Bobby > > > > On Thursday, January 26, 2017, 6:44:56 AM CST, ibrahim El-sanosi < > > ibrahimsaba...@gmail.com> wrote:Hi folk, > > > > There is a paper published recently "PaceMaker: When ZooKeeper Arteries > Get > > Clogged in Storm Clusters" [1]. It may worth to read. > > > > [1] > > http://ieeexplore.ieee.org/document/7820303/?tp=; > > arnumber=7820303=Conference%20Publications= > > eWFob28uY29t=SEARCHALERT > > > > Ibrahim > > >
Re: Paper
We did think about ram disks a little, but the plan still is to have the source code, in one form or another, morph into a small distributed time series database for the metrics. With that in mind we thought it would be better to take a step in that direction. Yes a ram disk would likely have provided similar performance. Although we would have still wanted to separate out just the metrics to the ram disk backed ZK, because we store other more critical data in ZK too, that need more durability guarantees. - Bobby On Thursday, January 26, 2017, 10:36:01 AM CST, Patrick Hunt wrote: Very interesting results and real world insights. Thanks for creating/sharing. One thing I noticed is that you mentioned considering SSDs, had you also considered using ram disks? I've seen some scenarios where that has been very successful. Patrick On Thu, Jan 26, 2017 at 6:28 AM, Bobby Evans wrote: > As one of the authors of pacemaker in Apache Storm (and the paper), I am > happy to answer any questions about why we did it or how it works. The > reality of it is storm was, and still is by default, abusing zookeeper by > trying to store a massive amount of metrics in it, instead of the > configuration/coordination it was designed for. And since storm metrics > don't really need strong consistency or even that much in terms of > reliability guarantees we stood up a netty server in front of a > ConcurrentHashMap (quite literally) and then wrote a client that could > handle fail-over. > It really is meant as a scalability stepping stone until we can get to the > point that all the metrics go to a TSDB that is actually designed for > metrics. But like I said if you have any questions I am happy to answer > them. > Sadly because of the way IEEE works neither I nor my employer own the > copyright to that paper any more so I can't even put a copy of it up for you to > read. 
> > > - Bobby > > On Thursday, January 26, 2017, 6:44:56 AM CST, ibrahim El-sanosi < > ibrahimsaba...@gmail.com> wrote:Hi folk, > > There is a paper published recently "PaceMaker: When ZooKeeper Arteries Get > Clogged in Storm Clusters" [1]. It may worth to read. > > [1] > http://ieeexplore.ieee.org/document/7820303/?tp=; > arnumber=7820303=Conference%20Publications= > eWFob28uY29t=SEARCHALERT > > Ibrahim >
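The Pacemaker design described above - a plain concurrent map behind a network front end, with no consistency or durability guarantees - can be illustrated in a few lines. This is purely a sketch with made-up names, not Storm's actual Pacemaker code:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative in-memory metrics/heartbeat store: a concurrent map keyed by
// a ZK-like path, holding serialized blobs. Nothing is persisted, which is
// exactly the trade-off the thread describes for metrics data.
class InMemoryMetricsStore {
    private final Map<String, List<byte[]>> store = new ConcurrentHashMap<>();

    // Append a serialized heartbeat/metric blob under a path.
    void append(String path, byte[] blob) {
        store.computeIfAbsent(path, k -> new CopyOnWriteArrayList<>()).add(blob);
    }

    List<byte[]> read(String path) {
        return store.getOrDefault(path, Collections.<byte[]>emptyList());
    }

    void delete(String path) {
        store.remove(path);
    }
}
```

Because the map lives in a single process, clients must handle fail-over themselves (as the thread notes), and a crash simply loses the metrics - acceptable here, but not for the "precious" data that stays in ZooKeeper.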
[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840464#comment-15840464 ] Edward Ribeiro commented on ZOOKEEPER-2464: --- Yeah... makes sense, it's pretty inconsistent behavior. It would require some defensive code as {{DataNode.setChildren(null)}} could introduce the null again. In fact, it is a total refactoring of {{DataNode}}, albeit a small class. Wdyt [~randgalt]? > NullPointerException on ContainerManager > > > Key: ZOOKEEPER-2464 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.1 >Reporter: Stefano Salmaso >Assignee: Jordan Zimmerman > Fix For: 3.5.3, 3.6.0 > > Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch > > > I would like to expose you to a problem that we are experiencing. > We are using a cluster of 7 zookeeper and we use them to implement a > distributed lock using Curator > (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html) > So .. we tried to play with the servers to see if everything worked properly > and we stopped and start servers to see that the system worked well > (like stop 03, stop 05, stop 06, start 05, start 06, start 03) > We saw a strange behavior. 
> The number of znodes grew without stopping (normally we had 4000 or 5000, > we got to 60,000 and then we stopped our application) > In the zookeeper logs I saw this (on leader only, one every minute) > 2016-07-04 14:53:50,302 [myid:7] - ERROR > [ContainerManagerTask:ContainerManager$1@84] - Error checking containers > java.lang.NullPointerException >at > org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151) >at > org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111) >at > org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78) >at java.util.TimerThread.mainLoop(Timer.java:555) >at java.util.TimerThread.run(Timer.java:505) > We have not yet deleted the data ... so the problem can be reproduced on our > servers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840421#comment-15840421 ] Mohammad Arshad commented on ZOOKEEPER-2464: Root cause of the problem is the inconsistent behavior of the DataNode.getChildren() API. DataNode.getChildren() API current behaviour: # returns null initially: when a DataNode is created and no children have been added yet, DataNode.getChildren() returns null # returns an empty set after all the children are deleted: create a node, add a child, delete the child - DataNode.getChildren() returns an empty set. I think we should fix this issue by modifying the DataNode.getChildren() API so that it always returns an empty set when there are no children. > NullPointerException on ContainerManager > > > Key: ZOOKEEPER-2464 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.1 >Reporter: Stefano Salmaso >Assignee: Jordan Zimmerman > Fix For: 3.5.3, 3.6.0 > > Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch > > > I would like to expose you to a problem that we are experiencing. > We are using a cluster of 7 zookeeper and we use them to implement a > distributed lock using Curator > (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html) > So .. we tried to play with the servers to see if everything worked properly > and we stopped and started servers to see that the system worked well > (like stop 03, stop 05, stop 06, start 05, start 06, start 03) > We saw a strange behavior. 
> The number of znodes grew without stopping (normally we had 4000 or 5000, > we got to 60,000 and then we stopped our application) > In the zookeeper logs I saw this (on leader only, one every minute) > 2016-07-04 14:53:50,302 [myid:7] - ERROR > [ContainerManagerTask:ContainerManager$1@84] - Error checking containers > java.lang.NullPointerException >at > org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151) >at > org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111) >at > org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78) >at java.util.TimerThread.mainLoop(Timer.java:555) >at java.util.TimerThread.run(Timer.java:505) > We have not yet deleted the data ... so the problem can be reproduced on our > servers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
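The null-vs-empty inconsistency and the proposed fix can be shown with a minimal sketch. This is a simplified stand-in for illustration only, not the actual org.apache.zookeeper.server.DataNode:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch: children is lazily initialized, so before the fix getChildren()
// returned null on a fresh node but an empty set after the last child was
// removed - the inconsistency that let ContainerManager.getCandidates() NPE.
class DataNodeSketch {
    private Set<String> children; // null until the first addChild()

    synchronized boolean addChild(String child) {
        if (children == null) {
            children = new HashSet<>(8);
        }
        return children.add(child);
    }

    synchronized boolean removeChild(String child) {
        return children != null && children.remove(child);
    }

    // Fixed behavior: never return null, so callers can iterate the result
    // without a defensive null check.
    synchronized Set<String> getChildren() {
        return children == null ? Collections.<String>emptySet()
                                : Collections.unmodifiableSet(children);
    }
}
```

With this contract, code like the container-candidate scan can simply iterate `node.getChildren()` in every state of the node's lifecycle.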
[jira] [Commented] (ZOOKEEPER-2659) Use log4j2 as a logging framework as log4j 1.X is now deprecated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840226#comment-15840226 ] ASF GitHub Bot commented on ZOOKEEPER-2659: --- Github user nerdyyatrice commented on the issue: https://github.com/apache/zookeeper/pull/148 Hi, I tried the same approach and I got a bin place conflict because some of the dependencies in my ivy.xml still use log4j 1.2. How can I find out which dependency brought that in, and is there a way to override it? > Use log4j2 as a logging framework as log4j 1.X is now deprecated > > > Key: ZOOKEEPER-2659 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2659 > Project: ZooKeeper > Issue Type: Wish >Reporter: Pushkar Raste >Assignee: Pushkar Raste >Priority: Minor > Attachments: zk_log4j2_migration.patch > > > Zookeeper currently uses {{log4j 1.X}} as the default logging framework. > {{log4j 1.X}} is now deprecated http://logging.apache.org/log4j/1.2/ > This ticket is to track efforts to move zookeeper to {{log4j2}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] zookeeper issue #148: ZOOKEEPER-2659 Log4j 2 migration
Github user nerdyyatrice commented on the issue: https://github.com/apache/zookeeper/pull/148 Hi, I tried the same approach and I got a bin place conflict because some of the dependencies in my ivy.xml still use log4j 1.2. How can I find out which dependency brought that in, and is there a way to override it?
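For the transitive log4j 1.2 question above: Ivy supports excluding a transitive artifact either per dependency or globally under `<dependencies>`. A sketch - the revisions and module names here are assumptions, adjust to the actual ivy.xml:

```xml
<dependencies>
    <!-- explicit log4j2 dependencies (revisions are placeholders) -->
    <dependency org="org.apache.logging.log4j" name="log4j-api"  rev="2.8"/>
    <dependency org="org.apache.logging.log4j" name="log4j-core" rev="2.8"/>

    <!-- global exclude: keeps any transitively resolved log4j 1.x
         artifact out of the retrieved classpath -->
    <exclude org="log4j" module="log4j"/>
</dependencies>
```

To trace where the 1.x jar comes from, the Ivy report task (`<ivy:report/>`) renders the resolved dependency graph, including which caller pulled in each module.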
[jira] [Commented] (ZOOKEEPER-2659) Use log4j2 as a logging framework as log4j 1.X is now deprecated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840125#comment-15840125 ] Pushkar Raste commented on ZOOKEEPER-2659: -- I am not a committer. Can someone take a look. > Use log4j2 as a logging framework as log4j 1.X is now deprecated > > > Key: ZOOKEEPER-2659 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2659 > Project: ZooKeeper > Issue Type: Wish >Reporter: Pushkar Raste >Assignee: Pushkar Raste >Priority: Minor > Attachments: zk_log4j2_migration.patch > > > Zookeeper currently uses {{log4j 1.X}} as the default logging framework. > {{log4j 1.X}} is now deprecated http://logging.apache.org/log4j/1.2/ > This ticket is to track efforts to move zookeeper to {{log4j2}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
ZooKeeper_branch35_solaris - Build # 411 - Still Failing
See https://builds.apache.org/job/ZooKeeper_branch35_solaris/411/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 468102 lines...] [junit] 2017-01-26 17:17:16,769 [myid:] - INFO [main:ClientBase@386] - CREATING server instance 127.0.0.1:11222 [junit] 2017-01-26 17:17:16,769 [myid:] - INFO [main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 kB direct buffers. [junit] 2017-01-26 17:17:16,770 [myid:] - INFO [main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222 [junit] 2017-01-26 17:17:16,771 [myid:] - INFO [main:ClientBase@361] - STARTING server instance 127.0.0.1:11222 [junit] 2017-01-26 17:17:16,771 [myid:] - INFO [main:ZooKeeperServer@893] - minSessionTimeout set to 6000 [junit] 2017-01-26 17:17:16,771 [myid:] - INFO [main:ZooKeeperServer@902] - maxSessionTimeout set to 6 [junit] 2017-01-26 17:17:16,771 [myid:] - INFO [main:ZooKeeperServer@159] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/build/test/tmp/test5698106632771417782.junit.dir/version-2 snapdir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/build/test/tmp/test5698106632771417782.junit.dir/version-2 [junit] 2017-01-26 17:17:16,772 [myid:] - INFO [main:FileSnap@83] - Reading snapshot /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/build/test/tmp/test5698106632771417782.junit.dir/version-2/snapshot.b [junit] 2017-01-26 17:17:16,774 [myid:] - INFO [main:FileTxnSnapLog@320] - Snapshotting: 0xb to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/build/test/tmp/test5698106632771417782.junit.dir/version-2/snapshot.b [junit] 2017-01-26 17:17:16,775 [myid:] - ERROR [main:ZooKeeperServer@505] - ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes [junit] 2017-01-26 17:17:16,775 [myid:] - INFO [main:FourLetterWordMain@85] - connecting to 127.0.0.1 11222 [junit] 2017-01-26 17:17:16,776 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /127.0.0.1:53056 [junit] 2017-01-26 17:17:16,777 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from /127.0.0.1:53056 [junit] 2017-01-26 17:17:16,777 [myid:] - INFO [NIOWorkerThread-1:StatCommand@49] - Stat command output [junit] 2017-01-26 17:17:16,777 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@614] - Closed socket connection for client /127.0.0.1:53056 (no session established for client) [junit] 2017-01-26 17:17:16,778 [myid:] - INFO [main:JMXEnv@228] - ensureParent:[InMemoryDataTree, StandaloneServer_port] [junit] 2017-01-26 17:17:16,779 [myid:] - INFO [main:JMXEnv@245] - expect:InMemoryDataTree [junit] 2017-01-26 17:17:16,779 [myid:] - INFO [main:JMXEnv@249] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree [junit] 2017-01-26 17:17:16,779 [myid:] - INFO [main:JMXEnv@245] - expect:StandaloneServer_port [junit] 2017-01-26 17:17:16,779 [myid:] - INFO [main:JMXEnv@249] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port11222 [junit] 2017-01-26 17:17:16,780 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 17885 [junit] 2017-01-26 17:17:16,780 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24 [junit] 2017-01-26 17:17:16,780 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD testQuota [junit] 2017-01-26 17:17:16,780 [myid:] - INFO [main:ClientBase@543] - tearDown starting [junit] 2017-01-26 17:17:16,852 [myid:] - INFO [main:ZooKeeper@1322] - Session: 0x12658c14419 closed [junit] 
2017-01-26 17:17:16,852 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for session: 0x12658c14419 [junit] 2017-01-26 17:17:16,852 [myid:] - INFO [main:ClientBase@513] - STOPPING server [junit] 2017-01-26 17:17:16,852 [myid:] - INFO [ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - ConnnectionExpirerThread interrupted [junit] 2017-01-26 17:17:16,853 [myid:] - INFO [NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] - selector thread exitted run method [junit] 2017-01-26 17:17:16,852 [myid:] - INFO
[jira] [Commented] (ZOOKEEPER-2659) Use log4j2 as a logging framework as log4j 1.X is now deprecated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840018#comment-15840018 ] ASF GitHub Bot commented on ZOOKEEPER-2659: --- Github user jvz commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/148#discussion_r98037086 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -413,13 +418,18 @@ public void testBadPeerAddressInQuorum() throws Exception { ClientBase.setupTestEnv(); // setup the logger to capture all logs +LoggerContext loggerContext = (LoggerContext) LogManager.getContext(false); --- End diff -- Oh sorry, I meant to get back to you on this much sooner. You can merge without using it; just thought it would be a less hacky test. > Use log4j2 as a logging framework as log4j 1.X is now deprecated > > > Key: ZOOKEEPER-2659 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2659 > Project: ZooKeeper > Issue Type: Wish >Reporter: Pushkar Raste >Assignee: Pushkar Raste >Priority: Minor > Attachments: zk_log4j2_migration.patch > > > Zookeeper currently uses {{log4j 1.X}} as the default logging framework. > {{log4j 1.X}} is now deprecated http://logging.apache.org/log4j/1.2/ > This ticket is to track efforts to move zookeeper to {{log4j2}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] zookeeper pull request #148: ZOOKEEPER-2659 Log4j 2 migration
Github user jvz commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/148#discussion_r98037086 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -413,13 +418,18 @@ public void testBadPeerAddressInQuorum() throws Exception { ClientBase.setupTestEnv(); // setup the logger to capture all logs +LoggerContext loggerContext = (LoggerContext) LogManager.getContext(false); --- End diff -- Oh sorry, I meant to get back to you on this much sooner. You can merge without using it; just thought it would be a less hacky test.
Re: Paper
Very interesting results and real world insights. Thanks for creating/sharing. One thing I noticed is that you mentioned considering SSDs, had you also considered using ram disks? I've seen some scenarios where that has been very successful. Patrick On Thu, Jan 26, 2017 at 6:28 AM, Bobby Evans wrote: > As one of the authors of pacemaker in Apache Storm (and the paper), I am > happy to answer any questions about why we did it or how it works. The > reality of it is storm was, and still is by default, abusing zookeeper by > trying to store a massive amount of metrics in it, instead of the > configuration/coordination it was designed for. And since storm metrics > don't really need strong consistency or even that much in terms of > reliability guarantees we stood up a netty server in front of a > ConcurrentHashMap (quite literally) and then wrote a client that could > handle fail-over. > It really is meant as a scalability stepping stone until we can get to the > point that all the metrics go to a TSDB that is actually designed for > metrics. But like I said if you have any questions I am happy to answer > them. > Sadly because of the way IEEE works neither I nor my employer own the > copyright to that paper any more so I can't even put a copy of it up for you to > read. > > > - Bobby > > On Thursday, January 26, 2017, 6:44:56 AM CST, ibrahim El-sanosi < > ibrahimsaba...@gmail.com> wrote: Hi folks, > > There is a paper published recently "PaceMaker: When ZooKeeper Arteries Get > Clogged in Storm Clusters" [1]. It may be worth reading. > > [1] > http://ieeexplore.ieee.org/document/7820303/?tp=; > arnumber=7820303=Conference%20Publications= > eWFob28uY29t=SEARCHALERT > > Ibrahim >
Failed: ZOOKEEPER- PreCommit Build #255
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/255/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 25 lines...] Getting sizes Done: 18 Compressing objects Done: 0 Writing objects Done: 19 remote: Updating references Merging refs/tags/changes/255 > git rev-parse refs/tags/changes/255^{commit} # timeout=10 > git merge 5aa25620e0189b28d7040305272be2fda28126fb # timeout=10 > git rev-parse branch-3.4^{commit} # timeout=10 Checking out Revision 5aa25620e0189b28d7040305272be2fda28126fb (branch-3.4) > git config core.sparsecheckout # timeout=10 > git checkout -f 5aa25620e0189b28d7040305272be2fda28126fb > git rev-parse origin/branch-3.4^{commit} # timeout=10 > git rev-list d6bbfd76d24c044073764c5d074a9198c69fafab # timeout=10 No emails were triggered. Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [PreCommit-ZOOKEEPER-github-pr-build] $ /bin/bash /tmp/hudson9068826468290050932.sh /home/jenkins/tools/java/latest1.7/bin/java java version "1.7.0_80" Java(TM) SE Runtime Environment (build 1.7.0_80-b15) Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode) core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 386177 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 10240 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited Buildfile: /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml BUILD FAILED Target "qa-test-pullrequest" does not exist in the project "ZooKeeper". 
Total time: 0 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ERROR: Step ‘Publish JUnit test result report’ failed: No test report files were found. Configuration error? [description-setter] Could not determine description. Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Failure - Any Sending email for trigger: Failure - Any Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## No tests ran.
[GitHub] zookeeper pull request #157: ZOOKEEPER-2678: Discovery and Sync can take a v...
GitHub user revans2 opened a pull request: https://github.com/apache/zookeeper/pull/157 ZOOKEEPER-2678: Discovery and Sync can take a very long time on large DB This patch addresses recovery time when a leader is lost on a large DB. It does this by not clearing the DB before leader election begins, and by avoiding taking a snapshot as part of the SYNC phase, specifically for a DIFF sync. It does this by buffering the proposals and commits just like the code currently does for proposals/commits sent after the NEWLEADER and before the UPTODATE messages. If a SNAP is sent we cannot avoid writing out the full snapshot because there is no other way to make sure the disk DB is in sync with what is in memory. So any edits to the edit log before a background snapshot happened could possibly be applied on top of an incorrect snapshot. This same optimization should work for TRUNC too, but I opted not to do it for TRUNC because TRUNC is rare and TRUNC by its very nature already forces the DB to be reread after the edit logs are modified. So it would still not be fast. In practice, instead of taking 5+ mins for the cluster to recover from losing a leader, it now takes about 3 seconds. I am happy to port this to 3.5 if it looks good. You can merge this pull request into a Git repository by running: $ git pull https://github.com/revans2/zookeeper ZOOKEEPER-2678 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/157.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #157 commit 5aa25620e0189b28d7040305272be2fda28126fb Author: Robert (Bobby) Evans Date: 2017-01-19T19:50:32Z ZOOKEEPER-2678: Discovery and Sync can take a very long time on large DBs
[jira] [Commented] (ZOOKEEPER-2678) Large databases take a long time to regain a quorum
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839860#comment-15839860 ] ASF GitHub Bot commented on ZOOKEEPER-2678: --- GitHub user revans2 opened a pull request: https://github.com/apache/zookeeper/pull/157 ZOOKEEPER-2678: Discovery and Sync can take a very long time on large DB This patch addresses recovery time when a leader is lost on a large DB. It does this by not clearing the DB before leader election begins, and by avoiding taking a snapshot as part of the SYNC phase, specifically for a DIFF sync. It does this by buffering the proposals and commits just like the code currently does for proposals/commits sent after the NEWLEADER and before the UPTODATE messages. If a SNAP is sent we cannot avoid writing out the full snapshot because there is no other way to make sure the disk DB is in sync with what is in memory. So any edits to the edit log before a background snapshot happened could possibly be applied on top of an incorrect snapshot. This same optimization should work for TRUNC too, but I opted not to do it for TRUNC because TRUNC is rare and TRUNC by its very nature already forces the DB to be reread after the edit logs are modified. So it would still not be fast. In practice, instead of taking 5+ mins for the cluster to recover from losing a leader, it now takes about 3 seconds. I am happy to port this to 3.5 if it looks good. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/revans2/zookeeper ZOOKEEPER-2678 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/157.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #157 commit 5aa25620e0189b28d7040305272be2fda28126fb Author: Robert (Bobby) Evans Date: 2017-01-19T19:50:32Z ZOOKEEPER-2678: Discovery and Sync can take a very long time on large DBs > Large databases take a long time to regain a quorum > --- > > Key: ZOOKEEPER-2678 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2678 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.9, 3.5.2 >Reporter: Robert Joseph Evans >Assignee: Robert Joseph Evans > > I know this is long but please hear me out. > I recently inherited a massive zookeeper ensemble. The snapshot is 3.4 GB on > disk. Because of its massive size we have been running into a number of > issues. There are lots of problems that we hope to fix with tuning GC etc, > but the big one right now that is blocking us making a lot of progress on the > rest of them is that when we lose a quorum because the leader left, for > whatever reason, it can take well over 5 mins for a new quorum to be established. > So we cannot tune the leader without risking downtime. > We traced down where the time was being spent and found that each server was > clearing the database so it would be read back in again before leader > election even started. Then as part of the sync phase each server will write > out a snapshot to checkpoint the progress it made as part of the sync. > I will be putting up a patch shortly with some proposed changes in it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
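The buffering approach the pull request describes - hold proposals/commits that arrive during a DIFF sync and replay them once the sync completes, instead of checkpointing with a full snapshot mid-sync - can be illustrated with a toy sketch. All names here are hypothetical; the real change lives in the server's sync code path:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the DIFF-sync idea: while syncing, incoming transactions are
// buffered rather than applied (and no snapshot is taken); after the sync
// finishes they are replayed in order, mirroring what the existing code
// already does for traffic between NEWLEADER and UPTODATE.
class DiffSyncSketch {
    static final class Txn {
        final long zxid;
        Txn(long zxid) { this.zxid = zxid; }
    }

    private final List<Txn> pending = new ArrayList<>();
    private final List<Long> applied = new ArrayList<>();
    private boolean syncing = true;

    void onProposal(Txn t) {
        if (syncing) {
            pending.add(t); // buffer instead of snapshotting mid-sync
        } else {
            apply(t);
        }
    }

    void onSyncComplete() {
        syncing = false;
        for (Txn t : pending) {
            apply(t); // replay buffered transactions in arrival order
        }
        pending.clear();
    }

    // Stand-in for writing to the txn log and in-memory DB.
    private void apply(Txn t) { applied.add(t.zxid); }

    List<Long> applied() { return applied; }
}
```

The key property is ordering: nothing is applied until the sync's baseline is in place, so the replayed transactions always land on a consistent database, which is exactly why a mid-sync snapshot becomes unnecessary for the DIFF case.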
[jira] [Commented] (ZOOKEEPER-2659) Use log4j2 as a logging framework as log4j 1.X is now deprecated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839840#comment-15839840 ] ASF GitHub Bot commented on ZOOKEEPER-2659: --- Github user praste commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/148#discussion_r98013864 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -413,13 +418,18 @@ public void testBadPeerAddressInQuorum() throws Exception { ClientBase.setupTestEnv(); // setup the logger to capture all logs +LoggerContext loggerContext = (LoggerContext) LogManager.getContext(false); --- End diff -- @jvz is using ListAppender absolutely necessary? Can we merge this with current changes? > Use log4j2 as a logging framework as log4j 1.X is now deprecated > > > Key: ZOOKEEPER-2659 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2659 > Project: ZooKeeper > Issue Type: Wish >Reporter: Pushkar Raste >Assignee: Pushkar Raste >Priority: Minor > Attachments: zk_log4j2_migration.patch > > > Zookeeper currently uses {{log4j 1.X}} as the default logging framework. > {{log4j 1.X}} is now deprecated http://logging.apache.org/log4j/1.2/ > This ticket is to track efforts to move zookeeper to {{log4j2}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (ZOOKEEPER-2678) Large databases take a long time to regain a quorum
Robert Joseph Evans created ZOOKEEPER-2678: -- Summary: Large databases take a long time to regain a quorum Key: ZOOKEEPER-2678 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2678 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.5.2, 3.4.9 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans I know this is long but please hear me out. I recently inherited a massive zookeeper ensemble. The snapshot is 3.4 GB on disk. Because of its massive size we have been running into a number of issues. There are lots of problems that we hope to fix with tuning GC etc, but the big one right now that is blocking us making a lot of progress on the rest of them is that when we lose a quorum because the leader left, for whatever reason, it can take well over 5 mins for a new quorum to be established. So we cannot tune the leader without risking downtime. We traced down where the time was being spent and found that each server was clearing the database so it would be read back in again before leader election even started. Then as part of the sync phase each server will write out a snapshot to checkpoint the progress it made as part of the sync. I will be putting up a patch shortly with some proposed changes in it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] zookeeper pull request #148: ZOOKEEPER-2659 Log4j 2 migration
Github user praste commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/148#discussion_r98013864 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -413,13 +418,18 @@ public void testBadPeerAddressInQuorum() throws Exception { ClientBase.setupTestEnv(); // setup the logger to capture all logs +LoggerContext loggerContext = (LoggerContext) LogManager.getContext(false); --- End diff -- @jvz is using ListAppender absolutely necessary? Can we merge this with current changes? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
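The testing pattern under discussion, capturing log output into an in-memory list so a test can assert on it, can be illustrated independently of log4j2 with the JDK's own logging API. This is an analogue of log4j2's ListAppender idea, not its actual API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;

// JDK-logging analogue of a "list appender": a Handler that stores every
// log message so a test can assert on what was logged.
class ListHandler extends Handler {
    final List<String> messages = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
        messages.add(record.getMessage());
    }

    @Override public void flush() {}
    @Override public void close() {}
}
```

A test would attach the handler to the logger under test, trigger the code path (e.g. configuring a bad peer address), and then assert that the expected warning appears in `messages`.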
Re: Paper
You can already find the paper on sci-hub.io (search for the DOI). On Thu, Jan 26, 2017 at 3:17 PM, Jordan Zimmerman < jor...@jordanzimmerman.com> wrote: > Sad that such an important paper requires a fee. Is there a free version > anywhere? > > -Jordan > > > On Jan 26, 2017, at 7:44 AM, ibrahim El-sanosi> wrote: > > > > Hi folk, > > > > There is a paper published recently, "PaceMaker: When ZooKeeper Arteries > Get > > Clogged in Storm Clusters" [1]. It may be worth reading. > > > > [1] > > http://ieeexplore.ieee.org/document/7820303/?tp=; arnumber=7820303=Conference%20Publications= eWFob28uY29t=SEARCHALERT > > > > Ibrahim > > -- Paul Asmuth T: +31-622-351956 p...@asmuth.com EventQL | DeepCortex GmbH https://eventql.io/ Kantstraße 33 10625 Berlin
Re: Paper
As one of the authors of PaceMaker in Apache Storm (and the paper), I am happy to answer any questions about why we did it or how it works. The reality of it is that Storm was, and still is by default, abusing ZooKeeper by trying to store a massive amount of metrics in it, instead of the configuration/coordination it was designed for. And since Storm metrics don't really need strong consistency, or even that much in terms of reliability guarantees, we stood up a netty server in front of a ConcurrentHashMap (quite literally) and then wrote a client that could handle fail-over. It really is meant as a scalability stepping stone until we can get to the point that all the metrics go to a TSDB that is actually designed for metrics. But like I said, if you have any questions I am happy to answer them. Sadly, because of the way IEEE works, neither I nor my employer owns the copyright to that paper any more, so I can't even put a copy of it up for you to read. - Bobby On Thursday, January 26, 2017, 6:44:56 AM CST, ibrahim El-sanosi wrote: Hi folk, There is a paper published recently, "PaceMaker: When ZooKeeper Arteries Get Clogged in Storm Clusters" [1]. It may be worth reading. [1] http://ieeexplore.ieee.org/document/7820303/?tp==7820303=Conference%20Publications=eWFob28uY29t=SEARCHALERT Ibrahim
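As a rough illustration of the design Bobby describes, the heart of the approach is an in-memory map behind a network front-end, trading ZooKeeper's durability and consistency guarantees for raw write throughput. The sketch below is a standalone toy, not the actual Storm/PaceMaker code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Toy illustration of a PaceMaker-style heartbeat store: a thread-safe
// in-memory map keyed by path. In the real system a netty server fronts
// this map and a fail-over-aware client talks to it; none of that is shown.
class HeartbeatStore {
    private final ConcurrentMap<String, byte[]> beats = new ConcurrentHashMap<>();

    void put(String path, byte[] heartbeat) { beats.put(path, heartbeat); }
    byte[] get(String path)                 { return beats.get(path); }
    void delete(String path)                { beats.remove(path); }
    int size()                              { return beats.size(); }
}
```

Because heartbeats are ephemeral and constantly rewritten, losing the map on a crash is acceptable; clients simply repopulate it, which is exactly why no ZooKeeper-style quorum or disk persistence is needed here.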
Re: Paper
Sad that such an important paper requires a fee. Is there a free version anywhere? -Jordan > On Jan 26, 2017, at 7:44 AM, ibrahim El-sanosi> wrote: > > Hi folk, > > There is a paper published recently, "PaceMaker: When ZooKeeper Arteries Get > Clogged in Storm Clusters" [1]. It may be worth reading. > > [1] > http://ieeexplore.ieee.org/document/7820303/?tp==7820303=Conference%20Publications=eWFob28uY29t=SEARCHALERT > > Ibrahim
ZooKeeper_branch34_solaris - Build # 1446 - Still Failing
See https://builds.apache.org/job/ZooKeeper_branch34_solaris/1446/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 197435 lines...] [junit] 2017-01-26 13:54:46,272 [myid:] - INFO [main:ZooKeeperServer@497] - shutting down [junit] 2017-01-26 13:54:46,272 [myid:] - ERROR [main:ZooKeeperServer@472] - ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes [junit] 2017-01-26 13:54:46,272 [myid:] - INFO [main:SessionTrackerImpl@225] - Shutting down [junit] 2017-01-26 13:54:46,272 [myid:] - INFO [main:PrepRequestProcessor@765] - Shutting down [junit] 2017-01-26 13:54:46,272 [myid:] - INFO [main:SyncRequestProcessor@208] - Shutting down [junit] 2017-01-26 13:54:46,272 [myid:] - INFO [ProcessThread(sid:0 cport:11221)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop! [junit] 2017-01-26 13:54:46,272 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@186] - SyncRequestProcessor exited! [junit] 2017-01-26 13:54:46,272 [myid:] - INFO [main:FinalRequestProcessor@402] - shutdown of request processor complete [junit] 2017-01-26 13:54:46,273 [myid:] - INFO [main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221 [junit] 2017-01-26 13:54:46,273 [myid:] - INFO [main:JMXEnv@147] - ensureOnly:[] [junit] 2017-01-26 13:54:46,274 [myid:] - INFO [main:ClientBase@445] - STARTING server [junit] 2017-01-26 13:54:46,274 [myid:] - INFO [main:ClientBase@366] - CREATING server instance 127.0.0.1:11221 [junit] 2017-01-26 13:54:46,275 [myid:] - INFO [main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:11221 [junit] 2017-01-26 13:54:46,275 [myid:] - INFO [main:ClientBase@341] - STARTING server instance 127.0.0.1:11221 [junit] 2017-01-26 13:54:46,275 [myid:] - INFO [main:ZooKeeperServer@173] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch34_solaris/build/test/tmp/test5802278521389583743.junit.dir/version-2 snapdir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch34_solaris/build/test/tmp/test5802278521389583743.junit.dir/version-2 [junit] 2017-01-26 13:54:46,278 [myid:] - ERROR [main:ZooKeeperServer@472] - ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes [junit] 2017-01-26 13:54:46,278 [myid:] - INFO [main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221 [junit] 2017-01-26 13:54:46,278 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@192] - Accepted socket connection from /127.0.0.1:34218 [junit] 2017-01-26 13:54:46,279 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@827] - Processing stat command from /127.0.0.1:34218 [junit] 2017-01-26 13:54:46,279 [myid:] - INFO [Thread-5:NIOServerCnxn$StatCommand@663] - Stat command output [junit] 2017-01-26 13:54:46,279 [myid:] - INFO [Thread-5:NIOServerCnxn@1008] - Closed socket connection for client /127.0.0.1:34218 (no session established for client) [junit] 2017-01-26 13:54:46,279 [myid:] - INFO [main:JMXEnv@230] - ensureParent:[InMemoryDataTree, StandaloneServer_port] [junit] 2017-01-26 13:54:46,280 [myid:] - INFO [main:JMXEnv@247] - expect:InMemoryDataTree [junit] 2017-01-26 13:54:46,280 [myid:] - INFO [main:JMXEnv@251] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree [junit] 2017-01-26 13:54:46,281 [myid:] - INFO [main:JMXEnv@247] - expect:StandaloneServer_port [junit] 2017-01-26 13:54:46,281 [myid:] - INFO [main:JMXEnv@251] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port11221 [junit] 2017-01-26 13:54:46,281 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@58] - Memory used 8996 [junit] 2017-01-26 
13:54:46,281 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@63] - Number of threads 20 [junit] 2017-01-26 13:54:46,281 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@78] - FINISHED TEST METHOD testQuota [junit] 2017-01-26 13:54:46,281 [myid:] - INFO [main:ClientBase@522] - tearDown starting [junit] 2017-01-26 13:54:46,362 [myid:] - INFO [main:ZooKeeper@684] - Session: 0x159db0e97a5 closed [junit] 2017-01-26 13:54:46,362 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@519] - EventThread shut down for session: 0x159db0e97a5 [junit] 2017-01-26 13:54:46,362 [myid:] - INFO [main:ClientBase@492] - STOPPING server [junit] 2017-01-26 13:54:46,363 [myid:] - INFO [main:ZooKeeperServer@497] - shutting down [junit] 2017-01-26 13:54:46,363 [myid:] -
Paper
Hi folk, There is a paper published recently, "PaceMaker: When ZooKeeper Arteries Get Clogged in Storm Clusters" [1]. It may be worth reading. [1] http://ieeexplore.ieee.org/document/7820303/?tp==7820303=Conference%20Publications=eWFob28uY29t=SEARCHALERT Ibrahim
ZooKeeper_branch35_jdk8 - Build # 392 - Still Failing
See https://builds.apache.org/job/ZooKeeper_branch35_jdk8/392/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 437582 lines...] [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357) [junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214) [junit] 2017-01-26 12:15:57,545 [myid:127.0.0.1:11348] - INFO [main-SendThread(127.0.0.1:11348):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:11348. Will not attempt to authenticate using SASL (unknown error) [junit] 2017-01-26 12:15:57,545 [myid:127.0.0.1:11348] - WARN [main-SendThread(127.0.0.1:11348):ClientCnxn$SendThread@1235] - Session 0x202057b70ce for server 127.0.0.1/127.0.0.1:11348, unexpected error, closing socket connection and attempting reconnect [junit] java.net.ConnectException: Connection refused [junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) [junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357) [junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214) [junit] 2017-01-26 12:15:57,650 [myid:127.0.0.1:11345] - INFO [main-SendThread(127.0.0.1:11345):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:11345. 
Will not attempt to authenticate using SASL (unknown error) [junit] 2017-01-26 12:15:57,651 [myid:127.0.0.1:11345] - WARN [main-SendThread(127.0.0.1:11345):ClientCnxn$SendThread@1235] - Session 0x102057b70c9 for server 127.0.0.1/127.0.0.1:11345, unexpected error, closing socket connection and attempting reconnect [junit] java.net.ConnectException: Connection refused [junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) [junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357) [junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214) [junit] 2017-01-26 12:15:57,853 [myid:127.0.0.1:11222] - INFO [main-SendThread(127.0.0.1:11222):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:11222. Will not attempt to authenticate using SASL (unknown error) [junit] 2017-01-26 12:15:57,853 [myid:127.0.0.1:11222] - WARN [main-SendThread(127.0.0.1:11222):ClientCnxn$SendThread@1235] - Session 0x10205779eaf for server 127.0.0.1/127.0.0.1:11222, unexpected error, closing socket connection and attempting reconnect [junit] java.net.ConnectException: Connection refused [junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) [junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357) [junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214) [junit] 2017-01-26 12:15:58,077 [myid:] - INFO [ProcessThread(sid:0 cport:11468)::PrepRequestProcessor@656] - Processed session termination for sessionid: 0x102057ecd12 [junit] 2017-01-26 12:15:58,090 [myid:] - INFO [main:ZooKeeper@1322] - Session: 0x102057ecd12 closed [junit] 2017-01-26 12:15:58,090 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for session: 0x102057ecd12 [junit] 
2017-01-26 12:15:58,090 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 214901 [junit] 2017-01-26 12:15:58,091 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 2427 [junit] 2017-01-26 12:15:58,091 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD testWatcherAutoResetWithLocal [junit] 2017-01-26 12:15:58,091 [myid:] - INFO [main:ClientBase@543] - tearDown starting [junit] 2017-01-26 12:15:58,091 [myid:] - INFO [main:ClientBase@513] - STOPPING server [junit] 2017-01-26 12:15:58,091 [myid:] - INFO [main:NettyServerCnxnFactory@464] - shutdown called 0.0.0.0/0.0.0.0:11468 [junit] 2017-01-26 12:15:58,090 [myid:] - INFO [SyncThread:0:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port11468,name1=Connections,name2=127.0.0.1,name3=0x102057ecd12] [junit] 2017-01-26 12:15:58,100 [myid:] - INFO [main:ZooKeeperServer@533] - shutting down [junit] 2017-01-26 12:15:58,101 [myid:] - ERROR [main:ZooKeeperServer@505] - ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes [junit] 2017-01-26
Re: ZooKeeper 3.4.10 release discussion
Hi, Rakesh and Flavio, what do you think about merging ZOOKEEPER-2622 to branch-3.4 and include it in 3.4.10 besides branch-3.5 and master? Edward On Thu, Jan 26, 2017 at 8:20 AM, Flavio Junqueirawrote: > Here are a few comments on the proposal of changes to the release process: > > - It might be a better idea to preserve the HowToRelease document for > future reference, clone the document, and change the cloned document to > reflect the git commands rather than svn. > - We still need to modify Step 2 to be git oriented, otherwise it will > look odd that we have svn there. > - In Step 4, I thought that we had informally agreed to rely on the git > log rather than maintain the CHANGES.txt file. If we aren't all onboard > with the idea of stopping to use CHANGES.txt, then we need to discuss this > separately. > - Steps 5 and 6: I'm not sure why the steps to produce the release notes > changes. We still resolve issues on jira which is pretty much the source of > data for the release notes. > - Step 10: I personally don't like using "git commit -a" unless you're > pretty sure that it is what you want. A much safer approach is to run "git > status" and "git add" to the individual files/directories. > - Step 11: Why are we tagging with -s? Is that standard practice in other > projects? > > -Flavio > > > On 26 Jan 2017, at 03:30, Rakesh Radhakrishnan > wrote: > > > > Agreed, will try to resolve ZK-2184. I have included this to 3.4.10 > > releasing. I could see few open review comments in the PR, probably will > > push once this is concluded. > > > > Thanks, > > Rakesh > > > > On Thu, Jan 26, 2017 at 2:01 AM, Flavio Junqueira > wrote: > > > >> I'd like to have ZK-2184 in as well. I have seen many cases in which > >> applications are affected by that problem. If folks can help me push it > >> through, I'd appreciate. 
> >> > >> -Flavio > >> > >>> On 25 Jan 2017, at 17:01, Rakesh Radhakrishnan > >> wrote: > >>> > >>> I've reviewed ZOOKEEPER-2044 pull request and added few comments. I > hope > >>> this will be committed soon. > >>> > >>> I'm planning to keep the CHANGE.txt file for this release. But, not > >>> updating the commit history considering that git revision can be used > as > >> a > >>> reference. Please see my comment https://goo.gl/wu5V2M in > ZOOKEEPER-2672 > >>> jira. > >>> > >>> Sometime back, I've filtered the issues which was marked for 3.4.10 and > >>> moved out these to 3.4.11 release. > >>> > >>> Thanks, > >>> Rakesh > >>> > >>> On Wed, Jan 25, 2017 at 5:41 AM, Michael Han > wrote: > >>> > Hi Rakesh, > > Thanks for driving 3.4.10 release. > > I've been looking at https://issues.apache.org/ > >> jira/browse/ZOOKEEPER-2044 > today I think this could be a good addition to 3.4.10 release - what > do > >> you > think? Should we get this in 3.4.10? > > > On Tue, Jan 24, 2017 at 9:13 AM, Rakesh Radhakrishnan < > >> rake...@apache.org> > wrote: > > > Hi folks, > > > > ZOOKEEPER-2573 fix is agreed and will be resolved soon. After > >> committing > > this jira, I'm planning to start cutting a release candidate based on > >> my > > proposed "HowToRelease" ZK cwiki changes. > > > > Appreciate feedback on proposed ZK cwiki https://cwiki.apache.org/ > > confluence/display/ZOOKEEPER/HowToRelease changes. Please refer my > > previous > > mail to understand more about it. > > > > Thanks, > > Rakesh > > > > On Tue, Jan 17, 2017 at 12:11 PM, Rakesh Radhakrishnan < > rake...@apache.org > >> > > wrote: > > > >> OK. I have modified ZK cwiki page https://cwiki.apache.org/ > >> confluence/display/ZOOKEEPER/HowToRelease directly. Please review > the > > newly > >> added lines in orange color to understand the changes. 
The following > >> sections has been modified: > >> > >> - *Updating the release branch -> modified steps **1, 4, 10, 11* > >> - *Building -> modified step 9* > >> - *Publishing -> modified step 1* > >> > >> Thanks, > >> Rakesh > >> > >> On Tue, Jan 17, 2017 at 11:36 AM, Patrick Hunt > wrote: > >> > >>> Perhaps you can make the changes directly on the wiki page as a > > duplicate > >>> line item under the original in a different color? It's hard for me > >> to > >>> really follow, esp as it's not a 1:1 replacement iiuc. Could you > try > >>> editing the wiki directly to start with, leave the original line > and > add > >>> the new line(s) but in another color or some other indication? > >>> > >>> Thanks Rakesh. > >>> > >>> Patrick > >>> > >>> On Mon, Jan 16, 2017 at 8:48 AM, Rakesh Radhakrishnan < > > rake...@apache.org > > >>> wrote: > >>> > Hi folks, > > As we all know, 3.4.10 release
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839573#comment-15839573 ] Henrik Nordvik commented on ZOOKEEPER-1416: --- When I read ZOOKEEPER-153 it looks to me that the arguments against persistent watches are that 1. it's not suitable for clients that require getting every change (we use it as a cache, so we don't require every change), and 2. it doesn't provide a performance benefit when watching a single node, since you need to get the data anyway and you can set the watch again at the same time. However, this changes when you watch a tree of nodes. With a persistent recursive watch you don't need one watch per child znode, which reduces the amount of bookkeeping that both the client and the server have to do. > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server >Reporter: Phillip Liu >Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, when a client (re)connects, it would have to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently a naming service that consumes the db shard > definition issues thousands of watch requests each time the service starts > and changes client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch-fire. Recursive means the > Watch applies to the node and descendant nodes. 
A Persistent Recursive Watch > behaves as follows: > # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means setting watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called; then the Recursive Watch > automatically applies the watch on the znode. This maintains the existing Watch > semantic on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is we will automatically re-add the watch after read. At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > ram distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths to not send Watches regardless of Watch > setting can be set each time a watch event from a Recursive Watch is fired. 
> The memory utilization is relative to the number of outstanding reads, and in the > worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
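The 3.75TB figure quoted in the proposal is easy to check from its stated parameters: up to 3 watch bits per client per znode, 100 million znodes, and 100 thousand clients. The helper below is purely illustrative arithmetic, not ZooKeeper code:

```java
// Back-of-the-envelope check of the watch-memory estimate from the proposal:
// bits = znodes * clients * bitsPerPair, converted to bytes.
class WatchMemoryEstimate {
    static double estimateBytes(long znodes, long clients, int bitsPerPair) {
        return znodes * (double) clients * bitsPerPair / 8.0;
    }

    public static void main(String[] args) {
        double bytes = estimateBytes(100_000_000L, 100_000L, 3);
        // 1e8 znodes * 1e5 clients * 3 bits = 3e13 bits = 3.75e12 bytes,
        // i.e. 3.75 TB (decimal), matching the figure in the thread.
        System.out.println(bytes / 1e12 + " TB");
    }
}
```

The worst case for the blacklist alternative, 1/3 of that total, corresponds to tracking only one of the three watch types per client/znode pair, i.e. 1.25 TB with the same parameters.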
[jira] [Commented] (ZOOKEEPER-2672) Remove CHANGE.txt
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839537#comment-15839537 ] Flavio Junqueira commented on ZOOKEEPER-2672: - I'm not aware of any dependency on CHANGES.txt, so I'm +1 for removing it. According to the project bylaws, I'd say that this change corresponds to a change to the code base, and as such, the vote is by lazy approval, switching to lazy majority in the case of at least one -1, where the binding votes are from active committers. > Remove CHANGE.txt > - > > Key: ZOOKEEPER-2672 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2672 > Project: ZooKeeper > Issue Type: Improvement > Components: build >Affects Versions: 3.4.9, 3.5.2 >Reporter: Michael Han >Assignee: Michael Han > > CHANGE.txt is already not the source of truth for what has changed after we > migrated to git - most of the git commits in the last couple of months don't > update CHANGE.txt. Updating CHANGE.txt automatically during the commit flow > is nontrivial, and doing it manually is cumbersome and error-prone. > The consensus is that we will rely on source control revision logs instead of > CHANGE.txt moving forward; see > https://www.mail-archive.com/dev@zookeeper.apache.org/msg37108.html for more > details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: ZooKeeper 3.4.10 release discussion
Here are a few comments on the proposal of changes to the release process: - It might be a better idea to preserve the HowToRelease document for future reference, clone the document, and change the cloned document to reflect the git commands rather than svn. - We still need to modify Step 2 to be git oriented, otherwise it will look odd that we have svn there. - In Step 4, I thought that we had informally agreed to rely on the git log rather than maintain the CHANGES.txt file. If we aren't all onboard with the idea of stopping to use CHANGES.txt, then we need to discuss this separately. - Steps 5 and 6: I'm not sure why the steps to produce the release notes changes. We still resolve issues on jira which is pretty much the source of data for the release notes. - Step 10: I personally don't like using "git commit -a" unless you're pretty sure that it is what you want. A much safer approach is to run "git status" and "git add" to the individual files/directories. - Step 11: Why are we tagging with -s? Is that standard practice in other projects? -Flavio > On 26 Jan 2017, at 03:30, Rakesh Radhakrishnanwrote: > > Agreed, will try to resolve ZK-2184. I have included this to 3.4.10 > releasing. I could see few open review comments in the PR, probably will > push once this is concluded. > > Thanks, > Rakesh > > On Thu, Jan 26, 2017 at 2:01 AM, Flavio Junqueira wrote: > >> I'd like to have ZK-2184 in as well. I have seen many cases in which >> applications are affected by that problem. If folks can help me push it >> through, I'd appreciate. >> >> -Flavio >> >>> On 25 Jan 2017, at 17:01, Rakesh Radhakrishnan >> wrote: >>> >>> I've reviewed ZOOKEEPER-2044 pull request and added few comments. I hope >>> this will be committed soon. >>> >>> I'm planning to keep the CHANGE.txt file for this release. But, not >>> updating the commit history considering that git revision can be used as >> a >>> reference. Please see my comment https://goo.gl/wu5V2M in ZOOKEEPER-2672 >>> jira. 
>>> >>> Sometime back, I've filtered the issues which was marked for 3.4.10 and >>> moved out these to 3.4.11 release. >>> >>> Thanks, >>> Rakesh >>> >>> On Wed, Jan 25, 2017 at 5:41 AM, Michael Han wrote: >>> Hi Rakesh, Thanks for driving 3.4.10 release. I've been looking at https://issues.apache.org/ >> jira/browse/ZOOKEEPER-2044 today I think this could be a good addition to 3.4.10 release - what do >> you think? Should we get this in 3.4.10? On Tue, Jan 24, 2017 at 9:13 AM, Rakesh Radhakrishnan < >> rake...@apache.org> wrote: > Hi folks, > > ZOOKEEPER-2573 fix is agreed and will be resolved soon. After >> committing > this jira, I'm planning to start cutting a release candidate based on >> my > proposed "HowToRelease" ZK cwiki changes. > > Appreciate feedback on proposed ZK cwiki https://cwiki.apache.org/ > confluence/display/ZOOKEEPER/HowToRelease changes. Please refer my > previous > mail to understand more about it. > > Thanks, > Rakesh > > On Tue, Jan 17, 2017 at 12:11 PM, Rakesh Radhakrishnan < rake...@apache.org >> > wrote: > >> OK. I have modified ZK cwiki page https://cwiki.apache.org/ >> confluence/display/ZOOKEEPER/HowToRelease directly. Please review the > newly >> added lines in orange color to understand the changes. The following >> sections has been modified: >> >> - *Updating the release branch -> modified steps **1, 4, 10, 11* >> - *Building -> modified step 9* >> - *Publishing -> modified step 1* >> >> Thanks, >> Rakesh >> >> On Tue, Jan 17, 2017 at 11:36 AM, Patrick Hunt wrote: >> >>> Perhaps you can make the changes directly on the wiki page as a > duplicate >>> line item under the original in a different color? It's hard for me >> to >>> really follow, esp as it's not a 1:1 replacement iiuc. Could you try >>> editing the wiki directly to start with, leave the original line and add >>> the new line(s) but in another color or some other indication? >>> >>> Thanks Rakesh. 
>>> >>> Patrick >>> >>> On Mon, Jan 16, 2017 at 8:48 AM, Rakesh Radhakrishnan < > rake...@apache.org >>> wrote: >>> Hi folks, As we all know, 3.4.10 release is the first ZooKeeper release after > the github repository migration. I have tried an attempt to modify the > steps described in the ' https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToRelease' >>> page to make the release. Since this release is from an already created > branch, >>> I have focused only the branch related parts in cwiki and below sections >>> in
[jira] [Commented] (ZOOKEEPER-2395) allow ant command line control of junit test jvm args
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839502#comment-15839502 ] Edward Ribeiro commented on ZOOKEEPER-2395: --- oops, *excuse me* [~hanm] for skipping this comment on ZK-2664, shame on me. :( Agree with you. > allow ant command line control of junit test jvm args > - > > Key: ZOOKEEPER-2395 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2395 > Project: ZooKeeper > Issue Type: Improvement > Components: build, tests >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.5.3, 3.6.0, 3.4.11 > > > We're seeing some failing jobs (see below) and the speculation is that it > might be due to ipv6 vs ipv4 usage. It would be nice to turn on "prefer ipv4" > in the jvm but there is no easy way to do that. I'll propose that we add a > variable to ant that's passed through to the jvm. > > This is very odd. It failed 2 of the last three times it was run on H9 > with the following: > 2016-03-20 06:06:18,480 [myid:] - INFO > [main:JUnit4ZKTestRunner$LoggedInvokeMethod@74] - TEST METHOD FAILED > testBindByAddress > java.net.SocketException: No such device > at java.net.NetworkInterface.isLoopback0(Native Method) > at java.net.NetworkInterface.isLoopback(NetworkInterface.java:339) > at > org.apache.zookeeper.test.ClientPortBindTest.testBindByAddress(ClientPortBindTest.java:61) > https://builds.apache.org/job/ZooKeeper_branch34/buildTimeTrend > Why would it pass one of the times though if there is no loopback > device on the host? That seems very odd! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
ZooKeeper-trunk-solaris - Build # 1480 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/1480/

###################################################################
## LAST 60 LINES OF THE CONSOLE
###################################################################
[...truncated 467006 lines...]
    [junit] 2017-01-26 08:23:11,378 [myid:] - INFO [main:ClientBase@401] - CREATING server instance 127.0.0.1:11222
    [junit] 2017-01-26 08:23:11,378 [myid:] - INFO [main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 kB direct buffers.
    [junit] 2017-01-26 08:23:11,379 [myid:] - INFO [main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222
    [junit] 2017-01-26 08:23:11,380 [myid:] - INFO [main:ClientBase@376] - STARTING server instance 127.0.0.1:11222
    [junit] 2017-01-26 08:23:11,380 [myid:] - INFO [main:ZooKeeperServer@894] - minSessionTimeout set to 6000
    [junit] 2017-01-26 08:23:11,380 [myid:] - INFO [main:ZooKeeperServer@903] - maxSessionTimeout set to 6
    [junit] 2017-01-26 08:23:11,380 [myid:] - INFO [main:ZooKeeperServer@160] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/build/test/tmp/test2622701376991085976.junit.dir/version-2 snapdir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/build/test/tmp/test2622701376991085976.junit.dir/version-2
    [junit] 2017-01-26 08:23:11,381 [myid:] - INFO [main:FileSnap@83] - Reading snapshot /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/build/test/tmp/test2622701376991085976.junit.dir/version-2/snapshot.b
    [junit] 2017-01-26 08:23:11,383 [myid:] - INFO [main:FileTxnSnapLog@346] - Snapshotting: 0xb to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/build/test/tmp/test2622701376991085976.junit.dir/version-2/snapshot.b
    [junit] 2017-01-26 08:23:11,384 [myid:] - ERROR [main:ZooKeeperServer@506] - ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes
    [junit] 2017-01-26 08:23:11,384 [myid:] - INFO [main:FourLetterWordMain@85] - connecting to 127.0.0.1 11222
    [junit] 2017-01-26 08:23:11,385 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /127.0.0.1:58510
    [junit] 2017-01-26 08:23:11,386 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from /127.0.0.1:58510
    [junit] 2017-01-26 08:23:11,386 [myid:] - INFO [NIOWorkerThread-1:StatCommand@49] - Stat command output
    [junit] 2017-01-26 08:23:11,386 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@614] - Closed socket connection for client /127.0.0.1:58510 (no session established for client)
    [junit] 2017-01-26 08:23:11,386 [myid:] - INFO [main:JMXEnv@228] - ensureParent:[InMemoryDataTree, StandaloneServer_port]
    [junit] 2017-01-26 08:23:11,387 [myid:] - INFO [main:JMXEnv@245] - expect:InMemoryDataTree
    [junit] 2017-01-26 08:23:11,388 [myid:] - INFO [main:JMXEnv@249] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
    [junit] 2017-01-26 08:23:11,388 [myid:] - INFO [main:JMXEnv@245] - expect:StandaloneServer_port
    [junit] 2017-01-26 08:23:11,388 [myid:] - INFO [main:JMXEnv@249] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port11222
    [junit] 2017-01-26 08:23:11,388 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 17907
    [junit] 2017-01-26 08:23:11,388 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24
    [junit] 2017-01-26 08:23:11,389 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD testQuota
    [junit] 2017-01-26 08:23:11,389 [myid:] - INFO [main:ClientBase@558] - tearDown starting
    [junit] 2017-01-26 08:23:11,462 [myid:] - INFO [main:ZooKeeper@1324] - Session: 0x12656d84b56 closed
    [junit] 2017-01-26 08:23:11,462 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for session: 0x12656d84b56
    [junit] 2017-01-26 08:23:11,462 [myid:] - INFO [main:ClientBase@528] - STOPPING server
    [junit] 2017-01-26 08:23:11,463 [myid:] - INFO [ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - ConnnectionExpirerThread interrupted
    [junit] 2017-01-26 08:23:11,463 [myid:] - INFO [NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] - selector thread exitted run method
    [junit] 2017-01-26 08:23:11,463 [myid:] - INFO [NIOServerCxnFactory.SelectorThread-1:NIOServerCnxnFactory$SelectorThread@420] -