ZooKeeper-trunk-openjdk7 - Build # 909 - Failure
See https://builds.apache.org/job/ZooKeeper-trunk-openjdk7/909/

### LAST 60 LINES OF THE CONSOLE ###
[...truncated 390469 lines...]
    [junit] 2015-08-25 10:21:48,280 [myid:] - INFO [main:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port11222]
    [junit] 2015-08-25 10:21:48,280 [myid:] - INFO [main:FourLetterWordMain@84] - connecting to 127.0.0.1 11222
    [junit] 2015-08-25 10:21:48,281 [myid:] - INFO [main:JMXEnv@142] - ensureOnly:[]
    [junit] 2015-08-25 10:21:48,283 [myid:] - INFO [main:ClientBase@460] - STARTING server
    [junit] 2015-08-25 10:21:48,283 [myid:] - INFO [main:ClientBase@380] - CREATING server instance 127.0.0.1:11222
    [junit] 2015-08-25 10:21:48,283 [myid:] - INFO [main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s sessionless connection timeout, 2 selector thread(s), 32 worker threads, and 64 kB direct buffers.
    [junit] 2015-08-25 10:21:48,284 [myid:] - INFO [main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222
    [junit] 2015-08-25 10:21:48,284 [myid:] - INFO [main:ClientBase@355] - STARTING server instance 127.0.0.1:11222
    [junit] 2015-08-25 10:21:48,285 [myid:] - INFO [main:ZooKeeperServer@858] - minSessionTimeout set to 6000
    [junit] 2015-08-25 10:21:48,285 [myid:] - INFO [main:ZooKeeperServer@867] - maxSessionTimeout set to 6
    [junit] 2015-08-25 10:21:48,285 [myid:] - INFO [main:ZooKeeperServer@156] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/trunk/build/test/tmp/test2062310060461922516.junit.dir/version-2 snapdir /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/trunk/build/test/tmp/test2062310060461922516.junit.dir/version-2
    [junit] 2015-08-25 10:21:48,287 [myid:] - INFO [main:FileSnap@83] - Reading snapshot /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/trunk/build/test/tmp/test2062310060461922516.junit.dir/version-2/snapshot.b
    [junit] 2015-08-25 10:21:48,290 [myid:] - INFO [main:FileTxnSnapLog@298] - Snapshotting: 0xb to /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/trunk/build/test/tmp/test2062310060461922516.junit.dir/version-2/snapshot.b
    [junit] 2015-08-25 10:21:48,292 [myid:] - INFO [main:FourLetterWordMain@84] - connecting to 127.0.0.1 11222
    [junit] 2015-08-25 10:21:48,293 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /127.0.0.1:52710
    [junit] 2015-08-25 10:21:48,295 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from /127.0.0.1:52710
    [junit] 2015-08-25 10:21:48,295 [myid:] - INFO [NIOWorkerThread-1:StatCommand@49] - Stat command output
    [junit] 2015-08-25 10:21:48,296 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@606] - Closed socket connection for client /127.0.0.1:52710 (no session established for client)
    [junit] 2015-08-25 10:21:48,296 [myid:] - INFO [main:JMXEnv@224] - ensureParent:[InMemoryDataTree, StandaloneServer_port]
    [junit] 2015-08-25 10:21:48,298 [myid:] - INFO [main:JMXEnv@241] - expect:InMemoryDataTree
    [junit] 2015-08-25 10:21:48,298 [myid:] - INFO [main:JMXEnv@245] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
    [junit] 2015-08-25 10:21:48,299 [myid:] - INFO [main:JMXEnv@241] - expect:StandaloneServer_port
    [junit] 2015-08-25 10:21:48,299 [myid:] - INFO [main:JMXEnv@245] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port11222
    [junit] 2015-08-25 10:21:48,299 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 85811
    [junit] 2015-08-25 10:21:48,299 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24
    [junit] 2015-08-25 10:21:48,300 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD testQuota
    [junit] 2015-08-25 10:21:48,300 [myid:] - INFO [main:ClientBase@537] - tearDown starting
    [junit] 2015-08-25 10:21:48,355 [myid:] - INFO [main:ZooKeeper@1110] - Session: 0x101c7f835f9 closed
    [junit] 2015-08-25 10:21:48,355 [myid:] - INFO [main:ClientBase@507] - STOPPING server
    [junit] 2015-08-25 10:21:48,356 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@542] - EventThread shut down for session: 0x101c7f835f9
    [junit] 2015-08-25 10:21:48,356 [myid:] - INFO [ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - ConnnectionExpirerThread interrupted
    [junit] 2015-08-25 10:21:48,356 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$Acce
[jira] [Commented] (ZOOKEEPER-2240) Make the three-node minimum more explicit in documentation and on website
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711426#comment-14711426 ]

Edward Ribeiro commented on ZOOKEEPER-2240:
-------------------------------------------

LGTM. Definitely a +1 :) Thanks [~rgs] and [~elyograg]!

> Make the three-node minimum more explicit in documentation and on website
> --------------------------------------------------------------------------
>
> Key: ZOOKEEPER-2240
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2240
> Project: ZooKeeper
> Issue Type: Improvement
> Components: documentation
> Reporter: Shawn Heisey
> Assignee: Shawn Heisey
> Priority: Trivial
> Fix For: 3.4.7, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2240.patch, ZOOKEEPER-2240.patch
>
>
> One of the most important parts of a production zookeeper deployment is the
> three-node minimum requirement for fault tolerance ... but when I glance at
> the website and the documentation, this requirement is difficult to actually
> find.
> It is buried deep in the admin documentation, in a sentence that says "Thus,
> a deployment that consists of three machines can handle one failure, and a
> deployment of five machines can handle two failures." Other parts of the
> documentation hint at it, but nothing that I've seen comes out and explicitly
> says it.
> Ideally, documentation about this requirement would be in a location where it
> can be easily pinpointed with a targeted URL, so I can point to ZK
> documentation with a link and clearly tell SolrCloud users that this is a
> real requirement.
> If someone can point me to version control locations where I can check out or
> clone the docs and the website, I'm happy to attempt a patch.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
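The arithmetic behind the sentence quoted from the admin documentation can be sketched in a few lines. This is an illustration only, not ZooKeeper code: the class `QuorumMath` and its method names are hypothetical, and the sketch assumes the usual majority-quorum rule, under which an ensemble of n servers needs n/2 + 1 of them (integer division) to stay available.

```java
/** Hypothetical helper, not part of ZooKeeper: majority-quorum arithmetic. */
public final class QuorumMath {
    private QuorumMath() {}

    /** Smallest number of servers that forms a strict majority of the ensemble. */
    public static int quorumSize(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble must have at least one server");
        }
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that may fail while a majority can still be formed. */
    public static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        // Print the table for small ensembles.
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " server(s): quorum " + quorumSize(n)
                    + ", tolerated failures " + toleratedFailures(n));
        }
    }
}
```

Under this rule two servers tolerate no more failures than one, which is why three is the smallest ensemble that survives a single failure, and five the smallest that survives two.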
[jira] [Updated] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arshad Mohammad updated ZOOKEEPER-2247:
---------------------------------------
    Attachment: ZOOKEEPER-2247-05.patch

> Zookeeper service becomes unavailable when leader fails to write transaction log
> ---------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.5.0
> Reporter: Arshad Mohammad
> Assignee: Arshad Mohammad
> Priority: Critical
> Fix For: 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction log. Below are the exceptions:
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, from thread : SyncThread:100
> java.io.IOException: Input/output error
>     at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>     at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>     at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>     at org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>     at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>     at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>     at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO [ProcessThread(sid:100 cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception the leader server still remains leader. After this non-recoverable exception the leader should go down and let other followers become leader.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
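The behaviour requested at the end of the description (a leader that hits an unrecoverable transaction log error should give up leadership) can be illustrated with a toy model. This is a sketch under stated assumptions, not the approach taken in the attached patches and not ZooKeeper code: `CriticalFailureSketch`, `TxnLog`, `Server`, `State` and `shouldStepDown` are all hypothetical names invented for the example.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

/** Toy model, not ZooKeeper code: every name in this class is hypothetical. */
public final class CriticalFailureSketch {
    private CriticalFailureSketch() {}

    public enum State { RUNNING, ERROR }

    /** Stand-in for a transaction log whose commit may fail with an I/O error. */
    public interface TxnLog {
        void commit() throws IOException;
    }

    /** Stand-in for a server that records a failed flush in a visible state. */
    public static final class Server {
        private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
        private final TxnLog log;

        public Server(TxnLog log) {
            this.log = log;
        }

        /** One flush attempt; a failure is recorded as ERROR instead of being lost. */
        public void flush() {
            try {
                log.commit();
            } catch (IOException e) {
                state.set(State.ERROR);
            }
        }

        public State state() {
            return state.get();
        }
    }

    /** Supervisor decision: a leader whose server is in ERROR should step down. */
    public static boolean shouldStepDown(Server leader) {
        return leader.state() == State.ERROR;
    }

    public static void main(String[] args) {
        Server failing = new Server(() -> {
            throw new IOException("Input/output error");
        });
        failing.flush();
        System.out.println("step down: " + shouldStepDown(failing));
    }
}
```

The point of the sketch is only that the failure has to be made visible to whatever component owns the leader role; how the real server propagates it is defined by the patch, not by this example.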
[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711463#comment-14711463 ]

Arshad Mohammad commented on ZOOKEEPER-2247:
--------------------------------------------

Thanks [~rakeshr] for your suggestion. I was using private accessors to get {{FileTxnSnapLog}}, but I can get it from {{ZooKeeperServer}} as well. So neither a private accessor nor a mock is actually required. Please find the fix in ZOOKEEPER-2247-05.patch.

> Zookeeper service becomes unavailable when leader fails to write transaction log
> ---------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.5.0
> Reporter: Arshad Mohammad
> Assignee: Arshad Mohammad
> Priority: Critical
> Fix For: 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction log.
[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711483#comment-14711483 ]

Arshad Mohammad commented on ZOOKEEPER-2247:
--------------------------------------------

# fixed
# fixed
# fixed, {{ConfigBaseSystemTest}} is not necessary for this patch. Since this patch is not testing any configuration-related issue, I used {{BaseSysTest}} instead of {{QuorumPeerTestBase}}
# fixed
# fixed

> Zookeeper service becomes unavailable when leader fails to write transaction log
> ---------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.5.0
> Reporter: Arshad Mohammad
> Assignee: Arshad Mohammad
> Priority: Critical
> Fix For: 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction log. Below are the exceptions:
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, from thread : SyncThread:100
> java.io.IOException: Input/output error
>     at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>     at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>     at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>     at org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>     at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>     at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>     at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO [ProcessThread(sid:100 cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception the leader server still remains leader. After this non-recoverable exception the leader should go down and let other followers become leader.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Success: ZOOKEEPER-2247 PreCommit Build #2841
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2841/

### LAST 60 LINES OF THE CONSOLE ###
[...truncated 399101 lines...]
     [exec] +1 overall. Here are the results of testing the latest attachment
     [exec]   http://issues.apache.org/jira/secure/attachment/12752256/ZOOKEEPER-2247-05.patch
     [exec]   against trunk revision 1697551.
     [exec]
     [exec]     +1 @author. The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included. The patch appears to include 20 new or modified tests.
     [exec]
     [exec]     +1 javadoc. The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac. The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
     [exec]
     [exec]     +1 release audit. The applied patch does not increase the total number of release audit warnings.
     [exec]
     [exec]     +1 core tests. The patch passed core unit tests.
     [exec]
     [exec]     +1 contrib tests. The patch passed contrib unit tests.
     [exec]
     [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2841//testReport/
     [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2841//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
     [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2841//console
     [exec]
     [exec] This message is automatically generated.
     [exec]
     [exec]
     [exec] ==
     [exec] ==
     [exec]     Adding comment to Jira.
     [exec] ==
     [exec] ==
     [exec]
     [exec]
     [exec] Comment added.
     [exec] 4e65c1e7fdba4874a64235eb8204cc62beb59658 logged out
     [exec]
     [exec]
     [exec] ==
     [exec] ==
     [exec]     Finished build.
     [exec] ==
     [exec] ==
     [exec]
     [exec]

BUILD SUCCESSFUL
Total time: 13 minutes 26 seconds
Archiving artifacts
Sending artifact delta relative to PreCommit-ZOOKEEPER-Build #2835
Archived 24 artifacts
Archive block size is 32768
Received 4 blocks and 33865694 bytes
Compression is 0.4%
Took 7 sec
Recording test results
Description set: ZOOKEEPER-2247
Email was triggered for: Success
Sending email for trigger: Success

### FAILED TESTS (if any) ###
All tests passed
[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711495#comment-14711495 ]

Hadoop QA commented on ZOOKEEPER-2247:
--------------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12752256/ZOOKEEPER-2247-05.patch
  against trunk revision 1697551.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 20 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2841//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2841//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2841//console

This message is automatically generated.

> Zookeeper service becomes unavailable when leader fails to write transaction log
> ---------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.5.0
> Reporter: Arshad Mohammad
> Assignee: Arshad Mohammad
> Priority: Critical
> Fix For: 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction log.
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711955#comment-14711955 ]

Robert P. Thille commented on ZOOKEEPER-1506:
---------------------------------------------

I've got a patch (against release-3.4.6) which we're using in-house which includes fixes to the tests. Not sure how applicable it'd be to the 3.4 branch (we wanted minimal changes to the stable release). I had to add one more call to s.recreateSocketAddresses() in Learner.java to get it to function properly with my (not-included, too dependent on our test environment) integration tests.
I'm sending a request to Legal to get the release approval (likely a rubber-stamp).

> Re-try DNS hostname -> IP resolution if node connection fails
> --------------------------------------------------------------
>
> Key: ZOOKEEPER-1506
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1506
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.5
> Environment: Ubuntu 11.04 64-bit
> Reporter: Mike Heffner
> Assignee: Raul Gutierrez Segales
> Priority: Critical
> Labels: patch
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-1506-fix.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, zk-dns-caching-refresh.patch
>
>
> In our zoo.cfg we use hostnames to identify the ZK servers that are part
> of an ensemble. These hostnames are configured with a low (<= 60s) TTL and
> the IP address they map to can and does change. Our procedure for
> replacing/upgrading a ZK node is to boot an entirely new instance and remap
> the hostname to the new instance's IP address. Our expectation is that when
> the original ZK node is terminated/shutdown, the remaining nodes in the
> ensemble would reconnect to the new instance.
> However, what we are noticing is that the remaining ZK nodes do not attempt
> to re-resolve the hostname->IP mapping for the new server. Once the original
> ZK node is terminated, the existing servers continue to attempt contacting it
> at the old IP address. It would be great if the ZK servers could try to
> re-resolve the hostname when attempting to connect to a lost ZK server,
> instead of caching the lookup indefinitely. Currently we must do a rolling
> restart of the ZK ensemble after swapping a node -- which at three nodes
> means we periodically lose quorum.
> The exact method we are following is to boot new instances in EC2 and attach
> one of a set of three Elastic IP addresses. External to EC2 this IP address
> remains the same and maps to whatever instance it is attached to. Internal to
> EC2, the elastic IP hostname has a TTL of about 45-60 seconds and is remapped
> to the internal (10.x.y.z) address of the instance it is attached to.
> Therefore, in our case we would like ZK to pick up the new 10.x.y.z address
> that the elastic IP hostname gets mapped to and reconnect appropriately.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
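The re-resolution requested in the description can be sketched with plain JDK classes. This is a minimal illustration, not the content of any attached patch: `ReResolveSketch` and `reResolve` are hypothetical names, and nothing here claims to show what `recreateSocketAddresses()` does internally. The only JDK facts relied on are that `InetSocketAddress.getHostString()` returns the configured host without a reverse lookup and that the `InetSocketAddress(String, int)` constructor attempts a fresh resolution.

```java
import java.net.InetSocketAddress;

/** Hypothetical helper, not ZooKeeper code: rebuild an address so the host is looked up again. */
public final class ReResolveSketch {
    private ReResolveSketch() {}

    /**
     * Returns a new address for the same host and port.
     * An InetSocketAddress keeps the IP it resolved at construction time,
     * so a new instance is needed to pick up a changed hostname -> IP mapping.
     */
    public static InetSocketAddress reResolve(InetSocketAddress previous) {
        return new InetSocketAddress(previous.getHostString(), previous.getPort());
    }

    public static void main(String[] args) {
        // An IP literal is used here so that the example needs no DNS server.
        InetSocketAddress stale = InetSocketAddress.createUnresolved("127.0.0.1", 2888);
        InetSocketAddress fresh = reResolve(stale);
        System.out.println("stale unresolved: " + stale.isUnresolved());
        System.out.println("fresh unresolved: " + fresh.isUnresolved());
    }
}
```

Even with a helper like this, the JVM's own address cache (security property `networkaddress.cache.ttl`) can keep returning an old answer for a while, so a low DNS TTL alone may not be sufficient.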
[jira] [Commented] (ZOOKEEPER-2257) Make zookeeper server principal configurable at zookeeper client side
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712437#comment-14712437 ]

caixiaofeng commented on ZOOKEEPER-2257:
----------------------------------------

The same here. And if it is set to "zookeeper/hadoop.foo.com" directly, cross-realm setups will have problems.

> Make zookeeper server principal configurable at zookeeper client side
> ---------------------------------------------------------------------
>
> Key: ZOOKEEPER-2257
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2257
> Project: ZooKeeper
> Issue Type: Improvement
> Reporter: Arshad Mohammad
> Assignee: Arshad Mohammad
>
> Currently the Zookeeper client expects the zookeeper server's principal to be in the
> form of zookeeper.sasl.client.username/server-ip, for example
> zookeeper/192.162.1.100.
> But this may not always be the case; the server principal can be something like
> zookeeper/hadoop.foo.com.
> It would be better if we can make the server principal configurable.
> Current Code:
> {code}
> String principalUserName = System.getProperty(ZK_SASL_CLIENT_USERNAME, "zookeeper");
> zooKeeperSaslClient = new ZooKeeperSaslClient(principalUserName + "/" + addr.getHostString());
> {code}
> Proposed Code:
> {code}
> String serverPrincipal = System.getProperty("zookeeper.server.principal");
> if (null != serverPrincipal) {
>     zooKeeperSaslClient = new ZooKeeperSaslClient(serverPrincipal);
> } else {
>     String principalUserName = System.getProperty(ZK_SASL_CLIENT_USERNAME, "zookeeper");
>     zooKeeperSaslClient = new ZooKeeperSaslClient(principalUserName + "/" + addr.getHostString());
> }
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
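The selection logic in the proposed code can be isolated into a small, testable function. The sketch below is self-contained and does not touch `ZooKeeperSaslClient`: the class `ServerPrincipalSketch` and its methods are hypothetical, the property name `zookeeper.server.principal` is the one proposed in the issue, and `zookeeper.sasl.client.username` is assumed to be the value behind `ZK_SASL_CLIENT_USERNAME`, as the description suggests.

```java
/** Hypothetical helper, not ZooKeeper code: choose the server principal for SASL. */
public final class ServerPrincipalSketch {
    // Assumed property names, taken from the issue text.
    static final String ZK_SASL_CLIENT_USERNAME = "zookeeper.sasl.client.username";
    static final String ZK_SERVER_PRINCIPAL = "zookeeper.server.principal";

    private ServerPrincipalSketch() {}

    /**
     * An explicitly configured principal wins; otherwise fall back to
     * userName/host, with "zookeeper" as the default user name.
     */
    public static String resolve(String explicitPrincipal, String userName, String host) {
        if (explicitPrincipal != null) {
            return explicitPrincipal;
        }
        String user = (userName != null) ? userName : "zookeeper";
        return user + "/" + host;
    }

    /** Same decision, reading both inputs from system properties. */
    public static String resolveFromSystemProperties(String host) {
        return resolve(System.getProperty(ZK_SERVER_PRINCIPAL),
                System.getProperty(ZK_SASL_CLIENT_USERNAME), host);
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, null, "192.162.1.100"));
        System.out.println(resolve("zookeeper/hadoop.foo.com", null, "192.162.1.100"));
    }
}
```

Keeping the decision in a pure function makes the fallback order easy to unit test without setting up Kerberos.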
[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712438#comment-14712438 ]

caixiaofeng commented on ZOOKEEPER-2247:
----------------------------------------

mark

> Zookeeper service becomes unavailable when leader fails to write transaction log
> ---------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.5.0
> Reporter: Arshad Mohammad
> Assignee: Arshad Mohammad
> Priority: Critical
> Fix For: 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction log. Below are the exceptions:
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, from thread : SyncThread:100
> java.io.IOException: Input/output error
>     at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>     at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>     at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>     at org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>     at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>     at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>     at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO [ProcessThread(sid:100 cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception the leader server still remains leader. After this non-recoverable exception the leader should go down and let other followers become leader.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ZOOKEEPER-2224) Four letter command hangs when network is slow
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712440#comment-14712440 ]

caixiaofeng commented on ZOOKEEPER-2224:
----------------------------------------

mark

> Four letter command hangs when network is slow
> ----------------------------------------------
>
> Key: ZOOKEEPER-2224
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2224
> Project: ZooKeeper
> Issue Type: Bug
> Components: java client
> Reporter: Arshad Mohammad
> Assignee: Arshad Mohammad
> Priority: Minor
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2224-01.patch, ZOOKEEPER-2224-02.patch, ZOOKEEPER-2224-03.patch, ZOOKEEPER-2224-04.patch, ZOOKEEPER-2224_br_3_4-04.patch
>
>
> A four letter command hangs when the network is slow or goes down in the
> middle of the operation, and the application that calls the four letter
> command hangs as well.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
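A common way to keep such a call from hanging is to put a timeout on both the connect and the reads. The sketch below uses only standard `java.net.Socket` calls and is not the code from the attached patches: `FourLetterSketch` and `send` are hypothetical names, and the single `timeoutMs` parameter is an assumption made for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical client, not ZooKeeper code: send a four letter command with timeouts. */
public final class FourLetterSketch {
    private FourLetterSketch() {}

    /**
     * Sends cmd and returns everything the server writes before closing the connection.
     * Throws java.net.SocketTimeoutException (an IOException) if connecting, or any
     * single read, takes longer than timeoutMs.
     */
    public static String send(String host, int port, String cmd, int timeoutMs) throws IOException {
        try (Socket sock = new Socket()) {
            // Bounded connect instead of the default blocking connect.
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            // Bound every blocking read on this socket.
            sock.setSoTimeout(timeoutMs);

            OutputStream out = sock.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            InputStream in = sock.getInputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }
}
```

Note that `setSoTimeout` limits each individual read, not the whole exchange; a caller that needs a hard overall deadline would have to track elapsed time itself.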