ZooKeeper_branch34_solaris - Build # 87 - Still Failing
See https://builds.apache.org/job/ZooKeeper_branch34_solaris/87/

### LAST 60 LINES OF THE CONSOLE ###
[...truncated 131509 lines...]
    [junit] 2012-01-23 07:57:47,959 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port
    [junit] 2012-01-23 07:57:47,959 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port-1
    [junit] 2012-01-23 07:57:47,959 [myid:] - INFO [main:ClientBase@398] - STOPPING server
    [junit] 2012-01-23 07:57:47,960 [myid:] - INFO [main:ZooKeeperServer@420] - shutting down
    [junit] 2012-01-23 07:57:47,961 [myid:] - INFO [main:SessionTrackerImpl@220] - Shutting down
    [junit] 2012-01-23 07:57:47,962 [myid:] - INFO [main:PrepRequestProcessor@733] - Shutting down
    [junit] 2012-01-23 07:57:47,962 [myid:] - INFO [main:SyncRequestProcessor@173] - Shutting down
    [junit] 2012-01-23 07:57:47,962 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@135] - PrepRequestProcessor exited loop!
    [junit] 2012-01-23 07:57:47,963 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@155] - SyncRequestProcessor exited!
    [junit] 2012-01-23 07:57:47,963 [myid:] - INFO [main:FinalRequestProcessor@423] - shutdown of request processor complete
    [junit] 2012-01-23 07:57:47,965 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
    [junit] 2012-01-23 07:57:47,966 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[]
    [junit] 2012-01-23 07:57:47,967 [myid:] - INFO [main:ClientBase@391] - STARTING server
    [junit] 2012-01-23 07:57:47,967 [myid:] - INFO [main:ZooKeeperServer@168] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch34_solaris/trunk/build/test/tmp/test5058074880548269645.junit.dir/version-2 snapdir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch34_solaris/trunk/build/test/tmp/test5058074880548269645.junit.dir/version-2
    [junit] 2012-01-23 07:57:47,968 [myid:] - INFO [main:NIOServerCnxnFactory@110] - binding to port 0.0.0.0/0.0.0.0:11221
    [junit] 2012-01-23 07:57:47,970 [myid:] - INFO [main:FileSnap@83] - Reading snapshot /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch34_solaris/trunk/build/test/tmp/test5058074880548269645.junit.dir/version-2/snapshot.b
    [junit] 2012-01-23 07:57:47,972 [myid:] - INFO [main:FileTxnSnapLog@237] - Snapshotting: b
    [junit] 2012-01-23 07:57:47,974 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
    [junit] 2012-01-23 07:57:47,975 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@213] - Accepted socket connection from /127.0.0.1:33724
    [junit] 2012-01-23 07:57:47,975 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@820] - Processing stat command from /127.0.0.1:33724
    [junit] 2012-01-23 07:57:47,976 [myid:] - INFO [Thread-5:NIOServerCnxn$StatCommand@655] - Stat command output
    [junit] 2012-01-23 07:57:47,977 [myid:] - INFO [Thread-5:NIOServerCnxn@1000] - Closed socket connection for client /127.0.0.1:33724 (no session established for client)
    [junit] 2012-01-23 07:57:47,977 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[InMemoryDataTree, StandaloneServer_port]
    [junit] 2012-01-23 07:57:47,979 [myid:] - INFO [main:JMXEnv@105] - expect:InMemoryDataTree
    [junit] 2012-01-23 07:57:47,979 [myid:] - INFO [main:JMXEnv@108] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
    [junit] 2012-01-23 07:57:47,980 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port
    [junit] 2012-01-23 07:57:47,980 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port-1
    [junit] 2012-01-23 07:57:47,981 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota
    [junit] 2012-01-23 07:57:47,981 [myid:] - INFO [main:ClientBase@428] - tearDown starting
    [junit] 2012-01-23 07:57:48,003 [myid:] - INFO [SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited loop!
    [junit] 2012-01-23 07:57:48,003 [myid:] - INFO [SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited loop!
    [junit] 2012-01-23 07:57:48,044 [myid:] - INFO [main:ZooKeeper@679] - Session: 0x13509922609 closed
    [junit] 2012-01-23 07:57:48,044 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@511] - EventThread shut down
    [junit] 2012-01-23 07:57:48,044 [myid:] - INFO [main:ClientBase@398] - STOPPING server
    [junit] 2012-01-23 07:57:48,047 [myid:] - INFO [main:ZooKeeperServer@420] - shutting down
ZooKeeper-trunk-solaris - Build # 113 - Failure
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/113/

### LAST 60 LINES OF THE CONSOLE ###
[...truncated 144741 lines...]
    [junit] 2012-01-23 09:01:38,222 [myid:] - INFO [main:JMXEnv@105] - expect:InMemoryDataTree
    [junit] 2012-01-23 09:01:38,222 [myid:] - INFO [main:JMXEnv@108] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
    [junit] 2012-01-23 09:01:38,223 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port
    [junit] 2012-01-23 09:01:38,223 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port-1
    [junit] 2012-01-23 09:01:38,223 [myid:] - INFO [main:ClientBase@417] - STOPPING server
    [junit] 2012-01-23 09:01:38,226 [myid:] - INFO [main:ZooKeeperServer@391] - shutting down
    [junit] 2012-01-23 09:01:38,226 [myid:] - INFO [main:SessionTrackerImpl@220] - Shutting down
    [junit] 2012-01-23 09:01:38,226 [myid:] - INFO [main:PrepRequestProcessor@711] - Shutting down
    [junit] 2012-01-23 09:01:38,226 [myid:] - INFO [main:SyncRequestProcessor@173] - Shutting down
    [junit] 2012-01-23 09:01:38,226 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@134] - PrepRequestProcessor exited loop!
    [junit] 2012-01-23 09:01:38,226 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@155] - SyncRequestProcessor exited!
    [junit] 2012-01-23 09:01:38,226 [myid:] - INFO [main:FinalRequestProcessor@419] - shutdown of request processor complete
    [junit] 2012-01-23 09:01:38,227 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
    [junit] 2012-01-23 09:01:38,227 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[]
    [junit] 2012-01-23 09:01:38,228 [myid:] - INFO [main:ClientBase@410] - STARTING server
    [junit] 2012-01-23 09:01:38,229 [myid:] - INFO [main:ZooKeeperServer@143] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test3854562534136324196.junit.dir/version-2 snapdir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test3854562534136324196.junit.dir/version-2
    [junit] 2012-01-23 09:01:38,229 [myid:] - INFO [main:NIOServerCnxnFactory@110] - binding to port 0.0.0.0/0.0.0.0:11221
    [junit] 2012-01-23 09:01:38,230 [myid:] - INFO [main:FileSnap@83] - Reading snapshot /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test3854562534136324196.junit.dir/version-2/snapshot.b
    [junit] 2012-01-23 09:01:38,231 [myid:] - INFO [main:FileTxnSnapLog@237] - Snapshotting: b
    [junit] 2012-01-23 09:01:38,233 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
    [junit] 2012-01-23 09:01:38,233 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@213] - Accepted socket connection from /127.0.0.1:40759
    [junit] 2012-01-23 09:01:38,233 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@820] - Processing stat command from /127.0.0.1:40759
    [junit] 2012-01-23 09:01:38,233 [myid:] - INFO [Thread-5:NIOServerCnxn$StatCommand@655] - Stat command output
    [junit] 2012-01-23 09:01:38,234 [myid:] - INFO [Thread-5:NIOServerCnxn@1000] - Closed socket connection for client /127.0.0.1:40759 (no session established for client)
    [junit] 2012-01-23 09:01:38,234 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[InMemoryDataTree, StandaloneServer_port]
    [junit] 2012-01-23 09:01:38,235 [myid:] - INFO [main:JMXEnv@105] - expect:InMemoryDataTree
    [junit] 2012-01-23 09:01:38,235 [myid:] - INFO [main:JMXEnv@108] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
    [junit] 2012-01-23 09:01:38,235 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port
    [junit] 2012-01-23 09:01:38,235 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port-1
    [junit] 2012-01-23 09:01:38,236 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota
    [junit] 2012-01-23 09:01:38,236 [myid:] - INFO [main:ClientBase@447] - tearDown starting
    [junit] 2012-01-23 09:01:38,569 [myid:] - INFO [main:ZooKeeper@679] - Session: 0x13509cc9165 closed
    [junit] 2012-01-23 09:01:38,569 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@511] - EventThread shut down
    [junit] 2012-01-23 09:01:38,569 [myid:] - INFO [main:ClientBase@417] - STOPPING server
    [junit] 2012-01-23 09:01:38,571 [myid:] - INFO [main:ZooKeeperServer@391] -
[jira] [Commented] (ZOOKEEPER-1366) Zookeeper should be tolerant of clock adjustments
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191371#comment-13191371 ]

Patrick Hunt commented on ZOOKEEPER-1366:
-----------------------------------------

See ZOOKEEPER-366 for a previous discussion/solution for this issue. Would be good to close that one out if this is addressing the issue more directly.

> Zookeeper should be tolerant of clock adjustments
> -------------------------------------------------
>
>         Key: ZOOKEEPER-1366
>         URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1366
>     Project: ZooKeeper
>  Issue Type: Bug
>    Reporter: Ted Dunning
>     Fix For: 3.4.3
> Attachments: ZOOKEEPER-1366-3.3.3.patch, ZOOKEEPER-1366.patch, ZOOKEEPER-1366.patch, ZOOKEEPER-1366.patch, ZOOKEEPER-1366.patch
>
> If you want to wreak havoc on a ZK based system just do [date -s +1hour] and watch the mayhem as all sessions expire at once. This shouldn't happen. Zookeeper could easily handle elapsed times as elapsed times rather than as differences between absolute times. The absolute times are subject to adjustment when the clock is set, while a timer is not subject to this problem. In Java, System.currentTimeMillis() gives you absolute time while System.nanoTime() gives you time based on a timer from an arbitrary epoch.
> I have done this and have been running tests now for some tens of minutes with no failures. I will set up a test machine to redo the build again on Ubuntu and post a patch here for discussion.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
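The issue description above contrasts System.currentTimeMillis() (wall clock, jumps when the clock is set) with System.nanoTime() (monotonic timer). A minimal sketch of the elapsed-time approach it proposes; the class and method names here are ours for illustration, not ZooKeeper's actual code:

```java
// Sketch only: measure an interval with the monotonic timer rather than
// wall-clock time, so a "date -s +1hour" cannot expire sessions en masse.
public class ElapsedTime {
    private final long startNanos = System.nanoTime();

    /** Milliseconds since construction; immune to wall-clock adjustments. */
    public long elapsedMillis() {
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }
}
```

The key property is that nanoTime() is only meaningful as a difference between two readings in the same JVM; it has an arbitrary epoch, so it is never compared against an absolute timestamp.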
[jira] [Assigned] (ZOOKEEPER-1366) Zookeeper should be tolerant of clock adjustments
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt reassigned ZOOKEEPER-1366:
---------------------------------------

    Assignee: Ted Dunning
[jira] [Commented] (ZOOKEEPER-1366) Zookeeper should be tolerant of clock adjustments
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191376#comment-13191376 ]

Henry Robinson commented on ZOOKEEPER-1366:
-------------------------------------------

My feeling is that Ted's fixing a legitimate issue here, so we shouldn't hold up the patch for a separate effort. Reworking how we deal with time is going to be a big effort (Thread.sleep really does complicate things, plus there's the question of how to actually inject a mock clock - as you say, such method calls would need to be non-static, but then we need to figure out how to get the right implementation behind those methods). This patch doesn't get in the way of doing a better job with time, and gives us the beginnings of a nice integration point to mock clocks out. So I'll file a separate JIRA to track being able to change our clock implementation, and we can evaluate this on its own merits (might be nice to run a soak test for a few hours here to make sure that there are no weird edge cases that somehow got broken). Sound good?
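The comment above discusses injecting a mock clock via non-static method calls. One common shape for that integration point is a small clock interface with a system-backed implementation for production and a manually advanced one for tests; this is a hypothetical sketch of the idea being discussed, not ZooKeeper's actual API:

```java
// Hypothetical clock abstraction: production code depends on Clock,
// tests substitute ManualClock so time only moves when the test says so.
interface Clock {
    long nowMillis();
}

/** Wall-clock implementation for production use. */
class SystemClock implements Clock {
    public long nowMillis() { return System.currentTimeMillis(); }
}

/** Test implementation with explicitly controlled time. */
class ManualClock implements Clock {
    private long now;
    ManualClock(long start) { this.now = start; }
    public long nowMillis() { return now; }
    void advance(long ms)   { now += ms; }
}
```

A session tracker written against Clock instead of static time calls could then be driven through expiry scenarios deterministically, which is the testability benefit the thread is weighing.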
[jira] [Commented] (ZOOKEEPER-1366) Zookeeper should be tolerant of clock adjustments
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191384#comment-13191384 ]

Mahadev konar commented on ZOOKEEPER-1366:
------------------------------------------

@Ted, Seems like a good change; only one issue I see here. I'd like this to go into trunk and not into 3.4 unless it's really a bug. I think 3.4 will take some time to stabilize and I would really like to avoid big changes in 3.4. Thoughts?
[jira] [Updated] (ZOOKEEPER-1359) ZkCli create command data and acl parts should be optional.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1359:
------------------------------------

    Fix Version/s: 3.4.3
       Issue Type: Bug  (was: Improvement)

> ZkCli create command data and acl parts should be optional.
> -----------------------------------------------------------
>
>         Key: ZOOKEEPER-1359
>         URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1359
>     Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>    Reporter: kavita sharma
>    Assignee: kavita sharma
>    Priority: Trivial
>      Labels: new
>     Fix For: 3.4.3, 3.5.0
>
> In zkCli, if we create a node without data, the node is still created, but the commandMap shows
> {noformat}
> commandMap.put(create, [-s] [-e] path data acl);
> {noformat}
> which implies that the data and acl parts are not optional; we need to make these parts optional.
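The issue above asks for trailing arguments to become optional in the CLI's create command. A minimal sketch of that parsing rule; the class and field names are ours for illustration and do not reflect zkCli's actual implementation:

```java
// Hypothetical parser for "create [-s] [-e] path [data] [acl]" where only
// path is mandatory; absent trailing arguments are represented as null.
class CreateArgs {
    final String path;
    final String data; // null when omitted
    final String acl;  // null when omitted

    CreateArgs(String[] args) {
        if (args.length < 1) {
            throw new IllegalArgumentException("usage: create [-s] [-e] path [data] [acl]");
        }
        this.path = args[0];
        this.data = args.length > 1 ? args[1] : null;
        this.acl  = args.length > 2 ? args[2] : null;
    }
}
```

With this shape, `create /node` succeeds with no data, while `create /node hello` still parses the data argument as before.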
[jira] [Commented] (ZOOKEEPER-747) Add C# generation to Jute
[ https://issues.apache.org/jira/browse/ZOOKEEPER-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191391#comment-13191391 ]

Patrick Hunt commented on ZOOKEEPER-747:
----------------------------------------

Botond please enter a new jira for this issue. Thanks.

> Add C# generation to Jute
> -------------------------
>
>         Key: ZOOKEEPER-747
>         URL: https://issues.apache.org/jira/browse/ZOOKEEPER-747
>     Project: ZooKeeper
>  Issue Type: New Feature
>  Components: jute
>    Reporter: Eric Hauser
>    Assignee: Eric Hauser
>     Fix For: 3.4.0
> Attachments: ZOOKEEPER-747.patch
>
> The following patch adds a new language, C#, to the Jute code generation. The code that is generated does have a dependency on a third party library, Jon Skeet's MiscUtil, which is Apache licensed. The library is necessary because C# does not provide big endian support in the base class libraries. As none of the existing Jute code has any unit tests, I have not added tests for this patch.
[jira] [Commented] (ZOOKEEPER-1366) Zookeeper should be tolerant of clock adjustments
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191396#comment-13191396 ]

Camille Fournier commented on ZOOKEEPER-1366:
---------------------------------------------

@Henry: I am fine with doing it as a separate ticket. I do think it's pretty trivial to rework this and get ourselves far down the road with a non-static impl, and I'm not sure that we need to address Thread.sleep() to get a lot of mileage out of the solution. But I don't think I'll have time to rework this patch to do that, so might as well do it in a separate ticket if Ted doesn't want to worry about that.
[jira] [Commented] (ZOOKEEPER-1364) Add orthogonal fault injection mechanism/framework
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191402#comment-13191402 ]

Patrick Hunt commented on ZOOKEEPER-1364:
-----------------------------------------

See this prior related discussion: http://markmail.org/message/ysymrofslp2opqei

A while back I had used aspectj to do a one-off for this. Lost track of it since. However, the basic idea was to use pointcuts on the network/filesystem read/write operations in order to introduce failures. It worked well (for example, the quorum would be lost and then re-established, and much error handling code was exercised). I had to monitor it by hand, though, and ensure that things were functioning as expected. I found at least 3 or 4 issues with this approach.

> Add orthogonal fault injection mechanism/framework
> --------------------------------------------------
>
>         Key: ZOOKEEPER-1364
>         URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1364
>     Project: ZooKeeper
>  Issue Type: Test
>  Components: tests
>    Reporter: Andrei Savu
>    Assignee: Andrei Savu
>
> Hadoop has a mechanism for doing fault injection (HDFS-435). I think it would be useful if something similar would be available for ZooKeeper.
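The comment above describes injecting failures into network/filesystem read/write operations (via aspectj pointcuts in that experiment). To make the idea concrete, here is a hedged plain-Java sketch of the same technique without aspectj: a wrapper that deterministically fails every Nth invocation of an operation. All names are hypothetical; this is not the framework the issue proposes:

```java
import java.util.function.Supplier;

// Hypothetical deterministic fault injector: throw on every Nth call to a
// wrapped operation, so error-handling paths are exercised predictably.
class FaultInjector {
    private final int failEvery;
    private int calls = 0;

    FaultInjector(int failEvery) { this.failEvery = failEvery; }

    <T> T run(Supplier<T> op) {
        if (++calls % failEvery == 0) {
            // Simulated I/O failure at a controlled point.
            throw new RuntimeException("injected fault on call " + calls);
        }
        return op.get();
    }
}
```

An aspect-based version has the advantage Patrick describes: the injection points are woven in orthogonally, so production code needs no wrapper at every call site.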
[jira] [Commented] (ZOOKEEPER-1366) Zookeeper should be tolerant of clock adjustments
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191403#comment-13191403 ]

Ted Dunning commented on ZOOKEEPER-1366:
----------------------------------------

@Mahadev, I will produce patches for several versions as well as pre-built tar files. This will include trunk, 3.4, 3.3.3 and 3.3.2. The rate at which we encounter this kind of problem even at facilities that are pretty sophisticated says to me that even if this isn't strictly a bug, it seriously decreases the probability that ZK will function well averaged over all plausible users. Developers and users then think ZK is seriously buggy. As such, we view this internally as a required fix that we will be deploying as a patch on our current production version of ZK. Whether the ZK community views it that way relative to 3.4 is an entirely separate question, of course.

@Pat, @Henry, I really do think that doing the simple small thing (this patch) is important without waiting on the resolution of the larger, more comprehensive fix (fixing time management in general). Thanks for your efforts in that vein.

@Camille, I will have a slightly revised set of patches by tonight that fix the @Test issue. I also think that mocking was not at all the point of this patch. The only way that this makes mocking easier is that we now have a (slightly) more distinctive string to search for, but as you point out, that isn't the real issue anyway. Mocking static functions is becoming a more common capability, but I would rather do it right when we do it.
[jira] [Assigned] (ZOOKEEPER-1077) C client lib doesn't build on Solaris
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt reassigned ZOOKEEPER-1077:
---------------------------------------

    Assignee: Tadeusz Andrzej Kadłubowski

> C client lib doesn't build on Solaris
> -------------------------------------
>
>              Key: ZOOKEEPER-1077
>              URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1077
>          Project: ZooKeeper
>       Issue Type: Bug
>       Components: build, c client
> Affects Versions: 3.3.4
>      Environment: uname -a: SunOS [redacted] 5.10 Generic_142910-17 i86pc i386 i86pc
>                   GNU toolchain (gcc 3.4.3, GNU Make etc.)
>         Reporter: Tadeusz Andrzej Kadłubowski
>         Assignee: Tadeusz Andrzej Kadłubowski
>         Priority: Minor
>      Attachments: zookeeper.patch
>
> Hello,
> Some minor trouble with building the ZooKeeper C client library on Sun^H^H^HOracle Solaris 5.10:
> 1. You need to link against -lnsl -lsocket.
> 2. ctime_r needs a buffer size. The signature is: char *ctime_r(const time_t *clock, char *buf, int buflen)
> 3. In zk_log.c you need to manually cast pid_t to int (-Werror can be cumbersome ;) )
> 4. getpwuid_r() returns a pointer to struct passwd, which works as the last parameter on Linux.
>    Solaris signature: struct passwd *getpwuid_r(uid_t uid, struct passwd *pwd, char *buffer, int buflen);
>    Linux signature: int getpwuid_r(uid_t uid, struct passwd *pwd, char *buf, size_t buflen, struct passwd **result);
[jira] [Assigned] (ZOOKEEPER-1077) C client lib doesn't build on Solaris
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt reassigned ZOOKEEPER-1077:
---------------------------------------

    Assignee: Justin SB  (was: Tadeusz Andrzej Kadłubowski)
Failed: ZOOKEEPER-1309 PreCommit Build #912
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1309
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/912/

### LAST 60 LINES OF THE CONSOLE ###
Started by user phunt
Building remotely on hadoop9
Reverting /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk
Updating http://svn.apache.org/repos/asf/zookeeper/trunk
At revision 1234963
no change for http://svn.apache.org/repos/asf/zookeeper/trunk since the previous build
No emails were triggered.
[PreCommit-ZOOKEEPER-Build] $ /bin/bash /tmp/hudson6518498350817341326.sh
/home/jenkins/tools/java/latest/bin/java
Buildfile: build.xml

check-for-findbugs:

findbugs.check:

forrest.check:

hudson-test-patch:
     [exec] ======================================================================
     [exec] ======================================================================
     [exec] Testing patch for ZOOKEEPER-1309.
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]
     [exec] At revision 1234963.
     [exec] ZOOKEEPER-1309 is not Patch Available. Exiting.
     [exec]
     [exec] ======================================================================
     [exec] ======================================================================
     [exec] Finished build.
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]
BUILD SUCCESSFUL
Total time: 2 seconds
Archiving artifacts
ERROR: No artifacts found that match the file pattern trunk/build/test/findbugs/newPatchFindbugsWarnings.html,trunk/patchprocess/*.txt,trunk/patchprocess/*Warnings.xml,trunk/build/test/test-cppunit/*.txt,trunk/build/tmp/zk.log. Configuration error?
ERROR: 'trunk/build/test/findbugs/newPatchFindbugsWarnings.html' doesn't match anything: 'trunk' exists but not 'trunk/build/test/findbugs/newPatchFindbugsWarnings.html'
Build step 'Archive the artifacts' changed build result to FAILURE
Recording test results
Description set: ZOOKEEPER-1309
Email was triggered for: Failure
Sending email for trigger: Failure

### FAILED TESTS (if any) ###
No tests ran.
[jira] [Commented] (ZOOKEEPER-1178) Add eclipse target for supporting Apache IvyDE
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191448#comment-13191448 ]

Patrick Hunt commented on ZOOKEEPER-1178:
-----------------------------------------

I see pig calls it .eclipse.templates while hive calls it eclipse.templates, so I doubt we'll get pushback on my suggestion - imo we should rename it to .eclipse.templates (I don't see any objections to my original suggestion either, so we're probably safe). Note - if you use git, be sure to use the --no-prefix option when creating the patch. Regards.

> Add eclipse target for supporting Apache IvyDE
> ----------------------------------------------
>
>                Key: ZOOKEEPER-1178
>                URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1178
>            Project: ZooKeeper
>         Issue Type: Improvement
>         Components: build
>        Environment: Mac OS X w/ Eclipse 3.7. However, I believe this will work in any Eclipse environment.
>           Reporter: Warren Turkal
>           Assignee: Warren Turkal
>           Priority: Minor
>            Fix For: 3.5.0
>        Attachments: ZOOKEEPER-1178.patch
>  Original Estimate: 1h
> Remaining Estimate: 1h
>
> This patch adds support for Eclipse with Apache IvyDE, which is the extension that integrates Ivy support into Eclipse. This allows the creation of what appear to be fully portable .eclipse and .classpath files. I will be posting a patch shortly.
Failed: ZOOKEEPER-1309 PreCommit Build #913
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1309 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/913/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 70 lines...] [exec] patching file src/java/main/org/apache/zookeeper/ClientCnxn.java [exec] Hunk #1 FAILED at 40. [exec] Hunk #2 FAILED at 158. [exec] Hunk #3 FAILED at 694. [exec] Hunk #4 FAILED at 939. [exec] Hunk #5 succeeded at 1238 with fuzz 2 (offset -47 lines). [exec] Hunk #6 FAILED at 1377. [exec] 5 out of 6 hunks FAILED -- saving rejects to file src/java/main/org/apache/zookeeper/ClientCnxn.java.rej [exec] PATCH APPLICATION FAILED [exec] [exec] [exec] [exec] [exec] -1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12511557/zk-1309-3.patch [exec] against trunk revision 1234974. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] [exec] -1 patch. The patch command could not apply the patch. [exec] [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/913//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] gWwd58o43E logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1567: exec returned: 1 Total time: 42 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Description set: ZOOKEEPER-1309 Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## No tests ran.
[jira] [Commented] (ZOOKEEPER-1367) Data inconsistencies and unexpired ephemeral nodes after cluster restart
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191598#comment-13191598 ] Patrick Hunt commented on ZOOKEEPER-1367: - bq. we need to restart all the ZK servers Hm, very interesting. What exactly does this mean? You mentioned earlier that you embed Zookeeper into your application framework and set up things through code; how exactly are you performing this restart? Is ZK a separate process, are you killing processes, or are you calling some code to effect this? I ask because we really don't support this and I'm wondering if that could be related. If I wanted to set up a junit test, how might I go about doing it? If you could provide some insight it might help in reproducing. bq. I thought ZK didn't care much about wall-clock time that's true, but the more variables we can eliminate the easier it will be to track the real issue down. bq. I have not tried this. If I can wrangle the QA team into re-running the test, I will give this a try and report back. I suspect that the session is not getting cleared until after restarting the cluster, but this would help to verify (I need to dig into your attachment, planning to look at that next). Data inconsistencies and unexpired ephemeral nodes after cluster restart Key: ZOOKEEPER-1367 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1367 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.4.2 Environment: Debian Squeeze, 64-bit Reporter: Jeremy Stribling Priority: Blocker Fix For: 3.4.3 Attachments: ZOOKEEPER-1367.tgz In one of our tests, we have a cluster of three ZooKeeper servers. We kill all three, and then restart just two of them. Sometimes we notice that on one of the restarted servers, ephemeral nodes from previous sessions do not get deleted, while on the other server they do. 
We are effectively running 3.4.2, though technically we are running 3.4.1 with the patch manually applied for ZOOKEEPER-1333 and a C client for 3.4.1 with the patches for ZOOKEEPER-1163. I noticed that when I connected using zkCli.sh to the first node (90.0.0.221, zkid 84), I saw only one znode in a particular path: {quote} [zk: 90.0.0.221:2888(CONNECTED) 0] ls /election/zkrsm [nominee11] [zk: 90.0.0.221:2888(CONNECTED) 1] get /election/zkrsm/nominee11 90.0.0.222: cZxid = 0x40027 ctime = Thu Jan 19 08:18:24 UTC 2012 mZxid = 0x40027 mtime = Thu Jan 19 08:18:24 UTC 2012 pZxid = 0x40027 cversion = 0 dataVersion = 0 aclVersion = 0 ephemeralOwner = 0xa234f4f3bc220001 dataLength = 16 numChildren = 0 {quote} However, when I connect zkCli.sh to the second server (90.0.0.222, zkid 251), I saw three znodes under that same path: {quote} [zk: 90.0.0.222:2888(CONNECTED) 2] ls /election/zkrsm nominee06 nominee10 nominee11 [zk: 90.0.0.222:2888(CONNECTED) 2] get /election/zkrsm/nominee11 90.0.0.222: cZxid = 0x40027 ctime = Thu Jan 19 08:18:24 UTC 2012 mZxid = 0x40027 mtime = Thu Jan 19 08:18:24 UTC 2012 pZxid = 0x40027 cversion = 0 dataVersion = 0 aclVersion = 0 ephemeralOwner = 0xa234f4f3bc220001 dataLength = 16 numChildren = 0 [zk: 90.0.0.222:2888(CONNECTED) 3] get /election/zkrsm/nominee10 90.0.0.221: cZxid = 0x3014c ctime = Thu Jan 19 07:53:42 UTC 2012 mZxid = 0x3014c mtime = Thu Jan 19 07:53:42 UTC 2012 pZxid = 0x3014c cversion = 0 dataVersion = 0 aclVersion = 0 ephemeralOwner = 0xa234f4f3bc22 dataLength = 16 numChildren = 0 [zk: 90.0.0.222:2888(CONNECTED) 4] get /election/zkrsm/nominee06 90.0.0.223: cZxid = 0x20cab ctime = Thu Jan 19 08:00:30 UTC 2012 mZxid = 0x20cab mtime = Thu Jan 19 08:00:30 UTC 2012 pZxid = 0x20cab cversion = 0 dataVersion = 0 aclVersion = 0 ephemeralOwner = 0x5434f5074e040002 dataLength = 16 numChildren = 0 {quote} These never went away for the lifetime of the server, for any clients connected directly to that server. 
Note that this cluster is configured to have all three servers still, the third one being down (90.0.0.223, zkid 162). I captured the data/snapshot directories for the two live servers. When I start single-node servers using each directory, I can briefly see that the inconsistent data is present in those logs, though the ephemeral nodes seem to get (correctly) cleaned up pretty soon after I start the server. I will upload a tar containing the debug logs and data directories from the failure. I think we can reproduce it regularly if you need more info. -- This message is automatically generated by JIRA. If you
[jira] [Created] (ZOOKEEPER-1370) Add logging changes in Release Notes needed for clients because of ZOOKEEPER-850.
Add logging changes in Release Notes needed for clients because of ZOOKEEPER-850. - Key: ZOOKEEPER-1370 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1370 Project: ZooKeeper Issue Type: Bug Reporter: Mahadev konar Assignee: Mahadev konar Fix For: 3.4.3 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (ZOOKEEPER-1371) Remove dependency on log4j in the source code.
Remove dependency on log4j in the source code. -- Key: ZOOKEEPER-1371 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1371 Project: ZooKeeper Issue Type: Bug Reporter: Mahadev konar Fix For: 3.5.0 ZOOKEEPER-850 added slf4j to ZK. We still depend on log4j in our codebase. We should remove the dependency on log4j so that we can make logging pluggable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-850) Switch from log4j to slf4j
[ https://issues.apache.org/jira/browse/ZOOKEEPER-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191691#comment-13191691 ] Jean-Daniel Cryans commented on ZOOKEEPER-850: -- Thanks Mahadev! Switch from log4j to slf4j -- Key: ZOOKEEPER-850 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-850 Project: ZooKeeper Issue Type: Improvement Components: java client Affects Versions: 3.3.1 Reporter: Olaf Krische Assignee: Olaf Krische Fix For: 3.4.0 Attachments: ZOOKEEPER-3.3.1-log4j-slf4j-20101031.patch.bz2, ZOOKEEPER-3.4.0-log4j-slf4j-20101102.patch.bz2, ZOOKEEPER-850.patch, ZOOKEEPER-850.patch, ZOOKEEPER-850.patch, ZOOKEEPER-850.patch Hello, I would like to see slf4j integrated into ZooKeeper instead of relying explicitly on log4j. slf4j is an abstract logging framework. There are adapters from slf4j to many logger implementations, one of them being log4j. I don't like to make the decision of which log engine to use so early. This would help me embed ZooKeeper in my own applications (which use a different logger implementation, but slf4j is the basis). What do you think? (As I can see, slf4j requests flood all the other Apache projects as well :-) Maybe for 3.4 or 4.0? I can offer a patchset; I have experience with such a migration already. :-) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
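The facade idea Olaf describes (code logs against an abstract interface, and the embedder binds a concrete backend later) can be sketched in a few lines of plain Java. This is a hypothetical minimal illustration of the pattern, not slf4j's actual API; the names Log, MemoryBackend, NoOpBackend, and Component are made up for this sketch.

```java
// Minimal facade sketch: the library depends only on the Log interface,
// so the embedding application can bind any backend without code changes.
interface Log {
    void info(String msg);
}

// One possible backend: collect messages in memory. An embedder could just
// as easily bind a console backend, a log4j adapter, or a no-op here.
class MemoryBackend implements Log {
    final java.util.List<String> lines = new java.util.ArrayList<>();
    public void info(String msg) { lines.add("INFO " + msg); }
}

// A silencing backend, for embedders that want the library quiet.
class NoOpBackend implements Log {
    public void info(String msg) { /* discarded */ }
}

// Library code: the backend is injected, never imported or hard-coded.
class Component {
    private final Log log;
    Component(Log log) { this.log = log; }
    void start() { log.info("component started"); }
}
```

Swapping MemoryBackend for NoOpBackend requires no change to Component, which is exactly the decoupling the issue asks for.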
[jira] [Commented] (ZOOKEEPER-1366) Zookeeper should be tolerant of clock adjustments
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191706#comment-13191706 ] Ted Dunning commented on ZOOKEEPER-1366: @Pat, ZOOKEEPER-366 is essentially the same issue except that the suggested solution is weaker since it doesn't handle the problem of time jumping backwards, nor suggest a method to detect the time jumps. The numerological coincidence on the issue number is quite striking. Zookeeper should be tolerant of clock adjustments - Key: ZOOKEEPER-1366 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1366 Project: ZooKeeper Issue Type: Bug Reporter: Ted Dunning Assignee: Ted Dunning Fix For: 3.4.3 Attachments: ZOOKEEPER-1366-3.3.3.patch, ZOOKEEPER-1366.patch, ZOOKEEPER-1366.patch, ZOOKEEPER-1366.patch, ZOOKEEPER-1366.patch If you want to wreak havoc on a ZK-based system just do [date -s +1hour] and watch the mayhem as all sessions expire at once. This shouldn't happen. Zookeeper could easily handle elapsed times as elapsed times rather than as differences between absolute times. The absolute times are subject to adjustment when the clock is set, while a timer is not subject to this problem. In Java, System.currentTimeMillis() gives you absolute time while System.nanoTime() gives you time based on a timer from an arbitrary epoch. I have done this and have been running tests now for some tens of minutes with no failures. I will set up a test machine to redo the build again on Ubuntu and post a patch here for discussion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
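The elapsed-versus-absolute distinction Ted describes can be sketched as follows. This is a generic illustration of the technique (the SessionTimer class is hypothetical), not the actual ZOOKEEPER-1366 patch:

```java
// Sketch: track a timeout with the monotonic System.nanoTime() clock,
// which is immune to date/NTP adjustments, instead of subtracting
// System.currentTimeMillis() readings, which jump when the clock is set.
class SessionTimer {
    private final long startNanos = System.nanoTime();
    private final long timeoutMillis;

    SessionTimer(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    long elapsedMillis() {
        // nanoTime differences stay valid even if someone runs date -s +1hour
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }

    boolean expired() { return elapsedMillis() >= timeoutMillis; }
}
```

With this structure, setting the wall clock forward cannot expire all sessions at once, because no code path ever compares two wall-clock readings.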
[jira] [Commented] (ZOOKEEPER-366) Session timeout detection can go wrong if the leader system time changes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191711#comment-13191711 ] Ted Dunning commented on ZOOKEEPER-366: --- History repeats itself. I have patches on ZOOKEEPER-1366 that use nanoTime to avoid these problems, but don't include the limit on number of expirations per tick. I suggest that we mark this issue as a duplicate of that issue and apply that patch since it is up to date. Any objections? Session timeout detection can go wrong if the leader system time changes Key: ZOOKEEPER-366 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-366 Project: ZooKeeper Issue Type: Bug Components: quorum, server Reporter: Benjamin Reed Assignee: Benjamin Reed Fix For: 3.5.0 Attachments: ZOOKEEPER-366.patch the leader tracks session expirations by calculating when a session will timeout and then periodically checking to see what needs to be timed out based on the current time. this works great as long as the leaders clock progresses at a steady pace. the problem comes when there are big (session size) changes in clock, by ntp for example. if time gets adjusted forward, all the sessions could timeout immediately. if time goes backward sessions that should timeout may take a lot longer to actually expire. this is really just a leader issue. the easiest way to deal with this is to have the leader relinquish leadership if it detects a big jump forward in time. when a new leader gets elected, it will recalculate timeouts of active sessions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
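Ben's suggestion that the leader relinquish leadership on a big forward jump implies detecting the jump. One way, sketched below as an assumption rather than anything from the patches, is to compare wall-clock progress against monotonic progress between ticks; clock readings are passed in so the logic is testable, where a server would pass System.currentTimeMillis() and System.nanoTime()/1_000_000:

```java
// Hypothetical sketch of detecting the clock adjustment ZOOKEEPER-366
// describes: if wall-clock time advanced far more (or less) than the
// monotonic clock did since the last tick, the system time was adjusted
// and the leader could step down, letting the new leader recalculate
// session timeouts.
class ClockJumpDetector {
    private long lastWallMillis;
    private long lastMonoMillis;

    ClockJumpDetector(long wallMillis, long monoMillis) {
        lastWallMillis = wallMillis;
        lastMonoMillis = monoMillis;
    }

    /** True if the wall clock moved more than toleranceMillis away from real elapsed time. */
    boolean tick(long wallMillis, long monoMillis, long toleranceMillis) {
        long wallDelta = wallMillis - lastWallMillis;
        long monoDelta = monoMillis - lastMonoMillis;
        lastWallMillis = wallMillis;
        lastMonoMillis = monoMillis;
        return Math.abs(wallDelta - monoDelta) > toleranceMillis;
    }
}
```

Unlike the forward-jump-only proposal in this issue, comparing against a monotonic reference catches backward jumps too, which is the gap Ted points out above.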
[jira] [Commented] (ZOOKEEPER-1367) Data inconsistencies and unexpired ephemeral nodes after cluster restart
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191710#comment-13191710 ] Patrick Hunt commented on ZOOKEEPER-1367: - I just noticed something that doesn't look good: I started a 2 server ensemble, connected a client, then stopped/started the leader. I noticed that: 1) the follower does not create a snapshot after reconnecting to the new leader (the same leader is re-elected in epoch 3 as was leader in epoch 2) 2) the follower continues to write changes from epoch 3 into the log from epoch 2. {noformat} java -cp '/home/phunt/Downloads/zk1367/jars/*' org.apache.zookeeper.server.LogFormatter log.20001 ZooKeeper Transactional Log File with dbid 0 txnlog format version 2 1/23/12 4:39:45 PM PST session 0x1350d25cc79 cxid 0x0 zxid 0x20001 createSession 3 1/23/12 4:41:02 PM PST session 0x1350d25cc79 cxid 0x2 zxid 0x30001 create '/foo,#626172,v{s{31,s{'world,'anyone}}},F,1 1/23/12 4:44:10 PM PST session 0x1350d25cc79 cxid 0x4 zxid 0x30002 closeSession null 1/23/12 4:44:26 PM PST session 0x1350d283e69 cxid 0x0 zxid 0x30003 createSession 3 1/23/12 4:44:43 PM PST session 0x1350d283e69 cxid 0x2 zxid 0x30004 closeSession null 1/23/12 4:44:51 PM PST session 0x2350d283e5c cxid 0x0 zxid 0x30005 createSession 3 1/23/12 4:45:10 PM PST session 0x2350d283e5c cxid 0x2 zxid 0x30006 closeSession null 1/23/12 4:45:17 PM PST session 0x1350d283e690001 cxid 0x0 zxid 0x30007 createSession 3 EOF reached after 8 txns. {noformat} Data inconsistencies and unexpired ephemeral nodes after cluster restart Key: ZOOKEEPER-1367 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1367 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.4.2 Environment: Debian Squeeze, 64-bit Reporter: Jeremy Stribling Priority: Blocker Fix For: 3.4.3 Attachments: ZOOKEEPER-1367.tgz In one of our tests, we have a cluster of three ZooKeeper servers. 
We kill all three, and then restart just two of them. Sometimes we notice that on one of the restarted servers, ephemeral nodes from previous sessions do not get deleted, while on the other server they do.
[jira] [Created] (ZOOKEEPER-1372) stat reports inconsistent zxids across servers after a leader change
stat reports inconsistent zxids across servers after a leader change Key: ZOOKEEPER-1372 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1372 Project: ZooKeeper Issue Type: Bug Components: quorum Affects Versions: 3.4.2 Reporter: Patrick Hunt I started a 2 server ensemble, made some changes to znodes, then shutdown the cluster. I then removed the datadir from the original leader. I then restarted the entire ensemble. after this the new leader has a zxid of 0x4 while the follower reported a zxid of 0x30007 (the last zxid of the old epoch). This was via stat. I then connected a client to the ensemble, subsequent to which the zxid was again in sync. The data all seemed fine, but stat was reporting invalid information until a client connected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
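The zxids Patrick compares come from the four-letter 'stat' command (the same command FourLetterWordMain issues in the build logs above), whose output includes a "Zxid: 0x..." line. A small probe like the following could automate the cross-server check; the ZxidProbe class and its helpers are illustrative, not part of ZooKeeper's API, and assume the "Zxid: 0x..." line format:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.Socket;

public class ZxidProbe {
    /** Extract the zxid from the text returned by the 'stat' four-letter command. */
    static long parseZxid(String statOutput) {
        for (String line : statOutput.split("\n")) {
            line = line.trim();
            if (line.startsWith("Zxid: 0x")) {
                return Long.parseLong(line.substring("Zxid: 0x".length()), 16);
            }
        }
        throw new IllegalArgumentException("no Zxid line in stat output");
    }

    /** Send a four-letter word (e.g. "stat") to a server and return its reply. */
    static String fourLetterWord(String host, int port, String cmd) throws IOException {
        try (Socket s = new Socket(host, port);
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
            s.getOutputStream().write(cmd.getBytes());
            StringBuilder sb = new StringBuilder();
            for (String line; (line = in.readLine()) != null; ) sb.append(line).append('\n');
            return sb.toString();
        }
    }
}
```

Running parseZxid(fourLetterWord(host, 2181, "stat")) against each ensemble member and comparing the results would show the divergence described here (0x4 on the new leader versus 0x30007 on the follower) until a client connects.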
[jira] [Commented] (ZOOKEEPER-1367) Data inconsistencies and unexpired ephemeral nodes after cluster restart
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191797#comment-13191797 ] Patrick Hunt commented on ZOOKEEPER-1367: - fwiw I did file ZOOKEEPER-1372, which does seem like a valid issue, although not related to this afaik. Data inconsistencies and unexpired ephemeral nodes after cluster restart Key: ZOOKEEPER-1367 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1367 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.4.2 Environment: Debian Squeeze, 64-bit Reporter: Jeremy Stribling Priority: Blocker Fix For: 3.4.3 Attachments: ZOOKEEPER-1367.tgz In one of our tests, we have a cluster of three ZooKeeper servers. We kill all three, and then restart just two of them. Sometimes we notice that on one of the restarted servers, ephemeral nodes from previous sessions do not get deleted, while on the other server they do. We are effectively running 3.4.2, though technically we are running 3.4.1 with the patch manually applied for ZOOKEEPER-1333 and a C client for 3.4.1 with the patches for ZOOKEEPER-1163. 
I captured the data/snapshot directories for the two live servers. When I start single-node servers using each directory, I can briefly see that the inconsistent data is present in those logs, though the ephemeral nodes seem to get (correctly) cleaned up pretty soon after I start the server. I will upload a tar containing the debug logs and data directories from the failure. I think we can reproduce it regularly if you need more info. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
Failed: ZOOKEEPER-1355 PreCommit Build #914
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1355 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/914/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 145571 lines...] [exec] [exec] [exec] [exec] -1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12511608/ZOOKEEPER-1355-ver4.patch [exec] against trunk revision 1234974. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/914//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/914//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/914//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] r9064x7b6v logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1567: exec returned: 1 Total time: 25 minutes 52 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Description set: ZOOKEEPER-1355 Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191802#comment-13191802 ] Hadoop QA commented on ZOOKEEPER-1355: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12511608/ZOOKEEPER-1355-ver4.patch against trunk revision 1234974. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/914//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/914//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/914//console This message is automatically generated. Add zk.updateServerList(newServerList) --- Key: ZOOKEEPER-1355 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355 Project: ZooKeeper Issue Type: New Feature Components: java client Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: ZOOKEEPER-1355-ver2.patch, ZOOKEEPER-1355-ver4.patch, ZOOKEEPER=1355-ver3.patch, ZOOOKEEPER-1355-test.patch, ZOOOKEEPER-1355-ver1.patch, ZOOOKEEPER-1355.patch, loadbalancing-more-details.pdf, loadbalancing.pdf When the set of servers changes, we would like to update the server list stored by clients without restarting the clients. 
Moreover, assuming that the number of clients per server is the same (in expectation) in the old configuration (as guaranteed by the current list shuffling for example), we would like to re-balance client connections across the new set of servers in a way that a) the number of clients per server is the same for all servers (in expectation) and b) there is no excessive/unnecessary client migration. It is simple to achieve (a) without (b) - just re-shuffle the new list of servers at every client. But this would create unnecessary migration, which we'd like to avoid. We propose a simple probabilistic migration scheme that achieves (a) and (b) - each client locally decides whether and where to migrate when the list of servers changes. The attached document describes the scheme and shows an evaluation of it in Zookeeper. We also implemented re-balancing through a consistent-hashing scheme and show a comparison. We derived the probabilistic migration rules from a simple formula that we can also provide, if someone's interested in the proof. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
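One rule from the family of probabilistic schemes described above can be sketched for the case where the list grows from n to m servers: each client stays put with probability n/m, otherwise it moves to a uniformly chosen newly added server. This keeps the expected load at N/m clients per server while never migrating a client between two old servers. The sketch below illustrates that rule only; it is not the scheme in loadbalancing.pdf or the committed patch, and the RebalanceSketch name is made up here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class RebalanceSketch {
    /**
     * Decide where one client goes after the server list grows from
     * oldList (size n) to newList (size m > n, a superset of oldList).
     * Staying with probability n/m makes every server's expected load N/m,
     * and no client ever migrates between two old servers (goal (b)).
     */
    static String migrate(String current, List<String> oldList, List<String> newList, Random rnd) {
        int n = oldList.size(), m = newList.size();
        if (rnd.nextDouble() < (double) n / m) {
            return current;                       // stay: probability n/m
        }
        List<String> added = new ArrayList<>(newList);
        added.removeAll(oldList);                 // move only to a *new* server
        return added.get(rnd.nextInt(added.size()));
    }
}
```

For example, with 30,000 clients balanced across 3 servers and a new list of 5, each client stays with probability 3/5, so every server (old and new) ends up with about 6,000 clients in expectation and no client shuffles among the original three.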
[jira] [Updated] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Shraer updated ZOOKEEPER-1355: Attachment: ZOOKEEPER-1355-ver5.patch Add zk.updateServerList(newServerList) --- Key: ZOOKEEPER-1355 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355 Project: ZooKeeper Issue Type: New Feature Components: java client Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: ZOOKEEPER-1355-ver2.patch, ZOOKEEPER-1355-ver4.patch, ZOOKEEPER-1355-ver5.patch, ZOOKEEPER=1355-ver3.patch, ZOOOKEEPER-1355-test.patch, ZOOOKEEPER-1355-ver1.patch, ZOOOKEEPER-1355.patch, loadbalancing-more-details.pdf, loadbalancing.pdf When the set of servers changes, we would like to update the server list stored by clients without restarting the clients. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191826#comment-13191826 ]

Hadoop QA commented on ZOOKEEPER-1355:
--------------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12511612/ZOOKEEPER-1355-ver5.patch
  against trunk revision 1234974.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/915//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/915//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/915//console

This message is automatically generated.

> Add zk.updateServerList(newServerList)
> --------------------------------------
>
>                 Key: ZOOKEEPER-1355
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: java client
>            Reporter: Alexander Shraer
>            Assignee: Alexander Shraer
>             Fix For: 3.5.0
>         Attachments: ZOOKEEPER-1355-ver2.patch, ZOOKEEPER-1355-ver4.patch, ZOOKEEPER-1355-ver5.patch, ZOOKEEPER=1355-ver3.patch, ZOOOKEEPER-1355-test.patch, ZOOOKEEPER-1355-ver1.patch, ZOOOKEEPER-1355.patch, loadbalancing-more-details.pdf, loadbalancing.pdf
>
> When the set of servers changes, we would like to update the server list stored by clients without restarting the clients.
>
> Moreover, assuming that the number of clients per server is the same (in expectation) in the old configuration (as guaranteed by the current list shuffling, for example), we would like to re-balance client connections across the new set of servers in such a way that (a) the number of clients per server is the same for all servers (in expectation) and (b) there is no excessive/unnecessary client migration.
>
> It is simple to achieve (a) without (b) - just re-shuffle the new list of servers at every client. But this would create unnecessary migration, which we'd like to avoid.
>
> We propose a simple probabilistic migration scheme that achieves (a) and (b) - each client locally decides whether and where to migrate when the list of servers changes. The attached document describes the scheme and shows an evaluation of it in ZooKeeper. We also implemented re-balancing through a consistent-hashing scheme and show a comparison. We derived the probabilistic migration rules from a simple formula that we can also provide, if someone's interested in the proof.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
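The kind of probabilistic rule described above can be illustrated with a small sketch. This is not the scheme from the patch or the attached loadbalancing.pdf - the class name, method, and the specific probability (move with probability 1 - |old|/|new| when servers are added, which equalizes expected load without reshuffling every client) are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a local, probabilistic migration decision made by
// each client when the server list changes. The real rules are specified in
// the attached documents; the probability below is an illustrative choice
// that keeps the expected number of clients per server equal when servers
// are added, while leaving most clients where they are.
public class RebalanceSketch {
    private final Random rnd;

    public RebalanceSketch(Random rnd) {
        this.rnd = rnd;
    }

    // Decide whether this client should move, and if so, pick one of the
    // newly added servers uniformly at random. Returns null to stay put.
    public String maybeMigrate(List<String> oldServers, List<String> newServers) {
        List<String> added = new ArrayList<>(newServers);
        added.removeAll(oldServers);
        if (added.isEmpty()) {
            return null; // no new servers; staying avoids needless migration
        }
        // Moving with probability 1 - old/new keeps expected load per
        // server equal across the enlarged ensemble.
        double pMove = 1.0 - (double) oldServers.size() / newServers.size();
        if (rnd.nextDouble() >= pMove) {
            return null;
        }
        return added.get(rnd.nextInt(added.size()));
    }
}
```

With 2 old and 4 new servers, each client moves with probability 1/2 to one of the two new servers, so in expectation every server ends up with the same load and only the minimum necessary fraction of clients migrates.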
Success: ZOOKEEPER-1355 PreCommit Build #915
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1355
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/915/

### LAST 60 LINES OF THE CONSOLE ###

[...truncated 150273 lines...]
     [exec] BUILD SUCCESSFUL
     [exec] Total time: 0 seconds
     [exec]
     [exec] +1 overall. Here are the results of testing the latest attachment
     [exec]   http://issues.apache.org/jira/secure/attachment/12511612/ZOOKEEPER-1355-ver5.patch
     [exec]   against trunk revision 1234974.
     [exec]
     [exec]     +1 @author. The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included. The patch appears to include 3 new or modified tests.
     [exec]
     [exec]     +1 javadoc. The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac. The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
     [exec]
     [exec]     +1 release audit. The applied patch does not increase the total number of release audit warnings.
     [exec]
     [exec]     +1 core tests. The patch passed core unit tests.
     [exec]
     [exec]     +1 contrib tests. The patch passed contrib unit tests.
     [exec]
     [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/915//testReport/
     [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/915//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
     [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/915//console
     [exec]
     [exec] This message is automatically generated.
     [exec]
     [exec] ======================================================================
     [exec] Adding comment to Jira.
     [exec] ======================================================================
     [exec]
     [exec] Comment added.
     [exec] X9U36j45RN logged out
     [exec]
     [exec] ======================================================================
     [exec] Finished build.
     [exec] ======================================================================

BUILD SUCCESSFUL
Total time: 26 minutes 5 seconds
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1355
Email was triggered for: Success
Sending email for trigger: Success

### FAILED TESTS (if any) ###

All tests passed
[jira] [Created] (ZOOKEEPER-1373) Hardcoded SASL login context name clashes with Hadoop security configuration override
Hardcoded SASL login context name clashes with Hadoop security configuration override
-------------------------------------------------------------------------------------

                 Key: ZOOKEEPER-1373
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1373
             Project: ZooKeeper
          Issue Type: Bug
          Components: java client
    Affects Versions: 3.4.2
            Reporter: Thomas Weise

I'm trying to configure a process with Hadoop security (the Hive metastore server) to talk to ZooKeeper 3.4.2 with Kerberos authentication. In this scenario Hadoop controls the SASL configuration (org.apache.hadoop.security.UserGroupInformation.HadoopConfiguration), instead of the ZooKeeper client login context being set up via jaas.conf and the system property {{-Djava.security.auth.login.config}}.

Using the Hadoop configuration would work, except that the ZooKeeper client code expects the loginContextName to be "Client", while Hadoop security will use "hadoop-keytab-kerberos". I verified that by changing the name in the debugger: the SASL authentication succeeds, while otherwise the login configuration cannot be resolved and the connection to ZooKeeper is unauthenticated.

To integrate with Hadoop, the following in ZooKeeperSaslClient would need to change to make the name configurable:

{{login = new Login("Client", new ClientCallbackHandler(null));}}
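One way the hardcoded name could be made configurable is sketched below. This is an illustration, not the committed fix: the helper class and the system property name `zookeeper.sasl.clientconfig` are assumptions for the sketch.

```java
// Hypothetical sketch: resolve the SASL login context name from a system
// property instead of hardcoding "Client", so a process whose JAAS
// configuration is controlled by Hadoop (which registers the entry
// "hadoop-keytab-kerberos") can point the ZooKeeper client at the right
// login context. The property name "zookeeper.sasl.clientconfig" is an
// assumption for this sketch, not necessarily what the project adopted.
public class SaslContextName {
    static final String PROP = "zookeeper.sasl.clientconfig";

    public static String loginContextName() {
        // Fall back to the historical default when the property is unset,
        // so existing jaas.conf-based deployments keep working unchanged.
        return System.getProperty(PROP, "Client");
    }
}
```

The call site in ZooKeeperSaslClient would then read the name through this helper rather than passing the literal "Client" to the Login constructor.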
[jira] [Commented] (ZOOKEEPER-1373) Hardcoded SASL login context name clashes with Hadoop security configuration override
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191858#comment-13191858 ]

Mahadev konar commented on ZOOKEEPER-1373:
------------------------------------------

This is a bug. We should fix it so that the login context name is configurable.

> Hardcoded SASL login context name clashes with Hadoop security configuration override
> -------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1373
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1373
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.4.2
>            Reporter: Thomas Weise
>             Fix For: 3.4.3
ZooKeeper User Group and upcoming meetup.
Hi folks,

As discussed on the list, we had talked about having regular meetups for users and devs where we could talk about user issues, roadmaps, and releases. We will be hosting our first user group meetup at Yahoo! (thanks to Ben) on Feb 10. The details are still in the works, but here is the meetup event for you to RSVP:

http://www.meetup.com/zookeeperusergroup/events/49372662/

We'll be updating the agenda for the meetup soon. Hope to see you at the meetup.

thanks
mahadev