[jira] [Comment Edited] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374340#comment-15374340
 ] 

Flavio Junqueira edited comment on ZOOKEEPER-2152 at 7/13/16 5:12 AM:
--

It'd be good if we could be more careful about adding system tests vs. unit 
tests. The running time of our tests is already pretty long and it'd be great 
if we could cut down on that. 


was (Author: fpj):
I'd like to suggest that we be careful about adding system tests vs. unit 
tests. The running time of our tests is already pretty long and it'd be great 
if we could cut down on that. 

> Intermittent failure in TestReconfig.cc
> ---
>
> Key: ZOOKEEPER-2152
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2152
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: c client
>Reporter: Michi Mutsuzaki
>Assignee: Michael Han
>  Labels: reconfiguration
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2152.patch
>
>
> I'm seeing this failure in the c client test once in a while:
> {noformat}
> [exec] 
> /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/src/c/tests/TestReconfig.cc:474:
>  Assertion: assertion failed [Expression: found != string::npos, 
> 10.10.10.4:2004 not in newComing list]
> {noformat}
> https://builds.apache.org/job/ZooKeeper-trunk/2640/console



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: unexpected server response ZRUNTIMEINCONSISTENCY

2016-07-12 Thread Flavio Junqueira
Yeah, it looks like you're getting an unexpected xid: either a response 
arrived out of order or one response got lost somehow. As Pat says, it would 
be great to have a way of reproducing the problem.
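
To make the "unexpected xid" reading concrete, here is a minimal sketch of how a client can match replies against in-flight requests in FIFO order. It is a simplified model, not the actual C client code: the names pending_queue, enqueue, match_reply and the MODEL_* constants are hypothetical, and the only detail taken from this thread is that the error code reported is -2.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical return codes for this model; -2 mirrors the value
 * reported in the thread for ZRUNTIMEINCONSISTENCY. */
enum { MODEL_OK = 0, MODEL_RUNTIME_INCONSISTENCY = -2 };

#define MODEL_QUEUE_CAP 8

/* Simplified FIFO of request ids (xids) that are still awaiting a reply. */
typedef struct {
    int32_t xids[MODEL_QUEUE_CAP];
    int head; /* index of the oldest in-flight request */
    int tail; /* index one past the newest in-flight request */
} pending_queue;

/* Record a newly sent request; returns -1 if the model queue is full. */
static int enqueue(pending_queue *q, int32_t xid) {
    if (q->tail >= MODEL_QUEUE_CAP) return -1;
    q->xids[q->tail++] = xid;
    return 0;
}

/* A reply must carry the xid of the oldest in-flight request.
 * Anything else means a reply was lost or arrived out of order. */
static int match_reply(pending_queue *q, int32_t reply_xid) {
    if (q->head == q->tail) return MODEL_RUNTIME_INCONSISTENCY;
    int32_t expected = q->xids[q->head];
    if (expected != reply_xid) {
        fprintf(stderr,
                "unexpected server response: expected %#x, but received %#x\n",
                (unsigned)expected, (unsigned)reply_xid);
        return MODEL_RUNTIME_INCONSISTENCY;
    }
    q->head++; /* reply consumed, request completed */
    return MODEL_OK;
}
```

In this model, queuing 0x570b82f9 and then 0x570b82fa and receiving the reply for 0x570b82fa first yields the inconsistency code, which is the shape of the log line quoted further down in this thread.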

-Flavio

> On 28 Jun 2016, at 22:55, Pradeepa Kumar  wrote:
> 
> Thanks
> version is
> Zookeeper version: 3.5.1-alpha--1, built on 05/17/2016 14:16 GMT
> 
> Sorry, I don't have a test.
> When we get the ZRUNTIMEINCONSISTENCY return code, does that mean the
> connection is closed and we need to re-establish the connection to the server?
> 
> 
> On Tue, Jun 28, 2016 at 11:57 PM, Patrick Hunt  wrote:
> 
>> What version(s) are you seeing this against? (I put similar request in to
>> the jira https://issues.apache.org/jira/browse/ZOOKEEPER-2455)
>> 
>> Is there a small test case that reproduces this issue that you can provide?
>> Logs? (if so please add them to the jira)
>> 
>> Patrick
>> 
>> On Tue, Jun 28, 2016 at 7:33 AM, Jared Cantwell 
>> wrote:
>> 
>>> I can't help you with a solution, but we are also seeing this.  We had
>>> not made the connection to a reconfig causing this, but a quick look at
>>> some logs indicates it may be happening to us during a reconfig as well.
>>> That gives us a thread to dig more-- we'll update if we find anything.
>>> 
>>> Best,
>>> Jared
>>> 
>>> On Tue, Jun 28, 2016 at 12:03 AM, Pradeepa Kumar 
>>> wrote:
>>> 
>>>> Hi Folks,
>>>>
>>>> I am hitting an error in my C client code. Below is the set of
>>>> operations I perform:
>>>>
>>>>  1.  The ZooKeeper client is connected to ZooKeeper server S1, and a new
>>>> server S2 gets added.
>>>>  2.  The client monitors the ZooKeeper server config and, when the
>>>> server config changes, calls zoo_set_servers.
>>>>  3.  The client can issue operations like zoo_get just after the call to
>>>> zoo_set_servers.
>>>>  4.  I can see that the ZooKeeper thread logs a connection to the new
>>>> server just after the zoo_get call:
>>>>
>>>> 2016-04-11 03:46:50,655:1207(0xf26ffb40):ZOO_INFO@check_events@2345:
>>>> initiated connection to server [128.0.0.5:61728]
>>>>
>>>> 2016-04-11 03:46:50,658:1207(0xf26ffb40):ZOO_INFO@check_events@2397:
>>>> session establishment complete on server [128.0.0.5:61728],
>>>> sessionId=0x401852c000c, negotiated timeout=2
>>>>
>>>>  5.  Sometimes I find errors like the one below:
>>>>
>>>> 2016-04-11 03:46:50,662:1207(0xf26ffb40):ZOO_ERROR@handle_socket_error_msg@2923:
>>>> Socket [128.0.0.5:61728] zk retcode=-2, errno=115(Operation now in progress):
>>>> unexpected server response: expected 0x570b82fa, but received 0x570b82f9
>>>>
>>>>  6.  zoo_get returns (-2), indicating ZRUNTIMEINCONSISTENCY
>>>> <http://zookeeper.sourcearchive.com/documentation/3.2.2plus-pdfsg3/zookeeper_8h_bb1a0a179f313b2e44ee92369c438a4c.html#bb1a0a179f313b2e44ee92369c438a4c9eabb281ab14c74db3aff9ab456fa7fe>
>>>>
>>>> What is the issue here? Should I retry the zoo_get operation? Or should
>>>> I wait for zoo_set_servers to complete (for example, wait for the
>>>> connection establishment notification)?
>>>>
>>>> Thanks,
>>>>
>>> 
>> 



[jira] [Commented] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374340#comment-15374340
 ] 

Flavio Junqueira commented on ZOOKEEPER-2152:
-

I'd like to suggest that we be careful about adding system tests vs. unit 
tests. The running time of our tests is already pretty long and it'd be great 
if we could cut down on that. 



[jira] [Commented] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374306#comment-15374306
 ] 

Alexander Shraer commented on ZOOKEEPER-2152:
-

Sounds good, thanks Michael!

I suspect that when trying to generalize, there may be some dependency with 
ZOOKEEPER-2466 after all. But maybe not.



[jira] [Commented] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374297#comment-15374297
 ] 

Michael Han commented on ZOOKEEPER-2152:


Thanks Alex / Raul for your feedback. Taking a step back and rethinking this 
whole problem, I think there are two issues we can address:
* The C reconfiguration tests were failing because the assumption these tests 
were built on does not hold in the first place (in this case, the tests 
assume that only clients can change the server list). So, for the purpose of 
fixing these tests, we can introduce a generalized mode that can turn certain 
code paths on / off (e.g. disable the client's connection retry logic to 
satisfy the tests' prerequisites). This is what the current patch does, 
though it does not provide a generalized flag, which I agree we should fix.

* Improve the current reconfig C tests. I think the current tests have value in 
that they have good coverage of the C client reconfig logic; however, the 
foundation / assumption these tests are built on is not very sound: the tests 
use fake servers, so they do not simulate a real production scenario, and we 
don't have test coverage for reconfig logic combined with error handling when 
a server fails. So we should probably create real ZK servers and have the C 
reconfig tests run against those servers.

My plan is to provide a generalized solution to turn the C client's server 
connection code on / off (for the purpose of testing pure client-side logic 
that assumes all servers are up). For the second issue, I will create a new 
JIRA to improve the C reconfig tests by running them against real servers. 
Does this sound like a reasonable plan to you guys? 





[jira] [Commented] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374126#comment-15374126
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2152:
---

Sorry for the drive-by review [~hanm] (though I'll look deeper a bit later 
tonight), but I wanted to second [~shralex]'s comments: better to avoid, if at 
all possible, specific logic just to work around this test case...



ZooKeeper_branch34 - Build # 1579 - Still Failing

2016-07-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34/1579/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 204468 lines...]
[junit] 2016-07-13 00:37:38,777 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2016-07-13 00:37:38,778 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2016-07-13 00:37:38,778 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@219] - 
NIOServerCnxn factory exited run method
[junit] 2016-07-13 00:37:38,778 [myid:] - INFO  [main:ZooKeeperServer@469] 
- shutting down
[junit] 2016-07-13 00:37:38,778 [myid:] - INFO  
[main:SessionTrackerImpl@225] - Shutting down
[junit] 2016-07-13 00:37:38,778 [myid:] - INFO  
[main:PrepRequestProcessor@765] - Shutting down
[junit] 2016-07-13 00:37:38,779 [myid:] - INFO  
[main:SyncRequestProcessor@209] - Shutting down
[junit] 2016-07-13 00:37:38,779 [myid:] - INFO  [ProcessThread(sid:0 
cport:11221)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
[junit] 2016-07-13 00:37:38,779 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@187] - SyncRequestProcessor exited!
[junit] 2016-07-13 00:37:38,780 [myid:] - INFO  
[main:FinalRequestProcessor@402] - shutdown of request processor complete
[junit] 2016-07-13 00:37:38,780 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-07-13 00:37:38,781 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-07-13 00:37:38,782 [myid:] - INFO  [main:ClientBase@443] - 
STARTING server
[junit] 2016-07-13 00:37:38,783 [myid:] - INFO  [main:ClientBase@364] - 
CREATING server instance 127.0.0.1:11221
[junit] 2016-07-13 00:37:38,783 [myid:] - INFO  
[main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2016-07-13 00:37:38,784 [myid:] - INFO  [main:ClientBase@339] - 
STARTING server instance 127.0.0.1:11221
[junit] 2016-07-13 00:37:38,784 [myid:] - INFO  [main:ZooKeeperServer@170] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34/branch-3.4/build/test/tmp/test6671674306837553494.junit.dir/version-2
 snapdir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34/branch-3.4/build/test/tmp/test6671674306837553494.junit.dir/version-2
[junit] 2016-07-13 00:37:38,790 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-07-13 00:37:38,790 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@192] - 
Accepted socket connection from /127.0.0.1:33922
[junit] 2016-07-13 00:37:38,791 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@827] - Processing 
stat command from /127.0.0.1:33922
[junit] 2016-07-13 00:37:38,791 [myid:] - INFO  
[Thread-5:NIOServerCnxn$StatCommand@663] - Stat command output
[junit] 2016-07-13 00:37:38,792 [myid:] - INFO  
[Thread-5:NIOServerCnxn@1008] - Closed socket connection for client 
/127.0.0.1:33922 (no session established for client)
[junit] 2016-07-13 00:37:38,792 [myid:] - INFO  [main:JMXEnv@229] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-07-13 00:37:38,794 [myid:] - INFO  [main:JMXEnv@246] - 
expect:InMemoryDataTree
[junit] 2016-07-13 00:37:38,794 [myid:] - INFO  [main:JMXEnv@250] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree
[junit] 2016-07-13 00:37:38,795 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2016-07-13 00:37:38,795 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2016-07-13 00:37:38,795 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@58] - Memory used 31174
[junit] 2016-07-13 00:37:38,796 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@63] - Number of threads 20
[junit] 2016-07-13 00:37:38,796 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@78] - FINISHED TEST METHOD testQuota
[junit] 2016-07-13 00:37:38,796 [myid:] - INFO  [main:ClientBase@520] - 
tearDown starting
[junit] 2016-07-13 00:37:38,858 [myid:] - INFO  [main:ZooKeeper@684] - 
Session: 0x155e1b00a3e closed
[junit] 2016-07-13 00:37:38,858 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2016-07-13 00:37:38,859 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@519] - EventThread shut down for 
session: 0x155e1b00a3e
[junit] 2016-07-13 00:37:38,859 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@219] - 
NIOServerCnxn factory exited run method
[junit] 20

[jira] [Commented] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374048#comment-15374048
 ] 

Alexander Shraer commented on ZOOKEEPER-2152:
-

It's great that the patch solves the current problem we have. But it introduces 
logic that is very specific to this particular failing test. It may be better 
to try to generalize a bit: instead of delaying the reconnect infinitely, 
perhaps we should introduce a flag like "reconnect_enabled" (true by default and 
false in our test), or reconnect_disabled, which will prevent any reconnect 
attempts from all the different parts of the code, and which may be useful in 
other tests. 
What do you think? I'm not sure how much extra work this is, so it's up to you.
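
A rough sketch of what such a flag could look like, assuming every reconnect path is routed through a single gate. The client_handle struct and the client_init / maybe_reconnect functions are hypothetical stand-ins for illustration only; they are not part of the existing C client or of the attached patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the client handle; not the real zhandle_t. */
typedef struct {
    bool reconnect_enabled; /* true by default, set to false by a test */
    int connect_attempts;   /* lets a test observe how many attempts were made */
} client_handle;

/* Default state: reconnects are allowed, as in normal operation. */
static void client_init(client_handle *h) {
    h->reconnect_enabled = true;
    h->connect_attempts = 0;
}

/* Single gate that all reconnect paths would call.
 * Returns true if a connection attempt was started. */
static bool maybe_reconnect(client_handle *h) {
    if (!h->reconnect_enabled) {
        return false; /* the test asked for no reconnects at all */
    }
    h->connect_attempts++; /* a real client would open a socket here */
    return true;
}
```

The point of the single gate is that a test only has to flip one field, rather than relying on a delay that happens to keep one particular branch from running.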



[jira] [Commented] (ZOOKEEPER-1045) Support Quorum Peer mutual authentication via SASL

2016-07-12 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374035#comment-15374035
 ] 

Michael Han commented on ZOOKEEPER-1045:


The new patch LGTM with regard to addressing the comments I made since the 
last CR. Thanks for the work, Rakesh! 

> Support Quorum Peer mutual authentication via SASL
> --
>
> Key: ZOOKEEPER-1045
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1045
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>Reporter: Eugene Koontz
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.4.9, 3.5.3
>
> Attachments: 0001-ZOOKEEPER-1045-br-3-4.patch, 
> 1045_failing_phunt.tar.gz, ZK-1045-test-case-failure-logs.zip, 
> ZOOKEEPER-1045-00.patch, ZOOKEEPER-1045-Rolling Upgrade Design Proposal.pdf, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045TestValidationDesign.pdf
>
>
> ZOOKEEPER-938 addresses mutual authentication between clients and servers. 
> This bug, on the other hand, is for authentication among quorum peers. 
> Hopefully much of the work done on SASL integration with Zookeeper for 
> ZOOKEEPER-938 can be used as a foundation for this enhancement.





[jira] [Commented] (ZOOKEEPER-1045) Support Quorum Peer mutual authentication via SASL

2016-07-12 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374033#comment-15374033
 ] 

Michael Han commented on ZOOKEEPER-1045:


[~rakeshr] Sure, will do. Thanks for providing additional data points from a 
testing perspective.



[jira] [Commented] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374026#comment-15374026
 ] 

Michael Han commented on ZOOKEEPER-2152:


I think what you pointed out on "it is utilized only after a successful 
connection attempt" is about right: we hit that bug because we always end up 
here in zookeeper_interest for this reconfig test, due to the fake servers:
{code}
// No need to delay -- grab the next server and attempt connection
zoo_cycle_next_server(zh);
zh->fd = socket(zh->addr_cur.ss_family, SOCK_STREAM, 0);
if (zh->fd < 0) {
  rc = handle_socket_error_msg(zh,
                               __LINE__,
                               ZSYSTEMERROR,
                               "socket() call failed");
  return api_epilog(zh, rc);
}
{code}

So handle_error has no chance to be invoked.



[jira] [Comment Edited] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373996#comment-15373996
 ] 

Alexander Shraer edited comment on ZOOKEEPER-2152 at 7/13/16 12:05 AM:
---

I see - the reconnect attempts are spread across multiple functions. Still, I 
don't really understand why the error handling code, which also attempts to 
reconnect to a different server, has no effect on this bug. Maybe it is 
utilized only after a successful connection attempt (which doesn't happen in 
our case)?


was (Author: shralex):
I see - the reconnect attempts are spread across multiple functions. Still, I 
don't really understand why the error handling code, which also attempts to 
reconnect to a different server, has no effect on this bug. Maybe it is 
utilized only after a successful connection attempt?



[jira] [Commented] (ZOOKEEPER-2466) Client skips servers when trying to connect

2016-07-12 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374010#comment-15374010
 ] 

Michael Han commented on ZOOKEEPER-2466:


bq. we should advance the pointer of the server list in a single place. 
Agree.

bq. I think we can replace the first call to addrvec_next in the RO block you 
shared with zoo_cycle_next_server
zoo_cycle_next_server does more than addrvec_next. In some cases 
addrvec_next might not get called at all, if zoo_cycle_next_server returns 
early within the reconfig logic, and in that case the current server addr is 
reset to NULL. So it seems to me that addrvec_next and zoo_cycle_next_server 
are not semantically equivalent. 

bq. but not the second if I'm reading this correctly.
Yeah, I think addrvec_next(&zh->addrs, NULL) just changes the state of the 
zh->addrs vector without changing the state of the current server in the zh 
handle, whereas zoo_cycle_next_server does change the state of the current 
server, so the two are not semantically equivalent.



> Client skips servers when trying to connect
> ---
>
> Key: ZOOKEEPER-2466
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2466
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
>Priority: Critical
> Fix For: 3.5.3, 3.6.0
>
>
> I've been looking at {{Zookeeper_simpleSystem::testFirstServerDown}} and I 
> observed the following behavior. The list of servers to connect contains two 
> servers, let's call them S1 and S2. The client never connects, but the odd 
> bit is the sequence of servers that the client tries to connect to:
> {noformat}
> S1
> S2
> S1
> S1
> S1
> 
> {noformat}
> It intrigued me that S2 is only tried once and never again. Checking the 
> code, here is what happens. Initially, {{zh->reconfig}} is 1, so in 
> {{zoo_cycle_next_server}} we return an address from 
> {{get_next_server_in_reconfig}}, which is taken from {{zh->addrs_new}} in 
> this test case. The attempt to connect fails, and {{handle_error}} is invoked 
> in the error handling path. {{handle_error}} actually invokes 
> {{addrvec_next}} which changes the address pointer to the next server on the 
> list.
> After two attempts, it decides that it has tried all servers in 
> {{zoo_cycle_next_server}} and sets {{zh->reconfig}} to zero. Once 
> {{zh->reconfig == 0}}, we have that each call to {{zoo_cycle_next_server}} 
> moves the address pointer to the next server in {{zh->addrs}}. But, given 
> that {{handle_error}} also moves the pointer to the next server, we end up 
> moving the pointer ahead twice upon every failed attempt to connect, which is 
> wrong.
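
The double advance described above can be illustrated with a small model. This is a sketch under simplifying assumptions, not the client's actual code: it covers only the phase after zh->reconfig has dropped to zero, and server_list, cycle_next, on_error_advance and simulate are hypothetical names standing in for the address vector, zoo_cycle_next_server and the advance done in handle_error.

```c
#include <stddef.h>

/* Hypothetical model of a circular server list with a cursor. */
typedef struct {
    size_t count; /* number of servers in the list */
    size_t next;  /* index handed out by the next call to cycle_next */
} server_list;

/* Models picking the next server: return the cursor, then advance it. */
static size_t cycle_next(server_list *l) {
    size_t cur = l->next;
    l->next = (l->next + 1) % l->count;
    return cur;
}

/* Models the error path, which advances the same cursor a second time. */
static void on_error_advance(server_list *l) {
    l->next = (l->next + 1) % l->count;
}

/* Record which server index is tried on each of `attempts` failed
 * connection attempts, with or without the extra advance on error. */
static void simulate(server_list *l, int advance_on_error,
                     size_t *tried, size_t attempts) {
    for (size_t i = 0; i < attempts; i++) {
        tried[i] = cycle_next(l);
        if (advance_on_error) {
            on_error_advance(l);
        }
    }
}
```

With two servers and the extra advance enabled, the cursor moves by two on every failed attempt and so wraps back to the same index each time, which matches the repeated S1 in the sequence above. With the extra advance disabled, the model alternates between the two servers.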





[jira] [Commented] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373996#comment-15373996
 ] 

Alexander Shraer commented on ZOOKEEPER-2152:
-

I see - the reconnect attempts are spread across multiple functions. Still, I 
don't really understand why the error handling code, which also attempts to 
reconnect to a different server, has no effect on this bug. Maybe it is 
utilized only after a successful connection attempt?



ZooKeeper_branch34_jdk8 - Build # 629 - Failure

2016-07-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_jdk8/629/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 218398 lines...]
[junit] 2016-07-12 23:08:20,655 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2016-07-12 23:08:20,655 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2016-07-12 23:08:20,655 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2016-07-12 23:08:20,655 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@219] - 
NIOServerCnxn factory exited run method
[junit] 2016-07-12 23:08:20,656 [myid:] - INFO  [main:ZooKeeperServer@469] 
- shutting down
[junit] 2016-07-12 23:08:20,656 [myid:] - INFO  
[main:SessionTrackerImpl@225] - Shutting down
[junit] 2016-07-12 23:08:20,656 [myid:] - INFO  
[main:PrepRequestProcessor@765] - Shutting down
[junit] 2016-07-12 23:08:20,656 [myid:] - INFO  
[main:SyncRequestProcessor@209] - Shutting down
[junit] 2016-07-12 23:08:20,656 [myid:] - INFO  [ProcessThread(sid:0 
cport:11221)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
[junit] 2016-07-12 23:08:20,656 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@187] - SyncRequestProcessor exited!
[junit] 2016-07-12 23:08:20,657 [myid:] - INFO  
[main:FinalRequestProcessor@402] - shutdown of request processor complete
[junit] 2016-07-12 23:08:20,657 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-07-12 23:08:20,658 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-07-12 23:08:20,659 [myid:] - INFO  [main:ClientBase@443] - 
STARTING server
[junit] 2016-07-12 23:08:20,659 [myid:] - INFO  [main:ClientBase@364] - 
CREATING server instance 127.0.0.1:11221
[junit] 2016-07-12 23:08:20,660 [myid:] - INFO  
[main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2016-07-12 23:08:20,660 [myid:] - INFO  [main:ClientBase@339] - 
STARTING server instance 127.0.0.1:11221
[junit] 2016-07-12 23:08:20,660 [myid:] - INFO  [main:ZooKeeperServer@170] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/x1/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_jdk8/branch-3.4/build/test/tmp/test2715003673894370364.junit.dir/version-2
 snapdir 
/x1/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_jdk8/branch-3.4/build/test/tmp/test2715003673894370364.junit.dir/version-2
[junit] 2016-07-12 23:08:20,664 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-07-12 23:08:20,664 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@192] - 
Accepted socket connection from /127.0.0.1:53558
[junit] 2016-07-12 23:08:20,665 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@827] - Processing 
stat command from /127.0.0.1:53558
[junit] 2016-07-12 23:08:20,665 [myid:] - INFO  
[Thread-4:NIOServerCnxn$StatCommand@663] - Stat command output
[junit] 2016-07-12 23:08:20,666 [myid:] - INFO  
[Thread-4:NIOServerCnxn@1008] - Closed socket connection for client 
/127.0.0.1:53558 (no session established for client)
[junit] 2016-07-12 23:08:20,666 [myid:] - INFO  [main:JMXEnv@229] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-07-12 23:08:20,668 [myid:] - INFO  [main:JMXEnv@246] - 
expect:InMemoryDataTree
[junit] 2016-07-12 23:08:20,668 [myid:] - INFO  [main:JMXEnv@250] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree
[junit] 2016-07-12 23:08:20,668 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2016-07-12 23:08:20,668 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2016-07-12 23:08:20,669 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@58] - Memory used 5673
[junit] 2016-07-12 23:08:20,669 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@63] - Number of threads 20
[junit] 2016-07-12 23:08:20,669 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@78] - FINISHED TEST METHOD testQuota
[junit] 2016-07-12 23:08:20,669 [myid:] - INFO  [main:ClientBase@520] - 
tearDown starting
[junit] 2016-07-12 23:08:20,741 [myid:] - INFO  [main:ZooKeeper@684] - 
Session: 0x155e15e47c7 closed
[junit] 2016-07-12 23:08:20,741 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@519] - EventThread shut down for 
session: 0x155e15e47c7
[junit] 2016-07-12 23:08:20,742 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2016-07-12 23:08:20,742 [myid:] - INFO  
[NIOServerCxn.Fa

[jira] [Commented] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373894#comment-15373894
 ] 

Michael Han commented on ZOOKEEPER-2152:


Hey Alex, removing zoo_cycle_next_server from the error handling code will not 
fix this case, because its root cause does not involve the 
'handle_error' call we plan to update in ZOOKEEPER-2466. The code path that 
this patch tries to address is a different one, in zookeeper_interest:
{code}
if (*fd == -1) {
  if (zh->delay == 1) {
    // The patch introduces a test flag that makes this branch always taken,
    // instead of the else branch. That makes sense for the reconfig C test
    // case: the servers used for testing are fake, so the error handling
    // branch (the else branch below) would otherwise always be taken. That
    // branch invokes zoo_cycle_next_server, which introduces the data race
    // with the client-side call to zoo_cycle_next_server.
    // ...
  } else {
    // We don't want to execute this branch given how our test servers are
    // set up, unless we simulate and spawn real servers, which involves more
    // work. Using real servers for the reconfig test is probably something
    // good to have someday, though.
  }
}
{code}



Re: Script to spin up ZK cluster

2016-07-12 Thread Flavio Junqueira
Awesome, thanks for doing this, Michael.

-Flavio

> On 12 Jul 2016, at 12:01, Michael Han  wrote:
> 
> Hi Devs,
> 
> I just posted some scripts that can automate installation of ZK on a
> provisioned cluster.
> https://github.com/hanm/zk-devops/tree/master/ansible
> 
> Please give it a try if you are interested.  Feature requests, comments and
> feedbacks are welcome.
> 
> -- 
> Cheers
> Michael.



[jira] [Commented] (ZOOKEEPER-2152) Intermittent failure in TestReconfig.cc

2016-07-12 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373790#comment-15373790
 ] 

Alexander Shraer commented on ZOOKEEPER-2152:
-

Michael, if the fix for ZOOKEEPER-2466 ends up removing the call to 
zoo_cycle_next_server from the error handling code, will this also solve the 
problem you found in this JIRA? If so, we should probably wait for the 
fix there.



[jira] [Commented] (ZOOKEEPER-1045) Support Quorum Peer mutual authentication via SASL

2016-07-12 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373577#comment-15373577
 ] 

Rakesh R commented on ZOOKEEPER-1045:
-

Adding two more points to the above list:

# Please include the {{Java version}} used to compile the ZooKeeper project, 
as 3.4.x supports Java 1.6 and later.
# A few weeks back I tried integrating with the HBase project. I compiled and 
tested HBase "branch-1.2" (https://github.com/apache/hbase.git) with mvn, 
using the zookeeper-3.4.9-SNAPSHOT.jar artifact (which contains the latest 
quorum peer SASL feature code), and got a SUCCESS result. You can probably 
capture this info in the validation report.
I used the following revision for the integration testing:
{{HBase git revision da52e0cdf109199b157ccaedbd891336221c439b}}

> Support Quorum Peer mutual authentication via SASL
> --
>
> Key: ZOOKEEPER-1045
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1045
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>Reporter: Eugene Koontz
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.4.9, 3.5.3
>
> Attachments: 0001-ZOOKEEPER-1045-br-3-4.patch, 
> 1045_failing_phunt.tar.gz, ZK-1045-test-case-failure-logs.zip, 
> ZOOKEEPER-1045-00.patch, ZOOKEEPER-1045-Rolling Upgrade Design Proposal.pdf, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045TestValidationDesign.pdf
>
>
> ZOOKEEPER-938 addresses mutual authentication between clients and servers. 
> This bug, on the other hand, is for authentication among quorum peers. 
> Hopefully much of the work done on SASL integration with Zookeeper for 
> ZOOKEEPER-938 can be used as a foundation for this enhancement.





[jira] [Commented] (ZOOKEEPER-1045) Support Quorum Peer mutual authentication via SASL

2016-07-12 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373532#comment-15373532
 ] 

Rakesh R commented on ZOOKEEPER-1045:
-

Attached new patch with the following changes:

# Fixed [~hanm]'s review comments from the review ticket. Also fixed one 
review comment given in this jira to modify the {{ReadMe.md}} content.
# Since the current patch supports only a single (shared) Kerberos 
principal, I've provided a simple string comparison for authorization. Presently 
{{quorum.auth.kerberos.servicePrincipal}} holds the shared principal value. One 
idea for supporting individual credentials for each server is to extend the 
configuration with the respective {{myid}} value, as shown below. IMHO, we could 
push the basic patch first and discuss the individual credentials logic and its 
authorization part separately, as the current patch is getting bigger and 
bigger. [~phunt], what's your opinion?
{code}
quorum.auth.kerberos.servicePrincipal.1 = "QuorumServer1"
quorum.auth.kerberos.servicePrincipal.2 = "QuorumServer2"
quorum.auth.kerberos.servicePrincipal.3 = "QuorumServer3"
{code}
# Exposed the {{fleTimeTaken}} value via a JMX bean attribute, which can be used to 
see the total time taken for leader election (LE).
# Added a few more unit test cases to cover the newly introduced {{ConfigException}} 
cases.
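
The per-server principal idea in the second point could be read roughly as in the sketch below. This is only an illustration under assumptions: the class {{PrincipalLookup}} and the method {{principalFor}} are hypothetical names that do not come from the patch, and falling back to the shared {{quorum.auth.kerberos.servicePrincipal}} key when no per-{{myid}} entry exists is an assumed policy, not something the patch is known to implement. The principal values are written without the quotes used in the listing above.

```java
import java.util.Properties;

class PrincipalLookup {
    // Shared key named in the comment; the ".<myid>" suffix is the proposed extension.
    static final String SHARED_KEY = "quorum.auth.kerberos.servicePrincipal";

    // Hypothetical helper: prefer the per-server entry, otherwise use the shared principal.
    static String principalFor(Properties cfg, long myid) {
        String perServer = cfg.getProperty(SHARED_KEY + "." + myid);
        return perServer != null ? perServer : cfg.getProperty(SHARED_KEY);
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        cfg.setProperty(SHARED_KEY, "SharedQuorumServer"); // assumed shared value
        cfg.setProperty(SHARED_KEY + ".1", "QuorumServer1");
        cfg.setProperty(SHARED_KEY + ".2", "QuorumServer2");

        System.out.println(principalFor(cfg, 1)); // prints "QuorumServer1"
        System.out.println(principalFor(cfg, 3)); // prints "SharedQuorumServer"
    }
}
```

Keeping the shared key as a fallback would let existing single-principal configurations keep working while individual entries are added one server at a time; whether that is desirable is part of the open discussion.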



Re: Review Request 47354: ZOOKEEPER-1045 : Quorum mutual authentication using SASL mechanism

2016-07-12 Thread Rakesh R

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47354/
---

(Updated July 12, 2016, 7:28 p.m.)


Review request for zookeeper, fpj, Ivan Kelly, Patrick Hunt, and Raul Gutierrez 
Segales.


Bugs: ZOOKEEPER-1045
https://issues.apache.org/jira/browse/ZOOKEEPER-1045


Repository: zookeeper-git


Description
---

Quorum mutual authentication using SASL mechanism - Digest/Kerberos


Diffs (updated)
-

  build.xml ab254b2
  ivy.xml 95b0e5a
  src/java/main/org/apache/zookeeper/Login.java aaa220c
  src/java/main/org/apache/zookeeper/SaslClientCallbackHandler.java PRE-CREATION
  src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java 21ef0fa
  src/java/main/org/apache/zookeeper/server/ZooKeeperSaslServer.java 71870ce
  src/java/main/org/apache/zookeeper/server/auth/SaslServerCallbackHandler.java 2fbd6ed
  src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java 78f3aa6
  src/java/main/org/apache/zookeeper/server/quorum/Follower.java 2aeb0c3
  src/java/main/org/apache/zookeeper/server/quorum/FollowerBean.java fd31fa2
  src/java/main/org/apache/zookeeper/server/quorum/FollowerMXBean.java ded0e1c
  src/java/main/org/apache/zookeeper/server/quorum/Leader.java 40c6748
  src/java/main/org/apache/zookeeper/server/quorum/LeaderBean.java b5a3a10
  src/java/main/org/apache/zookeeper/server/quorum/LeaderMXBean.java bf08104
  src/java/main/org/apache/zookeeper/server/quorum/Learner.java c73a8ee
  src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java 8a748c7
  src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java 20e5f16
  src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 2f0f21b
  src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java 8ae820d
  src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java e9c8007
  src/java/main/org/apache/zookeeper/server/quorum/auth/NullQuorumAuthLearner.java PRE-CREATION
  src/java/main/org/apache/zookeeper/server/quorum/auth/NullQuorumAuthServer.java PRE-CREATION
  src/java/main/org/apache/zookeeper/server/quorum/auth/QuorumAuth.java PRE-CREATION
  src/java/main/org/apache/zookeeper/server/quorum/auth/QuorumAuthLearner.java PRE-CREATION
  src/java/main/org/apache/zookeeper/server/quorum/auth/QuorumAuthServer.java PRE-CREATION
  src/java/main/org/apache/zookeeper/server/quorum/auth/README.md PRE-CREATION
  src/java/main/org/apache/zookeeper/server/quorum/auth/SaslQuorumAuthLearner.java PRE-CREATION
  src/java/main/org/apache/zookeeper/server/quorum/auth/SaslQuorumAuthServer.java PRE-CREATION
  src/java/main/org/apache/zookeeper/server/quorum/auth/SaslQuorumServerCallbackHandler.java PRE-CREATION
  src/java/main/org/apache/zookeeper/util/SecurityUtils.java PRE-CREATION
  src/java/test/data/kerberos/minikdc-krb5.conf PRE-CREATION
  src/java/test/data/kerberos/minikdc.ldiff PRE-CREATION
  src/java/test/org/apache/zookeeper/server/quorum/CnxManagerTest.java 831d3ed
  src/java/test/org/apache/zookeeper/server/quorum/FLEBackwardElectionRoundTest.java c1259d1
  src/java/test/org/apache/zookeeper/server/quorum/FLECompatibilityTest.java 72e4fc9
  src/java/test/org/apache/zookeeper/server/quorum/FLEDontCareTest.java a4c0cb0
  src/java/test/org/apache/zookeeper/server/quorum/FLELostMessageTest.java 39a53ca
  src/java/test/org/apache/zookeeper/server/quorum/LearnerTest.java 2ae57ce
  src/java/test/org/apache/zookeeper/server/quorum/QuorumCnxManagerTest.java PRE-CREATION
  src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerTestBase.java ef552db
  src/java/test/org/apache/zookeeper/server/quorum/Zab1_0Test.java ab8ce42
  src/java/test/org/apache/zookeeper/server/quorum/auth/KerberosSecurityTestcase.java PRE-CREATION
  src/java/test/org/apache/zookeeper/server/quorum/auth/KerberosTestUtils.java PRE-CREATION
  src/java/test/org/apache/zookeeper/server/quorum/auth/MiniKdc.java PRE-CREATION
  src/java/test/org/apache/zookeeper/server/quorum/auth/MiniKdcTest.java PRE-CREATION
  src/java/test/org/apache/zookeeper/server/quorum/auth/QuorumAuthTestBase.java PRE-CREATION
  src/java/test/org/apache/zookeeper/server/quorum/auth/QuorumAuthUpgradeTest.java PRE-CREATION
  src/java/test/org/apache/zookeeper/server/quorum/auth/QuorumDigestAuthTest.java PRE-CREATION
  src/java/test/org/apache/zookeeper/server/quorum/auth/QuorumKerberosAuthTest.java PRE-CREATION
  src/java/test/org/apache/zookeeper/test/FLEPredicateTest.java 8088505
  src/zookeeper.jute 6521e54

Diff: https://reviews.apache.org/r/47354/diff/


Testing
---

Added unit test cases to verify the changes.


Thanks,

Rakesh R



[jira] [Updated] (ZOOKEEPER-1045) Support Quorum Peer mutual authentication via SASL

2016-07-12 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-1045:

Attachment: ZOOKEEPER-1045-br-3-4.patch



Script to spin up ZK cluster

2016-07-12 Thread Michael Han
Hi Devs,

I just posted some scripts that can automate installation of ZK on a
provisioned cluster.
https://github.com/hanm/zk-devops/tree/master/ansible

Please give it a try if you are interested.  Feature requests, comments and
feedback are welcome.

-- 
Cheers
Michael.


[jira] [Updated] (ZOOKEEPER-1045) Support Quorum Peer mutual authentication via SASL

2016-07-12 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-1045:

Attachment: (was: ZOOKEEPER-1045-br-3-4.patch)

> Support Quorum Peer mutual authentication via SASL
> --
>
> Key: ZOOKEEPER-1045
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1045
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>Reporter: Eugene Koontz
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.4.9, 3.5.3
>
> Attachments: 0001-ZOOKEEPER-1045-br-3-4.patch, 
> 1045_failing_phunt.tar.gz, ZK-1045-test-case-failure-logs.zip, 
> ZOOKEEPER-1045-00.patch, ZOOKEEPER-1045-Rolling Upgrade Design Proposal.pdf, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045TestValidationDesign.pdf
>
>
> ZOOKEEPER-938 addresses mutual authentication between clients and servers. 
> This bug, on the other hand, is for authentication among quorum peers. 
> Hopefully much of the work done on SASL integration with Zookeeper for 
> ZOOKEEPER-938 can be used as a foundation for this enhancement.





ApacheCon Europe Call For Papers Open

2016-07-12 Thread Rich Bowen
As you are no doubt already aware, we will be holding ApacheCon in
Seville, Spain, the week of November 14th, 2016. The call for papers
(CFP) for this event is now open, and will remain open until 
September 9th.

The event is divided into two parts, each with its own CFP. The first
part of the event, called Apache Big Data, focuses on Big Data
projects and related technologies.

Website: http://events.linuxfoundation.org/events/apache-big-data-europe
CFP: http://events.linuxfoundation.org/events/apache-big-data-europe/program/cfp

The second part, called ApacheCon Europe, focuses on the Apache
Software Foundation as a whole, covering all projects, community
issues, governance, and so on.

Website: http://events.linuxfoundation.org/events/apachecon-europe
CFP: http://events.linuxfoundation.org/events/apachecon-europe/program/cfp

ApacheCon is the official conference of the Apache Software
Foundation, and is the best place to meet members of your project and
other ASF projects, and strengthen your project's community.

If your organization is interested in sponsoring ApacheCon, contact me
at e...@apache.org. ApacheCon is a great place to find the brightest
developers in the world, and experts on a huge range of technologies.

I hope to see you in Seville!



[jira] [Updated] (ZOOKEEPER-1045) Support Quorum Peer mutual authentication via SASL

2016-07-12 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-1045:

Attachment: ZOOKEEPER-1045-br-3-4.patch



Re: Review Request 47354: ZOOKEEPER-1045 : Quorum mutual authentication using SASL mechanism

2016-07-12 Thread Rakesh R


> On June 28, 2016, 11:56 p.m., Michael Han wrote:
> > src/java/main/org/apache/zookeeper/server/quorum/auth/SaslQuorumAuthServer.java,
> >  line 121
> > 
> >
> > I recommend logging the content of the exception here, which would make 
> > troubleshooting easier. I learned this today from an end user who was trying 
> > to figure out why Kerberos auth was not working (it still does not work, but 
> > it is close): the existing log does not provide much info if the exception 
> > is thrown from here (in my case, the exception is 'GSS initiate failed').

Added error logs like,

LOG.error("Failed to authenticate using SASL", e);
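
To show why passing the exception matters, here is a small self-contained sketch. It uses java.util.logging as a stand-in for the logger in the patch (an assumption made only to avoid extra dependencies), and the class {{SaslAuthLogging}} with its {{authenticate}} and {{tryAuthenticate}} methods is hypothetical. The 'GSS initiate failed' message is the one mentioned in the review comment.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class SaslAuthLogging {
    private static final Logger LOG = Logger.getLogger("SaslAuthLogging");

    // Hypothetical stand-in for an authentication step that fails.
    static void authenticate() throws IOException {
        throw new IOException("GSS initiate failed");
    }

    // Returns false on failure and logs the exception together with the message.
    static boolean tryAuthenticate() {
        try {
            authenticate();
            return true;
        } catch (IOException e) {
            // Passing 'e' keeps the cause message and stack trace in the log record.
            LOG.log(Level.SEVERE, "Failed to authenticate using SASL", e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("authenticated: " + tryAuthenticate());
    }
}
```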


- Rakesh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47354/#review139902
---


On June 26, 2016, 9:13 a.m., Rakesh R wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47354/
> ---
> 
> (Updated June 26, 2016, 9:13 a.m.)
> 
> 
> Review request for zookeeper, fpj, Ivan Kelly, Patrick Hunt, and Raul 
> Gutierrez Segales.
> 
> 
> Bugs: ZOOKEEPER-1045
> https://issues.apache.org/jira/browse/ZOOKEEPER-1045
> 
> 
> Repository: zookeeper-git
> 
> 
> Description
> ---
> 
> Quorum mutual authentication using SASL mechanism - Digest/Kerberos
> 
> 
> Diffs
> -
> 
>   build.xml ab254b2 
>   ivy.xml 95b0e5a 
>   src/java/main/org/apache/zookeeper/Login.java aaa220c 
>   src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java 21ef0fa 
>   src/java/main/org/apache/zookeeper/server/ZooKeeperSaslServer.java 71870ce 
>   
> src/java/main/org/apache/zookeeper/server/auth/SaslServerCallbackHandler.java 
> 2fbd6ed 
>   src/java/main/org/apache/zookeeper/server/quorum/Leader.java 40c6748 
>   src/java/main/org/apache/zookeeper/server/quorum/Learner.java c73a8ee 
>   src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java 
> 8a748c7 
>   src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java 
> 20e5f16 
>   src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 2f0f21b 
>   src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java 
> 8ae820d 
>   src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java 
> e9c8007 
>   
> src/java/main/org/apache/zookeeper/server/quorum/auth/NullQuorumAuthLearner.java
>  PRE-CREATION 
>   
> src/java/main/org/apache/zookeeper/server/quorum/auth/NullQuorumAuthServer.java
>  PRE-CREATION 
>   src/java/main/org/apache/zookeeper/server/quorum/auth/QuorumAuth.java 
> PRE-CREATION 
>   
> src/java/main/org/apache/zookeeper/server/quorum/auth/QuorumAuthLearner.java 
> PRE-CREATION 
>   src/java/main/org/apache/zookeeper/server/quorum/auth/QuorumAuthServer.java 
> PRE-CREATION 
>   src/java/main/org/apache/zookeeper/server/quorum/auth/README.md 
> PRE-CREATION 
>   
> src/java/main/org/apache/zookeeper/server/quorum/auth/SaslQuorumAuthLearner.java
>  PRE-CREATION 
>   
> src/java/main/org/apache/zookeeper/server/quorum/auth/SaslQuorumAuthServer.java
>  PRE-CREATION 
>   src/java/main/org/apache/zookeeper/util/SecurityUtils.java PRE-CREATION 
>   src/java/test/data/kerberos/minikdc-krb5.conf PRE-CREATION 
>   src/java/test/data/kerberos/minikdc.ldiff PRE-CREATION 
>   src/java/test/org/apache/zookeeper/server/quorum/CnxManagerTest.java 
> 831d3ed 
>   
> src/java/test/org/apache/zookeeper/server/quorum/FLEBackwardElectionRoundTest.java
>  c1259d1 
>   src/java/test/org/apache/zookeeper/server/quorum/FLECompatibilityTest.java 
> 72e4fc9 
>   src/java/test/org/apache/zookeeper/server/quorum/FLEDontCareTest.java 
> a4c0cb0 
>   src/java/test/org/apache/zookeeper/server/quorum/FLELostMessageTest.java 
> 39a53ca 
>   src/java/test/org/apache/zookeeper/server/quorum/LearnerTest.java 2ae57ce 
>   src/java/test/org/apache/zookeeper/server/quorum/QuorumCnxManagerTest.java 
> PRE-CREATION 
>   src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerTestBase.java 
> ef552db 
>   src/java/test/org/apache/zookeeper/server/quorum/Zab1_0Test.java ab8ce42 
>   
> src/java/test/org/apache/zookeeper/server/quorum/auth/KerberosSecurityTestcase.java
>  PRE-CREATION 
>   
> src/java/test/org/apache/zookeeper/server/quorum/auth/KerberosTestUtils.java 
> PRE-CREATION 
>   src/java/test/org/apache/zookeeper/server/quorum/auth/MiniKdc.java 
> PRE-CREATION 
>   src/java/test/org/apache/zookeeper/server/quorum/auth/MiniKdcTest.java 
> PRE-CREATION 
>   
> src/java/test/org/apache/zookeeper/server/quorum/auth/QuorumAuthTestBase.java 
> PRE-CREATION 
>   
> src/java/test/org/apache/zookeeper/server/quorum/auth/QuorumAuthUpgradeTest.java
>  PRE-CREATION 
>   
> src/java/test/org/apache/zookeeper/server/quorum/auth/QuorumDigestAuthTest.ja

Re: Review Request 47354: ZOOKEEPER-1045 : Quorum mutual authentication using SASL mechanism

2016-07-12 Thread Rakesh R


> On June 28, 2016, 8:49 p.m., Michael Han wrote:
> > src/java/main/org/apache/zookeeper/server/quorum/auth/QuorumAuthLearner.java,
> >  line 34
> > 
> >
> > The return value in the function signature is void, and the comment is 
> > saying it should be boolean. So this should be fixed.

Removed return value from the javadoc


> On June 28, 2016, 8:49 p.m., Michael Han wrote:
> > src/java/main/org/apache/zookeeper/server/quorum/auth/QuorumAuthServer.java,
> >  line 37
> > 
> >
> > The return value in the function signature is void, and the comment is 
> > saying it should be boolean. So this should be fixed.

Removed return value from the javadoc


- Rakesh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47354/#review139854
---



[jira] [Commented] (ZOOKEEPER-2368) Client watches are not disconnected on close

2016-07-12 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373389#comment-15373389
 ] 

Camille Fournier commented on ZOOKEEPER-2368:
-

Yeah, I am interested in the perspective of "is this the right thing to do for 
clients, and what will it do to existing client libraries like Curator?"

> Client watches are not disconnected on close
> 
>
> Key: ZOOKEEPER-2368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2368
> Project: ZooKeeper
>  Issue Type: Improvement
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Timothy Ward
> Fix For: 3.5.2
>
> Attachments: ZOOKEEPER-2368.patch
>
>
> If I have a ZooKeeper client connected to an ensemble then obviously I can 
> register watches. 
> If the client is disconnected (for example by a failing ensemble member) then 
> I get a disconnection event for all of my watches. If, on the other hand, my 
> client is closed then I *do not* get a disconnection event. This asymmetry 
> makes it really hard to clear up properly when using the asynchronous API, as 
> there is no way to "fail" data reads/updates when the client is closed.
> I believe that the correct behaviour should be for all watchers to receive a 
> disconnection event when the client is closed. The watchers can then respond 
> as appropriate, and can differentiate between a "server disconnect" and a 
> "client disconnect" by checking the ZooKeeper#getState() method. 
> This would not be a breaking behaviour change as Watchers are already 
> required to handle disconnection events.





ZooKeeper_branch35_solaris - Build # 168 - Still Failing

2016-07-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_solaris/168/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 419121 lines...]
[junit] 2016-07-12 17:26:42,041 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-07-12 17:26:42,042 [myid:] - INFO  [main:ClientBase@466] - 
STARTING server
[junit] 2016-07-12 17:26:42,042 [myid:] - INFO  [main:ClientBase@386] - 
CREATING server instance 127.0.0.1:11222
[junit] 2016-07-12 17:26:42,043 [myid:] - INFO  
[main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2016-07-12 17:26:42,044 [myid:] - INFO  
[main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222
[junit] 2016-07-12 17:26:42,044 [myid:] - INFO  [main:ClientBase@361] - 
STARTING server instance 127.0.0.1:11222
[junit] 2016-07-12 17:26:42,045 [myid:] - INFO  [main:ZooKeeperServer@858] 
- minSessionTimeout set to 6000
[junit] 2016-07-12 17:26:42,045 [myid:] - INFO  [main:ZooKeeperServer@867] 
- maxSessionTimeout set to 6
[junit] 2016-07-12 17:26:42,045 [myid:] - INFO  [main:ZooKeeperServer@156] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test28433209390412.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test28433209390412.junit.dir/version-2
[junit] 2016-07-12 17:26:42,046 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test28433209390412.junit.dir/version-2/snapshot.b
[junit] 2016-07-12 17:26:42,048 [myid:] - INFO  [main:FileTxnSnapLog@298] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test28433209390412.junit.dir/version-2/snapshot.b
[junit] 2016-07-12 17:26:42,050 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 11222
[junit] 2016-07-12 17:26:42,050 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:50566
[junit] 2016-07-12 17:26:42,051 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from 
/127.0.0.1:50566
[junit] 2016-07-12 17:26:42,051 [myid:] - INFO  
[NIOWorkerThread-1:StatCommand@49] - Stat command output
[junit] 2016-07-12 17:26:42,051 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:50566 (no session established for client)
[junit] 2016-07-12 17:26:42,051 [myid:] - INFO  [main:JMXEnv@228] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-07-12 17:26:42,053 [myid:] - INFO  [main:JMXEnv@245] - 
expect:InMemoryDataTree
[junit] 2016-07-12 17:26:42,053 [myid:] - INFO  [main:JMXEnv@249] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
[junit] 2016-07-12 17:26:42,053 [myid:] - INFO  [main:JMXEnv@245] - 
expect:StandaloneServer_port
[junit] 2016-07-12 17:26:42,053 [myid:] - INFO  [main:JMXEnv@249] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222
[junit] 2016-07-12 17:26:42,054 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 17734
[junit] 2016-07-12 17:26:42,054 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24
[junit] 2016-07-12 17:26:42,054 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testQuota
[junit] 2016-07-12 17:26:42,054 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-07-12 17:26:42,132 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x1225d21d966 closed
[junit] 2016-07-12 17:26:42,132 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x1225d21d966
[junit] 2016-07-12 17:26:42,132 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-07-12 17:26:42,133 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@219]
 - accept thread exitted run method
[junit] 2016-07-12 17:26:42,133 [myid:] - INFO  
[ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - 
ConnnectionExpirerThread interrupted
[junit] 2016-07-12 17:26:42,133 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-1:NIOServ

[jira] [Comment Edited] (ZOOKEEPER-2368) Client watches are not disconnected on close

2016-07-12 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373274#comment-15373274
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2368 at 7/12/16 5:13 PM:
--

When I get a chance I can run Curator's tests on this. Or maybe Timothy can do 
that. For Curator, it already handles shutdown internally for all of its 
recipes (assuming correct usage). My only concern is that the Disconnect event 
would occur out-of-band from the ZooKeeper closure (i.e. a different thread at 
a different point in time).


was (Author: randgalt):
When I get a change I can run Curator's tests on this. Or maybe Timothy can do 
that. For Curator, it already handles shutdown internally for all of its 
recipes (assuming correct usage). My only concern is that the Disconnect event 
would occur out-of-band from the ZooKeeper closure (i.e. a different thread at 
a different point in time).
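
The out-of-band concern in the comment above can be illustrated with plain `java.util.concurrent` classes. This is a hedged sketch, not ZooKeeper or Curator code: the single-thread executor named `event-thread` is an assumption standing in for a client's event thread, and `deliverCloseEvent` is a hypothetical helper. It shows that an event queued at close time runs on a different thread from the caller, so the caller has to wait explicitly if it needs to know the event was delivered.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OutOfBandCloseDemo {
    // Queues a "disconnect" event at close time and returns the name of the
    // thread that actually delivered it.
    static String deliverCloseEvent() throws InterruptedException {
        // Assumed stand-in for a client's dedicated event thread.
        ExecutorService eventThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "event-thread"));
        CountDownLatch delivered = new CountDownLatch(1);
        String[] deliveredOn = new String[1];

        // "close": the event is only queued here; the caller does not run it.
        eventThread.execute(() -> {
            deliveredOn[0] = Thread.currentThread().getName();
            delivered.countDown();
        });
        // Already queued tasks still run after shutdown().
        eventThread.shutdown();

        // Without this wait the caller could continue before the watcher fires.
        if (!delivered.await(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("event was not delivered in time");
        }
        return deliveredOn[0];
    }

    public static void main(String[] args) throws InterruptedException {
        String caller = Thread.currentThread().getName();
        String deliverer = deliverCloseEvent();
        System.out.println("close() called on " + caller
                + ", event delivered on " + deliverer);
    }
}
```

The latch is what makes the hand-off observable. A recipe that already performs its own shutdown would need to tolerate the late event arriving after that shutdown has finished.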


[jira] [Commented] (ZOOKEEPER-2368) Client watches are not disconnected on close

2016-07-12 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373274#comment-15373274
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2368:
-

When I get a chance I can run Curator's tests on this. Or maybe Timothy can do 
that. For Curator, it already handles shutdown internally for all of its 
recipes (assuming correct usage). My only concern is that the Disconnect event 
would occur out-of-band from the ZooKeeper closure (i.e. a different thread at 
a different point in time).



[jira] [Commented] (ZOOKEEPER-2368) Client watches are not disconnected on close

2016-07-12 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373118#comment-15373118
 ] 

Camille Fournier commented on ZOOKEEPER-2368:
-

Hey [~randgalt] (or others) can you talk through whether this makes sense from 
a client impl perspective? I'm not sure and would appreciate a set of eyes from 
someone deeper in client logic.



[jira] [Updated] (ZOOKEEPER-2368) Client watches are not disconnected on close

2016-07-12 Thread Camille Fournier (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Camille Fournier updated ZOOKEEPER-2368:

Issue Type: Improvement  (was: Bug)



ZooKeeper_branch34_solaris - Build # 1215 - Failure

2016-07-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_solaris/1215/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 387 lines...]
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:162)
at hudson.scm.SubversionSCM$CheckOutTask.perform(SubversionSCM.java:988)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:969)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:945)
at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2719)
at hudson.remoting.UserRequest.perform(UserRequest.java:120)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:326)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.tmatesoft.svn.core.SVNException: svn: E175002: unknown host
svn: E175002: OPTIONS request failed on 
'/repos/asf/zookeeper/branches/branch-3.4'
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:106)
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:90)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:798)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:398)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:386)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.performHttpRequest(DAVConnection.java:863)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.exchangeCapabilities(DAVConnection.java:699)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.open(DAVConnection.java:118)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.openConnection(DAVRepository.java:1049)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.getRepositoryUUID(DAVRepository.java:149)
at 
org.tmatesoft.svn.core.internal.wc16.SVNBasicDelegate.createRepository(SVNBasicDelegate.java:339)
at 
org.tmatesoft.svn.core.internal.wc16.SVNBasicDelegate.createRepository(SVNBasicDelegate.java:328)
at 
org.tmatesoft.svn.core.internal.wc16.SVNUpdateClient16.update(SVNUpdateClient16.java:482)
at 
org.tmatesoft.svn.core.internal.wc16.SVNUpdateClient16.doUpdate(SVNUpdateClient16.java:364)
at 
org.tmatesoft.svn.core.internal.wc16.SVNUpdateClient16.doUpdate(SVNUpdateClient16.java:274)
at 
org.tmatesoft.svn.core.internal.wc2.old.SvnOldUpdate.run(SvnOldUpdate.java:27)
at 
org.tmatesoft.svn.core.internal.wc2.old.SvnOldUpdate.run(SvnOldUpdate.java:11)
at 
org.tmatesoft.svn.core.internal.wc2.SvnOperationRunner.run(SvnOperationRunner.java:21)
at 
org.tmatesoft.svn.core.wc2.SvnOperationFactory.run(SvnOperationFactory.java:1235)
at org.tmatesoft.svn.core.wc2.SvnOperation.run(SvnOperation.java:294)
at 
org.tmatesoft.svn.core.wc.SVNUpdateClient.doUpdate(SVNUpdateClient.java:311)
at 
org.tmatesoft.svn.core.wc.SVNUpdateClient.doUpdate(SVNUpdateClient.java:291)
at 
org.tmatesoft.svn.core.wc.SVNUpdateClient.doUpdate(SVNUpdateClient.java:387)
at 
hudson.scm.subversion.UpdateUpdater$TaskImpl.perform(UpdateUpdater.java:158)
... 13 more
Caused by: java.net.UnknownHostException: svn.apache.org
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at 
java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1295)
at java.net.InetAddress.getAllByName0(InetAddress.java:1248)
at java.net.InetAddress.getAllByName(InetAddress.java:1164)
at java.net.InetAddress.getAllByName(InetAddress.java:1098)
at java.net.InetAddress.getByName(InetAddress.java:1048)
at 
org.tmatesoft.svn.core.internal.util.SVNSocketFactory.createAddres(SVNSocketFactory.java:189)
at 
org.tmatesoft.svn.core.internal.util.SVNSocketFactory.createPlainSocket(SVNSocketFactory.java:66)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.connect(HTTPConnection.java:282)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:461)
... 34 more
Recording test results
ERROR: Step ‘Publish JUnit test result report’ failed: Test reports were found 
but none of them are new. Did tests run? 
For example, 
/export/home/hudson/hud

[jira] [Commented] (ZOOKEEPER-2368) Client watches are not disconnected on close

2016-07-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372742#comment-15372742
 ] 

Hadoop QA commented on ZOOKEEPER-2368:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12817393/ZOOKEEPER-2368.patch
  against trunk revision 1750739.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3273//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3273//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3273//console

This message is automatically generated.

> Client watches are not disconnected on close
> 
>
> Key: ZOOKEEPER-2368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2368
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.0, 3.5.0
> Reporter: Timothy Ward
> Fix For: 3.5.2
>
> Attachments: ZOOKEEPER-2368.patch
>
>
> If I have a ZooKeeper client connected to an ensemble then obviously I can 
> register watches. 
> If the client is disconnected (for example by a failing ensemble member) then 
> I get a disconnection event for all of my watches. If, on the other hand, my 
> client is closed then I *do not* get a disconnection event. This asymmetry 
> makes it really hard to clear up properly when using the asynchronous API, as 
> there is no way to "fail" data reads/updates when the client is closed.
> I believe that the correct behaviour should be for all watchers to receive a 
> disconnection event when the client is closed. The watchers can then respond 
> as appropriate, and can differentiate between a "server disconnect" and a 
> "client disconnect" by checking the ZooKeeper#getState() method. 
> This would not be a breaking behaviour change as Watchers are already 
> required to handle disconnection events.





Success: ZOOKEEPER-2368 PreCommit Build #3273

2016-07-12 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2368
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3273/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 442785 lines...]
 [exec]   
http://issues.apache.org/jira/secure/attachment/12817393/ZOOKEEPER-2368.patch
 [exec]   against trunk revision 1750739.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3273//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3273//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3273//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 7a0f94e9132a42591bfdca5a116c49bbb411b749 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD SUCCESSFUL
Total time: 20 minutes 21 seconds
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording test results
[description-setter] Description set: ZOOKEEPER-2368
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (ZOOKEEPER-2368) Client watches are not disconnected on close

2016-07-12 Thread Timothy Ward (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Ward updated ZOOKEEPER-2368:

Attachment: ZOOKEEPER-2368.patch



ZooKeeper_branch35_jdk7 - Build # 582 - Still Failing

2016-07-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_jdk7/582/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 447902 lines...]
 [exec] Log Message Received: [2016-07-12 
09:40:07,824:29878(0x2b1406922540):ZOO_INFO@testLogCallbackInit@993: 
testLogCallbackInit #8]
 [exec] Log Message Received: [2016-07-12 
09:40:07,824:29878(0x2b1406922540):ZOO_INFO@testLogCallbackInit@993: 
testLogCallbackInit #9]
 [exec] Log Message Received: [2016-07-12 
09:40:07,824:29878(0x2b1406922540):ZOO_INFO@zookeeper_close@3257: Closing 
zookeeper sessionId=0x19d49f8000e to [127.0.0.1:22181]
 [exec] ]
 [exec]  : elapsed 1001 : OK
 [exec] Zookeeper_simpleSystem::testLogCallbackClearLog Message Received: 
[2016-07-12 09:40:07,825:29878(0x2b1406922540):ZOO_INFO@log_env@1027: Client 
environment:zookeeper.version=zookeeper C client 3.5.2]
 [exec] Log Message Received: [2016-07-12 
09:40:07,825:29878(0x2b1406922540):ZOO_INFO@log_env@1031: Client 
environment:host.name=jenkins-test-2ef]
 [exec] Log Message Received: [2016-07-12 
09:40:07,825:29878(0x2b1406922540):ZOO_INFO@log_env@1038: Client 
environment:os.name=Linux]
 [exec] Log Message Received: [2016-07-12 
09:40:07,825:29878(0x2b1406922540):ZOO_INFO@log_env@1039: Client 
environment:os.arch=3.13.0-30-generic]
 [exec] Log Message Received: [2016-07-12 
09:40:07,825:29878(0x2b1406922540):ZOO_INFO@log_env@1040: Client 
environment:os.version=#54-Ubuntu SMP Mon Jun 9 22:45:01 UTC 2014]
 [exec] Log Message Received: [2016-07-12 
09:40:07,825:29878(0x2b1406922540):ZOO_INFO@log_env@1048: Client 
environment:user.name=jenkins]
 [exec] Log Message Received: [2016-07-12 
09:40:07,825:29878(0x2b1406922540):ZOO_INFO@log_env@1056: Client 
environment:user.home=/home/jenkins]
 [exec] Log Message Received: [2016-07-12 
09:40:07,825:29878(0x2b1406922540):ZOO_INFO@log_env@1068: Client 
environment:user.dir=/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch35_jdk7/branch-3.5/build/test/test-cppunit]
 [exec] Log Message Received: [2016-07-12 
09:40:07,825:29878(0x2b1406922540):ZOO_INFO@zookeeper_init_internal@: 
Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 
watcher=0x45d2a0 sessionId=0 sessionPasswd= context=0x7fff2bb26010 
flags=0]
 [exec] Log Message Received: [2016-07-12 
09:40:07,826:29878(0x2b1408fb0700):ZOO_INFO@check_events@2357: initiated 
connection to server [127.0.0.1:22181]]
 [exec] Log Message Received: [2016-07-12 
09:40:07,828:29878(0x2b1408fb0700):ZOO_INFO@check_events@2409: session 
establishment complete on server [127.0.0.1:22181], 
sessionId=0x19d49f8000f, negotiated timeout=1 ]
 [exec]  : elapsed 1001 : OK
 [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server 
started : elapsed 10569 : OK
 [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK
 [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1001 : OK
 [exec] Zookeeper_simpleSystem::testNullData : elapsed 1018 : OK
 [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1003 : OK
 [exec] Zookeeper_simpleSystem::testCreate : elapsed 1005 : OK
 [exec] Zookeeper_simpleSystem::testPath : elapsed 1010 : OK
 [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1027 : OK
 [exec] Zookeeper_simpleSystem::testPing : elapsed 17145 : OK
 [exec] Zookeeper_simpleSystem::testAcl : elapsed 1010 : OK
 [exec] Zookeeper_simpleSystem::testChroot : elapsed 3025 : OK
 [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper 
server started : elapsed 31138 : OK
 [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1019 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
15671 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
15749 : OK
 [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1022 : OK
 [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4510 : OK
 [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started 
: elapsed 4686 : OK
 [exec] Zookeeper_readOnly::testReadOnly : elapsed 4288 : OK
 [exec] 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests/TestReconfig.cc:183:
 Assertion: equality assertion failed [Expected: 1, Actual  : 0]
 [exec] Failures !!!
 [exec] Run: 72   Failure total: 1   Failures: 1   Errors: 0
 [exec] FAIL: zktest-mt
 [exec] make[1]: *** [check-TESTS] Error 1
 [exec] make: *** [check-am] Error 2
 [exec] ==
 [exec] 1 of 2 tests failed
 [exec] Please report to u...@zookeeper.apache.org
 [exe

ZooKeeper-trunk-solaris - Build # 1227 - Still Failing

2016-07-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/1227/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 387 lines...]
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:162)
at hudson.scm.SubversionSCM$CheckOutTask.perform(SubversionSCM.java:988)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:969)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:945)
at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2719)
at hudson.remoting.UserRequest.perform(UserRequest.java:120)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:326)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.tmatesoft.svn.core.SVNException: svn: E175002: unknown host
svn: E175002: OPTIONS request failed on '/repos/asf/zookeeper/trunk'
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:106)
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:90)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:798)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:398)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:386)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.performHttpRequest(DAVConnection.java:863)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.exchangeCapabilities(DAVConnection.java:699)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.open(DAVConnection.java:118)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.openConnection(DAVRepository.java:1049)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.getRepositoryUUID(DAVRepository.java:149)
at 
org.tmatesoft.svn.core.internal.wc16.SVNBasicDelegate.createRepository(SVNBasicDelegate.java:339)
at 
org.tmatesoft.svn.core.internal.wc16.SVNBasicDelegate.createRepository(SVNBasicDelegate.java:328)
at 
org.tmatesoft.svn.core.internal.wc16.SVNUpdateClient16.update(SVNUpdateClient16.java:482)
at 
org.tmatesoft.svn.core.internal.wc16.SVNUpdateClient16.doUpdate(SVNUpdateClient16.java:364)
at 
org.tmatesoft.svn.core.internal.wc16.SVNUpdateClient16.doUpdate(SVNUpdateClient16.java:274)
at 
org.tmatesoft.svn.core.internal.wc2.old.SvnOldUpdate.run(SvnOldUpdate.java:27)
at 
org.tmatesoft.svn.core.internal.wc2.old.SvnOldUpdate.run(SvnOldUpdate.java:11)
at 
org.tmatesoft.svn.core.internal.wc2.SvnOperationRunner.run(SvnOperationRunner.java:21)
at 
org.tmatesoft.svn.core.wc2.SvnOperationFactory.run(SvnOperationFactory.java:1235)
at org.tmatesoft.svn.core.wc2.SvnOperation.run(SvnOperation.java:294)
at 
org.tmatesoft.svn.core.wc.SVNUpdateClient.doUpdate(SVNUpdateClient.java:311)
at 
org.tmatesoft.svn.core.wc.SVNUpdateClient.doUpdate(SVNUpdateClient.java:291)
at 
org.tmatesoft.svn.core.wc.SVNUpdateClient.doUpdate(SVNUpdateClient.java:387)
at 
hudson.scm.subversion.UpdateUpdater$TaskImpl.perform(UpdateUpdater.java:158)
... 13 more
Caused by: java.net.UnknownHostException: svn.apache.org
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at 
java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1295)
at java.net.InetAddress.getAllByName0(InetAddress.java:1248)
at java.net.InetAddress.getAllByName(InetAddress.java:1164)
at java.net.InetAddress.getAllByName(InetAddress.java:1098)
at java.net.InetAddress.getByName(InetAddress.java:1048)
at 
org.tmatesoft.svn.core.internal.util.SVNSocketFactory.createAddres(SVNSocketFactory.java:189)
at 
org.tmatesoft.svn.core.internal.util.SVNSocketFactory.createPlainSocket(SVNSocketFactory.java:66)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.connect(HTTPConnection.java:282)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:461)
... 34 more
Recording test results
ERROR: Step ‘Publish JUnit test result report’ failed: Test reports were found 
but none of them are new. Did tests run? 
For example, 
/export/home/hudson/hudson-slave/workspac