from:"Henry Robinson"

[
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Henry Robinson updated ZOOKEEPER-769:
-

Status: Resolved (was: Patch Available)
Resolution: Fixed

I just committed this - thanks Sergey!

Leader can treat observers as quorum members

Key: ZOOKEEPER-769
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-769
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.3.0
Environment: Ubuntu Karmic x64
Reporter: Sergey Doroshenko
Assignee: Sergey Doroshenko
Fix For: 3.4.0

Attachments: follower.log, leader.log, observer.log, warning.patch,
zoo1.cfg, ZOOKEEPER-769.patch, ZOOKEEPER-769.patch

In short: it seems the leader can treat observers as quorum members.
Steps to repro:
1. Server configuration: 3 voters, 2 observers (attached).
2. Bring up 2 voters and one observer. That is enough for a quorum.
3. Shut down the quorum member that is the follower.
As I understand it, the expected result is that the leader starts a new
election round in order to regain quorum.
What actually happens is that it just says goodbye to that follower and
remains operable. (When I shut down the third server, the observer, the
leader starts trying to regain a quorum.)
(As expected, if in step 3 we shut down the leader rather than the follower,
the remaining follower starts a new leader election, as it should.)
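
The expectation in this report can be sketched in a few lines of Java. This is
an illustration only, not ZooKeeper's actual leader or quorum-verifier code:
the class name, the method name and the server ids are assumptions. It shows a
majority check that counts voting members only, so a connected observer never
helps the leader keep its quorum.

```java
import java.util.Set;

/** Illustrative only: not ZooKeeper's actual Leader or QuorumVerifier code. */
final class VoterQuorumSketch {

    /**
     * Returns true when the connected servers (leader included) contain a
     * strict majority of the voting members. Observers are never counted.
     */
    static boolean hasQuorum(Set<Long> votingMembers, Set<Long> connected) {
        long votersPresent = connected.stream()
                .filter(votingMembers::contains)
                .count();
        return votersPresent > votingMembers.size() / 2;
    }

    public static void main(String[] args) {
        // Hypothetical ids: 1, 2 and 4 vote; 3 and 5 are observers.
        Set<Long> voters = Set.of(1L, 2L, 4L);
        // Leader 1, follower 2 and observer 3 are up: two of three voters.
        System.out.println(hasQuorum(voters, Set.of(1L, 2L, 3L)));
        // Follower 2 is shut down: one voter plus one observer remain.
        System.out.println(hasQuorum(voters, Set.of(1L, 3L)));
    }
}
```

Under these assumptions the first check succeeds and the second fails, which
is the point at which the reporter expected the leader to give up leadership.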

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: [PATCH] javaclient: validate sessionTimeout field at ZooKeeper init (JIRA ZOOKEEPER-776)

2010-05-21 Thread Henry Robinson

Hi Greg -

Thanks very much for contributing! We've got some guidelines here:
http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute - let me know if
they're not clear.

The main thing for you to do is to attach your patch to the JIRA and click
the 'Licensed for inclusion into Apache projects' button when you do. You
can do this by clicking 'Attach patch' on the JIRA itself. Once you've done
that, please click 'Submit patch' to kick off our automated QA procedures.

Assuming all goes well, a committer will pick up the baton from there and
get the patch into trunk (or let you know if they think changes are
necessary).

Thanks!

Henry

On 21 May 2010 12:22, Gregory Haskins gregory.hask...@gmail.com wrote:

 Hi All,

 First patch submission for me.  If there are any patch submission
 guidelines I should follow, kindly point me at them and accept my
 apology if this approach violates any established procedures.  I didn't
 find anything obvious on the site wiki, so I just used some practices
 learned on other projects.

 -Greg

 

 commit 840f56d388582e1df39f7513aa7f4d4ce0610718
 Author: Gregory Haskins ghask...@novell.com
 Date:   Fri May 21 14:58:14 2010 -0400

javaclient: validate sessionTimeout field at ZooKeeper init

JIRA ZOOKEEPER-776 describes the following problem:

passing in a 0 sessionTimeout to ZooKeeper() constructor leads to errors
in subsequent operations. It would be ideal to capture this configuration
error at the source by throwing something like an IllegalArgument exception
when the bogus sessionTimeout is specified, instead of later when it is
utilized.

This patch is a proposal to fix the problem referenced above.

Applies to svn-id: 946074

Signed-off-by: Gregory Haskins ghask...@novell.com

 diff --git a/src/java/main/org/apache/zookeeper/ClientCnxn.java
 b/src/java/main/
 index 8eb227d..682811b 100644
 --- a/src/java/main/org/apache/zookeeper/ClientCnxn.java
 +++ b/src/java/main/org/apache/zookeeper/ClientCnxn.java
 @@ -353,6 +353,11 @@ public class ClientCnxn {
          this.sessionId = sessionId;
          this.sessionPasswd = sessionPasswd;
 
 +        if (sessionTimeout <= 0) {
 +            throw new IOException("sessionTimeout " + sessionTimeout
 +                    + " is not valid");
 +        }
 +
          // parse out chroot, if any
          int off = hosts.indexOf('/');
          if (off >= 0) {




-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

[jira] Commented: (ZOOKEEPER-776) API should sanity check sessionTimeout argument


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870152#action_12870152
 ] 

Henry Robinson commented on ZOOKEEPER-776:
--

Thanks Greg - can you generate your patch from git with --no-prefix, to make it 
svn compatible?

 API should sanity check sessionTimeout argument
 ---

 Key: ZOOKEEPER-776
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-776
 Project: Zookeeper
  Issue Type: Improvement
  Components: c client, java client
Affects Versions: 3.2.2, 3.3.0, 3.3.1
 Environment: OSX 10.6.3, JVM 1.6.0-20
Reporter: Gregory Haskins
Priority: Minor
 Fix For: 3.4.0

 Attachments: zookeeper-776-fix.patch


 passing in a 0 sessionTimeout to ZooKeeper() constructor leads to errors in 
 subsequent operations.  It would be ideal to capture this configuration error 
 at the source by throwing something like an IllegalArgument exception when 
 the bogus sessionTimeout is specified, instead of later when it is utilized.
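
 A minimal sketch of the check this issue asks for, written as a stand-alone
 helper rather than as the real ZooKeeper constructor. The class and method
 names are hypothetical, and rejecting negative values as well as zero is an
 assumption; the issue itself only mentions 0.

```java
/** Illustrative only: a stand-in for the client constructor, not the real ZooKeeper class. */
final class SessionTimeoutCheck {

    /** Returns the timeout unchanged, or rejects it up front if it is not positive. */
    static int requirePositiveTimeout(int sessionTimeoutMs) {
        if (sessionTimeoutMs <= 0) {
            // Fail at the call site instead of in a later operation.
            throw new IllegalArgumentException(
                    "sessionTimeout " + sessionTimeoutMs + " is not valid");
        }
        return sessionTimeoutMs;
    }

    public static void main(String[] args) {
        System.out.println(requirePositiveTimeout(30000));
        try {
            requirePositiveTimeout(0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

 An unchecked IllegalArgumentException matches the wording of the issue; a
 checked exception would instead surface through the constructor's existing
 throws clause. Either choice reports the bad value at the source.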


[jira] Commented: (ZOOKEEPER-776) API should sanity check sessionTimeout argument


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870164#action_12870164
 ] 

Henry Robinson commented on ZOOKEEPER-776:
--

Cancelling the patch is fine but there's no need to delete it - Hudson will 
always figure out what the latest patch is and it's good to see how a ticket 
evolved.

Tests will also help :)


[jira] Commented: (ZOOKEEPER-776) API should sanity check sessionTimeout argument


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870179#action_12870179
 ] 

Henry Robinson commented on ZOOKEEPER-776:
--

Greg - 

Don't worry - you should have seen the hash I made of my first patch!

Hudson is misbehaving at the moment, so I'm not convinced that the test 
failures are as a result of your patch. You don't need to do anything right now 
- I'll take a look and update this ticket once I know what's going on.

cheers,
Henry


[jira] Updated: (ZOOKEEPER-769) Leader can treat observers as quorum members

2010-05-20 Thread Henry Robinson (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-769:
-

Status: Open  (was: Patch Available)


[jira] Updated: (ZOOKEEPER-769) Leader can treat observers as quorum members

2010-05-20 Thread Henry Robinson (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-769:
-

  Status: Patch Available  (was: Open)
Hadoop Flags: [Reviewed]

hudson? hello?


[jira] Commented: (ZOOKEEPER-769) Leader can treat observers as quorum members

2010-05-20 Thread Henry Robinson (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12869822#action_12869822
]

Henry Robinson commented on ZOOKEEPER-769:
--

Failures do not look related to this patch (although I could be mistaken).
ZkDatabaseCorruptionTest is the most recent broken test - passes fine for me
locally?

Leader can treat observers as quorum members

Attachments: follower.log, leader.log, observer.log, warning.patch,
zoo1.cfg, ZOOKEEPER-769.patch, ZOOKEEPER-769.patch


[jira] Commented: (ZOOKEEPER-769) Leader can treat observers as quorum members

[
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868780#action_12868780
]

Henry Robinson commented on ZOOKEEPER-769:
--

Sergey - sorry for the delay. It's on me to review this patch, and then I'll
commit it.

Thanks for your patience!

Henry

Leader can treat observers as quorum members

Attachments: follower.log, leader.log, observer.log, warning.patch,
zoo1.cfg, ZOOKEEPER-769.patch


[jira] Updated: (ZOOKEEPER-769) Leader can treat observers as quorum members

[
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Henry Robinson updated ZOOKEEPER-769:
-

Attachment: ZOOKEEPER-769.patch

I made a few small changes to your patch to make the logic a little easier to
follow. Take a look and let me know if you think this is ok, otherwise I'll
commit the patch tomorrow. Thanks!

Henry

Leader can treat observers as quorum members

Attachments: follower.log, leader.log, observer.log, warning.patch,
zoo1.cfg, ZOOKEEPER-769.patch, ZOOKEEPER-769.patch


[jira] Updated: (ZOOKEEPER-769) Leader can treat observers as quorum members


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-769:
-

Status: Open  (was: Patch Available)


[jira] Updated: (ZOOKEEPER-769) Leader can treat observers as quorum members


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-769:
-

Status: Patch Available  (was: Open)


[jira] Assigned: (ZOOKEEPER-772) zkpython segfaults when watcher from async get children is invoked.


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson reassigned ZOOKEEPER-772:


Assignee: Henry Robinson

 zkpython segfaults when watcher from async get children is invoked.
 ---

 Key: ZOOKEEPER-772
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-772
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
 Environment: ubuntu lucid (10.04) / zk trunk
Reporter: Kapil Thangavelu
Assignee: Henry Robinson
 Attachments: asyncgetchildren.py, zkpython-testasyncgetchildren.diff


 When utilizing the zkpython async get children api with a watch, I
 consistently get segfaults when the watcher is invoked to process events.


[jira] Updated: (ZOOKEEPER-772) zkpython segfaults when watcher from async get children is invoked.


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-772:
-

Attachment: ZOOKEEPER-772.patch

Bug was simple when I got round to looking - was incorrectly reusing a watcher 
that was getting deallocated before getting called.

 zkpython segfaults when watcher from async get children is invoked.
 ---

 Key: ZOOKEEPER-772
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-772
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
 Environment: ubuntu lucid (10.04) / zk trunk
Reporter: Kapil Thangavelu
Assignee: Henry Robinson
 Attachments: asyncgetchildren.py, zkpython-testasyncgetchildren.diff, 
 ZOOKEEPER-772.patch


 When utilizing the zkpython async get children api with a watch, I
 consistently get segfaults when the watcher is invoked to process events.


[jira] Updated: (ZOOKEEPER-772) zkpython segfaults when watcher from async get children is invoked.


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-772:
-

Status: Patch Available  (was: Open)


[jira] Updated: (ZOOKEEPER-772) zkpython segfaults when watcher from async get children is invoked.


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-772:
-

Status: Open  (was: Patch Available)

 zkpython segfaults when watcher from async get children is invoked.
 ---

 Key: ZOOKEEPER-772
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-772
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
 Environment: ubuntu lucid (10.04) / zk trunk
Reporter: Kapil Thangavelu
Assignee: Henry Robinson
 Attachments: asyncgetchildren.py, zkpython-testasyncgetchildren.diff, 
 ZOOKEEPER-772.patch, ZOOKEEPER-772.patch


 When utilizing the zkpython async get children api with a watch, I
 consistently get segfaults when the watcher is invoked to process events.


[jira] Updated: (ZOOKEEPER-772) zkpython segfaults when watcher from async get children is invoked.


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-772:
-

Status: Patch Available  (was: Open)


[jira] Updated: (ZOOKEEPER-772) zkpython segfaults when watcher from async get children is invoked.


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-772:
-

Attachment: ZOOKEEPER-772.patch

--no-prefix, predictably.


Re: [VOTE] Release ZooKeeper 3.3.1 (candidate 0)

2010-05-12 Thread Henry Robinson

+1, Java tests pass for me, as do Python ones.

Henry

On 11 May 2010 22:32, Patrick Hunt ph...@apache.org wrote:

 +1, tests pass for me, also verified that nc/zktop worked properly on a
 real cluster (4letter word fix).

 Patrick


 On 05/07/2010 11:25 AM, Patrick Hunt wrote:

 I've created a candidate build for ZooKeeper 3.3.1. This is a bug fix
 release addressing seventeen issues (one critical) -- see the release
 notes for details.

 *** Please download, test and VOTE before the
 *** vote closes 11am pacific time, Wednesday, May 12.***

 http://people.apache.org/~phunt/zookeeper-3.3.1-candidate-0/

 Should we release this?

 Patrick









[jira] Commented: (ZOOKEEPER-679) Offers a node design for interacting with the Java Zookeeper client.

2010-05-09 Thread Henry Robinson (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865639#action_12865639
]

Henry Robinson commented on ZOOKEEPER-679:
--

Hi Aaron -

The great thing about open source, and the relatively permissive Apache license
in particular, is that Chris is free to copy any and all of ZK into github and
continue with a development process that he finds more agreeable. It is
completely kosher to do this. As Chris says, you are welcome to contribute,
fork or ignore it.

As far as I am concerned, contrib is an excellent place to put projects that
directly add more functionality to their parent project (the language bindings
and this patch are good examples), but not a great place to store standalone
projects that simply leverage the parent (an example might be a DNS server,
written in ZooKeeper). This is a necessarily vague distinction, and others will
have different opinions.

I do not know specifically to what Chris is referring when he talks about an
'onerous' patch process, but I speculate he might mean that the role of
'committer' - someone who is gating the submission of patches - makes it harder
to get your patches available for others to use quickly. Of course there are
also benefits of this approach, such as a ready collection of experienced users
on hand to offer advice and the relatively high standard for patches to be
accepted to trunk arguably improves code quality. What's great is the two
development styles are not mutually exclusive, and can, ideally, benefit from
each other. If you are having difficulties with, or are frustrated by, the
patch submission process here, ask for help. The community here is very happy
to help, and we'll do what we can to address pain points.

As for this patch, I'm happy it's going into contrib - users sometimes find
ZooKeeper difficult to program to, and examples and new abstractions are always
welcome. Keeping this patch in the main repository means that newcomers to
ZooKeeper will find it more easily. Thanks for the contribution!

Henry

Offers a node design for interacting with the Java Zookeeper client.

Key: ZOOKEEPER-679
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-679
Project: Zookeeper
Issue Type: New Feature
Components: contrib, java client, tests
Reporter: Aaron Crow
Assignee: Aaron Crow
Fix For: 3.4.0

Attachments: ZOOKEEPER-679.patch, ZOOKEEPER-679.patch,
ZOOKEEPER-679.patch, ZOOKEEPER-679.patch

Following up on my conversations with Patrick and Mahadev
(http://n2.nabble.com/Might-I-contribute-a-Node-design-for-the-Java-API-td4567695.html#a4567695).
This patch includes the implementation as well as unit tests. The first unit
test gives a simple high level demo of using the node API.
The current implementation is simple and is only what I need with the current
project I am working on. However, I am very open to any and all suggestions
for improvement.
This is a proposal to support a simplified node (or File) like API into a
Zookeeper tree, by wrapping the Zookeeper Java client. It is similar to
Java's File API design.
Although, I'm trying to make it easier in a few spots. For example, deleting
a Node recursively is done by default. I also lean toward resolving
Exceptions under the hood when it seems appropriate. For example, if you
ask a Node if it exists, and its parent doesn't even exist, you just get a
false back (rather than a nasty Exception).
As for watches and ephemeral nodes, my current work does not need these
things so I currently have no handling of them. But if potential users of
the Node a.k.a. File design want these things, I'd be open to supporting
them as reasonable.


[jira] Commented: (ZOOKEEPER-769) Leader can treat observers as quorum members

2010-05-07 Thread Henry Robinson (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865240#action_12865240
]

Henry Robinson commented on ZOOKEEPER-769:
--

Sergey -

Great, thanks for making this patch! ISTR there was some reason why we didn't
infer peerType from the servers list, but I can't remember what it was...

As for your patch, a few small comments:

1. Use --no-prefix and just attach the output of git-diff (no mail headers etc)
- Hudson is rather picky about the patch formats it can apply
2. It would be great to include a test that reads a configuration and checks
that the behaviour is correct
3. If the peerTypes don't match up, should we default to the server list (on
the assumption that that will be consistent across all servers)?
4. Once you've added the patch, click 'submit patch' to start Hudson moving.
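
Point 3 above can be sketched as follows. This is not the parsing code in
ZooKeeper itself: the class and method names are hypothetical, and the sketch
assumes that a missing peerType setting means "participant" and that observers
are marked by an ":observer" suffix on their server.N line. It infers the
peer type from the server list and reports whether the local peerType setting
agrees with it.

```java
import java.util.Properties;

/** Illustrative only: not ZooKeeper's own configuration parser. */
final class PeerTypeSketch {

    /**
     * Returns "observer" if the server.<id> entry ends in ":observer",
     * otherwise "participant". Throws if the id has no entry.
     */
    static String peerTypeFromServerList(Properties cfg, long myId) {
        String entry = cfg.getProperty("server." + myId);
        if (entry == null) {
            throw new IllegalArgumentException("no server." + myId + " entry");
        }
        return entry.trim().endsWith(":observer") ? "observer" : "participant";
    }

    /**
     * True when the peerType setting agrees with the server list.
     * Assumption: an absent peerType is treated as "participant".
     */
    static boolean peerTypeConsistent(Properties cfg, long myId) {
        String declared = cfg.getProperty("peerType", "participant").trim();
        return declared.equalsIgnoreCase(peerTypeFromServerList(cfg, myId));
    }
}
```

With such a check, a server could either warn on a mismatch or fall back to
the server list, which is the open question raised in point 3.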

cheers,
Henry

Leader can treat observers as quorum members

Attachments: follower.log, leader.log, observer.log, warning.patch,
zoo1.cfg


Re: Question on quorum behavior

2010-05-06 Thread Henry Robinson

Sergey -

Sounds like a bug. Can you open a new JIRA and attach your log files to it?

Thanks,
Henry

On 6 May 2010 07:50, Sergey Doroshenko dors...@gmail.com wrote:

 In short: it seems leader can treat observers as quorum members.

 Steps to repro:

 1. I have the following ensemble configuration:
 # servers list
 server.1=localhost:2881:3881
 server.2=localhost:2882:3882
 server.3=localhost:2883:3883:observer
 server.4=localhost:2884:3884
 server.5=localhost:2885:3885:observer

 2. I'm bringing up servers 1,2,3 and it's enough for quorum (1 and 2).
 3. I'm shutting down the one from the quorum who is the follower.

 As I understand it, the expected result is that the leader starts a new
 election round in order to regain quorum.
 What actually happens is that it just says goodbye to that follower and
 remains operable. (When I shut down the third server, the observer, the
 leader starts trying to regain a quorum.)

 Is this a bug, or a feature?


 --
 Regards, Sergey





[jira] Commented: (ZOOKEEPER-768) zkpython segfault on close (assertion error in io thread)

2010-05-06 Thread Henry Robinson (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864849#action_12864849
]

Henry Robinson commented on ZOOKEEPER-768:
--

Thanks Kapil - I'll take a look. From the stack trace it looks as though a
pending completion callback is null and therefore something weird is going on
with a completion dispatcher being freed before it is finished being used. As
per usual I can't reproduce on my machine, but this is enough information to
dig into it.

zkpython segfault on close (assertion error in io thread)
-

Key: ZOOKEEPER-768
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-768
Project: Zookeeper
Issue Type: Bug
Components: contrib-bindings
Affects Versions: 3.4.0
Environment: ubuntu lucid (10.04), zookeeper trunk (java/c/zkpython)
Reporter: Kapil Thangavelu
Attachments: zkpython-segfault-client-log.txt,
zkpython-segfault-stack-traces.txt, zkpython-segfault.py

While trying to create a test case showing slow average add_auth, I stumbled
upon a test case that reliably segfaults for me, albeit after a variable number
of iterations (anywhere from 0 to 20 typically). FWIW, I've got about 220
processes in my test environment (ubuntu lucid 10.04). The test case opens a
connection, adds authentication to it, and closes the connection, in a loop.
I'm including the sample program and the gdb stack traces from the core file.
I can upload the core file if that's helpful.


[jira] Commented: (ZOOKEEPER-769) Leader can treat observers as quorum members

2010-05-06 Thread Henry Robinson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864878#action_12864878
 ] 

Henry Robinson commented on ZOOKEEPER-769:
--

Hi Sergey - 

Can you attach the logs from (at least) the leader node to this ticket? I'd 
like to figure this one out asap.

cheers,
Henry

 Leader can treat observers as quorum members
 

 Key: ZOOKEEPER-769
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-769
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.3.0
 Environment: Ubuntu Karmic x64
Reporter: Sergey Doroshenko
 Fix For: 3.3.0

 Attachments: zoo1.cfg


 In short: it seems the leader can treat observers as quorum members.
 Steps to repro:
 1. Server configuration: 3 voters, 2 observers (attached).
 2. Bring up 2 voters and one observer. It's enough for quorum.
 3. Shut down the one from the quorum who is the follower.
 As I understand, the expected result is that the leader will start a new
 election round in order to regain quorum.
 But the real situation is that it just says goodbye to that follower, and is
 still operable. (When I shut down the 3rd one -- the observer -- the leader
 starts trying to regain a quorum).
 (As expected, if in step 3 we shut down the leader, not the follower, the
 remaining follower starts a new leader election, as it should.)


[jira] Commented: (ZOOKEEPER-769) Leader can treat observers as quorum members

2010-05-06 Thread Henry Robinson (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864953#action_12864953
]

Henry Robinson commented on ZOOKEEPER-769:
--

Sergey -

In the cfg files for nodes 3 and 5, did you include the following line?

peerType=observer

See http://hadoop.apache.org/zookeeper/docs/r3.3.0/zookeeperObservers.html for
details. The observer log contains this line:

2010-05-06 22:46:00,876 - INFO
[QuorumPeer:/0:0:0:0:0:0:0:0:2183:quorump...@642] - FOLLOWING

which is a big red flag because observers should never adopt the FOLLOWING
state.

If I don't have that line I can reproduce your issue. If I add it, the
observers work as expected. Can you check your cfg files?
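
The check I'm asking for can be sketched roughly as follows. This is only an
illustration, assuming the server list marks observers with a trailing
":observer" as described in the observers guide linked above; the function
names (parse_cfg, observer_config_problem) are made up for this example and
are not part of ZooKeeper.

```python
def parse_cfg(text):
    """Parse key=value lines from a zoo.cfg-style file, skipping blanks and comments."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    return cfg


def observer_config_problem(text, my_id):
    """Return a description of an observer misconfiguration, or None if consistent.

    Assumption: observers appear as server.N=host:port:port:observer in the
    server list, and the observer's own file also needs peerType=observer.
    """
    cfg = parse_cfg(text)
    server_line = cfg.get("server.%d" % my_id, "")
    listed_as_observer = server_line.endswith(":observer")
    declares_observer = cfg.get("peerType") == "observer"
    if listed_as_observer and not declares_observer:
        return "server.%d is listed as an observer but peerType=observer is missing" % my_id
    if declares_observer and not listed_as_observer:
        return "peerType=observer is set but server.%d is not marked :observer" % my_id
    return None


SAMPLE = """
tickTime=2000
server.1=host1:2888:3888
server.3=host3:2888:3888:observer
"""

print(observer_config_problem(SAMPLE, 3))
print(observer_config_problem(SAMPLE + "peerType=observer\n", 3))
```

The first call reports the missing line; the second returns None once the
line is present.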

cheers,
Henry

Leader can treat observers as quorum members

Attachments: follower.log, leader.log, observer.log, zoo1.cfg


[jira] Commented: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864429#action_12864429
 ] 

Henry Robinson commented on ZOOKEEPER-763:
--

Hi Kapil - 

As seems to be the norm for me this week, I'm struggling to reproduce :) It 
does seem like your python script explicitly waits for a completion to be 
called before closing a handle. Is this enough to leave an outstanding 
completion on the queue?

Can you capture the stacktrace for the completion thread? I think it must be 
getting stuck in process_completions but it would be very valuable to know 
where - if it's stuck on the callback into zkpython then that means the 
deadlock is in the python bindings and not solely in C-land.

cheers,
Henry

 Deadlock on close w/ zkpython / c client
 

 Key: ZOOKEEPER-763
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
 Project: Zookeeper
  Issue Type: Bug
  Components: c client, contrib-bindings
Affects Versions: 3.3.0
 Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
Reporter: Kapil Thangavelu
Assignee: Mahadev konar
 Fix For: 3.4.0

 Attachments: deadlock.py, stack-trace-deadlock.txt


 Deadlocks occur if we attempt to close a handle while there are any
 outstanding async requests (aget, acreate, etc). Normally on close both the
 io thread and the completion thread are terminated and joined;
 however, with outstanding async requests, the completion thread won't be in a
 joinable state, and we effectively hang when the main thread does the join.
 AFAICS the ideal behavior on close of a handle would be to clear out
 any remaining callbacks and let the completion thread terminate.
 I've tried adding some bookkeeping within a python client to guard against
 closing while there is an outstanding async completion request, but it's an
 imperfect solution since even after the python callback is executed there is
 still a window for deadlock before the completion thread finishes the
 callback.
 A simple example to reproduce the deadlock is attached.


Re: Demo Code: Shared/Exclusive Lock

2010-05-05 Thread Henry Robinson

Sam -

This is great - the more contributed code the better!

Did you attach the code to your mail? The mailing lists strip out
attachments. If you wouldn't mind creating a JIRA (see
https://issues.apache.org/jira/browse/ZOOKEEPER), formatting your code as a
patch and clicking the button that says you're happy for the ASF to use your
code, that would be awesome - doing so makes it easier for us to add your
code into Apache-hosted source repositories.

Thanks again for your contribution - really pleased to see it.

cheers,
Henry

On 5 May 2010 13:06, Sam Baskinger sam.baskin...@networkedinsights.com wrote:

 All,

 It was suggested that more demo code would be welcome. I've gotten the OK
 to release a shared/exclusive Lock.java implementation we have in our test
 labs at Networked Insights. If the community would find it useful, please do
 use it! :)

 All the best, and thanks for the excellent tool,


 Sam Baskinger
 Software Engineer
 Networked Insights
 http://www.networkedinsights.com




-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

[jira] Commented: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client

[
https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864488#action_12864488
]

Henry Robinson commented on ZOOKEEPER-763:
--

Kapil -

Thanks! Adding that sleep helped me understand what was going on.

pyzoo_close has the GIL but blocks inside zookeeper_close, waiting for the
completion thread to finish. However, if a completion is still inside Python,
but has been pre-empted by the main thread which calls pyzoo_close, the
completion can't get the GIL back to finish up executing, blocking the
completions_thread for ever more. The fix is simple - relinquish the GIL during
the zookeeper_close call, and then reacquire it straight after. There are even
handy macros to do this:

Py_BEGIN_ALLOW_THREADS
ret = zookeeper_close(zhandles[zkhid]);
Py_END_ALLOW_THREADS

This same issue will affect any part of zkpython where a call to the C client
is blocked on some work being completed in another Python thread - in practice,
I think this means from callbacks. I'll audit the code to see if any other API
calls are affected. Patch to fix this issue is following shortly - Kapil, I'd
be very grateful if you could help us by testing it.

cheers,
Henry
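
For anyone who wants to see the shape of the deadlock without building
zkpython, here is a small pure-Python simulation. It is only a sketch under
stated assumptions: an ordinary threading.Lock stands in for the GIL, a worker
thread stands in for the completion thread, and a join with a timeout stands
in for the join inside zookeeper_close. The name run_close is hypothetical and
nothing here calls the real C client.

```python
import threading

gil = threading.Lock()  # stand-in for the interpreter lock (assumption of this sketch)


def run_close(release_gil_while_joining, join_timeout=0.5):
    """Return True if the simulated close finishes, False if it would hang."""
    in_callback = threading.Event()
    resume = threading.Event()

    def completion_thread():
        # The callback runs "inside Python": it holds the lock, is pre-empted,
        # and later needs the lock again before it can finish.
        with gil:
            in_callback.set()
        resume.wait()
        with gil:
            pass

    worker = threading.Thread(target=completion_thread, daemon=True)
    worker.start()
    in_callback.wait()

    gil.acquire()                   # main thread enters the close call holding the lock
    resume.set()
    if release_gil_while_joining:
        gil.release()               # plays the role of Py_BEGIN_ALLOW_THREADS
        worker.join(join_timeout)
        gil.acquire()               # plays the role of Py_END_ALLOW_THREADS
    else:
        worker.join(join_timeout)   # join while still holding the lock
    finished = not worker.is_alive()
    gil.release()
    return finished


print("close without releasing the lock finishes:", run_close(False))
print("close that releases the lock finishes:", run_close(True))
```

The timeout only exists so the simulation can report the hang instead of
actually hanging; the real join has no such escape hatch.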

Deadlock on close w/ zkpython / c client

Key: ZOOKEEPER-763
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
Project: Zookeeper
Issue Type: Bug
Components: c client, contrib-bindings
Affects Versions: 3.3.0
Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
Reporter: Kapil Thangavelu
Assignee: Mahadev konar
Fix For: 3.4.0

Attachments: deadlock.py, deadlock_v2.py, stack-trace-deadlock.txt


[jira] Updated: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-763:
-

 Assignee: Henry Robinson  (was: Mahadev konar)
Fix Version/s: 3.3.1
  Component/s: (was: c client)

 Deadlock on close w/ zkpython / c client
 

 Key: ZOOKEEPER-763
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0
 Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
Reporter: Kapil Thangavelu
Assignee: Henry Robinson
 Fix For: 3.3.1, 3.4.0

 Attachments: deadlock.py, deadlock_v2.py, stack-trace-deadlock.txt



[jira] Updated: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-763:
-

Status: Patch Available  (was: Open)

 Deadlock on close w/ zkpython / c client
 

 Key: ZOOKEEPER-763
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0
 Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
Reporter: Kapil Thangavelu
Assignee: Henry Robinson
 Fix For: 3.3.1, 3.4.0

 Attachments: deadlock.py, deadlock_v2.py, stack-trace-deadlock.txt, 
 ZOOKEEPER-763.patch



[jira] Updated: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-763:
-

Attachment: ZOOKEEPER-763.patch

Forgot --no-prefix again :/

 Deadlock on close w/ zkpython / c client
 

 Key: ZOOKEEPER-763
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0
 Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
Reporter: Kapil Thangavelu
Assignee: Henry Robinson
 Fix For: 3.3.1, 3.4.0

 Attachments: deadlock.py, deadlock_v2.py, stack-trace-deadlock.txt, 
 ZOOKEEPER-763.patch, ZOOKEEPER-763.patch



[jira] Updated: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-763:
-

Status: Patch Available  (was: Open)

 Deadlock on close w/ zkpython / c client
 

 Key: ZOOKEEPER-763
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0
 Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
Reporter: Kapil Thangavelu
Assignee: Henry Robinson
 Fix For: 3.3.1, 3.4.0

 Attachments: deadlock.py, deadlock_v2.py, stack-trace-deadlock.txt, 
 ZOOKEEPER-763.patch, ZOOKEEPER-763.patch



[jira] Updated: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-763:
-

Status: Open  (was: Patch Available)

 Deadlock on close w/ zkpython / c client
 

 Key: ZOOKEEPER-763
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0
 Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
Reporter: Kapil Thangavelu
Assignee: Henry Robinson
 Fix For: 3.3.1, 3.4.0

 Attachments: deadlock.py, deadlock_v2.py, stack-trace-deadlock.txt, 
 ZOOKEEPER-763.patch, ZOOKEEPER-763.patch



[jira] Updated: (ZOOKEEPER-764) Observer elected leader due to inconsistent voting view


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-764:
-

Attachment: ZOOKEEPER-764_3_3_1.patch

Patch to apply against 3_3_1

 Observer elected leader due to inconsistent voting view
 ---

 Key: ZOOKEEPER-764
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-764
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum
Reporter: Flavio Paiva Junqueira
Assignee: Henry Robinson
 Fix For: 3.3.1, 3.4.0

 Attachments: ZOOKEEPER-690.patch, ZOOKEEPER-764_3_3_1.patch


 In ZOOKEEPER-690, we noticed that an observer was being elected, and Henry 
 proposed a patch to fix the issue. However, it seems that the patch does not 
 solve the issue one user (Alan Cabrera) has observed. Given that we would 
 like to fix this issue, and to work separately with Alan to determine the 
 problem with his setup, I'm creating this jira and re-posting Henry's patch.


[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Henry Robinson (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863902#action_12863902
]

Henry Robinson commented on ZOOKEEPER-690:
--

Hi Alan -

Looking at this attachment: nohup-AsyncHammerTest-201004301209.txt - the tests
appear to be run twice. The first testObserversHammer completes successfully,
the second fails. Were you running the tests until you experienced the failure?

Henry

AsyncTestHammer test fails on hudson.
-

Key: ZOOKEEPER-690
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
Project: Zookeeper
Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
Fix For: 3.3.1, 3.4.0

Attachments: jstack-201004201053.txt, jstack-201004291409.txt,
jstack-201004291527.txt, jstack-AsyncHammerTest-201004301209.txt,
nohup-201004201053.txt, nohup-201004291409.txt, nohup-201004291527.txt,
nohup-AsyncHammerTest-201004301209.txt,
nohup-QuorumPeerMainTest-201004301209.txt,
TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log,
ZOOKEEPER-690.patch, ZOOKEEPER-690.patch, ZOOKEEPER-690.patch

The hudson test failed on
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
There is a huge set of CancelledKeyExceptions in the logs. Still going
through the logs to find out the reason for failure.


[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Henry Robinson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863915#action_12863915
 ] 

Henry Robinson commented on ZOOKEEPER-690:
--

Weird - it looks like the test is shutting down correctly:


[junit] 2010-04-30 11:41:52,896 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11233
[junit] 2010-04-30 11:41:52,896 - INFO  [main:quorumb...@277] - 127.0.0.1:11233 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,896 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11234
[junit] 2010-04-30 11:41:52,897 - INFO  [main:quorumb...@277] - 127.0.0.1:11234 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,897 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11235
[junit] 2010-04-30 11:41:52,897 - INFO  [main:quorumb...@277] - 127.0.0.1:11235 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,897 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11236
[junit] 2010-04-30 11:41:52,898 - INFO  [main:quorumb...@277] - 127.0.0.1:11236 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,898 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11237
[junit] 2010-04-30 11:41:52,898 - INFO  [main:quorumb...@277] - 127.0.0.1:11237 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,901 - INFO  [main:junit4zktestrunner$loggedinvokemet...@56] - FINISHED TEST METHOD testObserversHammer
[junit] 2010-04-30 11:41:52,901 - INFO  [main:zktestcas...@59] - SUCCEEDED testObserversHammer
[junit] 2010-04-30 11:41:52,901 - INFO  [main:zktestcas...@54] - FINISHED testObserversHammer

and then it goes into trying the C tests which fail for an unrelated reason - 
does it lock up at this point or does it actually fail out to the CLI? If it 
locks up, is the jstack output you attached from that run?



 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, jstack-201004291409.txt, 
 jstack-201004291527.txt, jstack-AsyncHammerTest-201004301209.txt, 
 nohup-201004201053.txt, nohup-201004291409.txt, nohup-201004291527.txt, 
 nohup-AsyncHammerTest-201004301209.txt, 
 nohup-QuorumPeerMainTest-201004301209.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch, ZOOKEEPER-690.patch, ZOOKEEPER-690.patch



Re: ZOOKEEPER-107 - Allow dynamic changes to server cluster membership

2010-05-03 Thread Henry Robinson

Hi Vishal -

Great that you're interested in contributing! This would be a really neat
feature to get into ZK.

The documentation that exists is essentially all on the JIRA. I had a patch
that 'worked' but was nowhere near commit-ready. I'm trying to dig it up,
but it appears it may have gone to the great bit-bucket in the sky. Trunk
has moved sufficiently that a new patch would be required anyhow.

There were two main difficulties with this issue. The first is changing the
voting protocol to cope with changes in views. Since proposals are
pipelined, the leader needs to keep track of what the view was that should
vote for a proposal. IIRC, the other subtlety is making sure that when a
view change is proposed, a quorum of votes is received from both the
outgoing view and the incoming one. Otherwise it's possible to transition to
a 'dead' view in which no progress can be made.

The second is to figure out the metadata management - how do we 'find'
ZooKeeper servers if the ensemble may have moved onto a completely separate
set of machines? That is, if the original ensemble was on A, B, C and the
current ensemble is D, E, F - where do we look to find where the ensemble is
located?

The first is a solved issue, the second is more a matter of taste than
designing distributed protocols.
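
To make the two-view requirement above concrete, here is a rough sketch. It
assumes simple majority quorums and represents a view as a set of server
names; has_majority and view_change_committable are hypothetical helper names
for this mail, not anything from the old patch or from trunk.

```python
def has_majority(acks, view):
    """True if more than half of the servers in `view` have acknowledged."""
    view = set(view)
    return len(set(acks) & view) > len(view) // 2


def view_change_committable(acks, old_view, new_view):
    """A view change needs a quorum from BOTH the outgoing and incoming views.

    Requiring only the old view risks moving to a 'dead' new view in which no
    quorum can ever form.
    """
    return has_majority(acks, old_view) and has_majority(acks, new_view)


old_view = {"A", "B", "C"}
new_view = {"D", "E", "F"}

# Only the outgoing view has voted: not safe to commit the change yet.
print(view_change_committable({"A", "B"}, old_view, new_view))

# Majorities of both views have voted: the change can commit.
print(view_change_committable({"A", "B", "D", "E"}, old_view, new_view))
```

Because proposals are pipelined, a real leader would also have to remember
which view applies to each outstanding proposal; the sketch leaves that out.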

Really happy to help with this issue - I'd love to see it get resurrected.

cheers,
Henry

On 3 May 2010 07:25, Vishal K vishalm...@gmail.com wrote:

 Hi Henry,

 I just commented on the Jira. I would be happy to contribute.
 Please advise on the current status and next steps. Thanks.

 Regards,
 -Vishal




-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

Re: ZOOKEEPER-107 - Allow dynamic changes to server cluster membership

2010-05-03 Thread Henry Robinson

Hi Vishal -

That's right - design, not implementation!

I'd encourage you to share a design document once you feel you understand
exactly what's required. This is probably going to be a complex patch and
reviewers will need a study guide :)

cheers,
Henry

On 3 May 2010 10:26, Vishal Kher vishalm...@gmail.com wrote:

 Hi Henry,

 Thanks for the info. I will spend some more time to understand the issues
 before starting with the implementation. I will let you know if I have any
 questions (which I am sure I will).

 Just to clarify, by solved issue you mean from a design perspective and not
 from an implementation one, right?

 Regards,
 -Vishal




-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

Re: Dynamic adding/removing ZK servers on client

2010-05-03 Thread Henry Robinson

On 3 May 2010 16:40, Dave Wright wrig...@gmail.com wrote:

  Should this be a znode in the privileged namespace?
 

 I think having a znode for the current cluster members is part of the
 ZOOKEEPER-107 proposal, with the idea being that you could get/set the
 membership just by writing to that node. On the client side, you could
 watch that znode and update your server list when it changes.



This is tricky: what happens if the server your client is connected to is
decommissioned by a view change, and you are unable to locate another server
to connect to because other view changes committed while you are
reconnecting have removed all the servers you knew about? We'd need to make
sure that watches on this znode were fired before a view change, but it's
hard to know how to avoid having to wait for a session timeout before a
client that might just be migrating servers reappears in order to make sure
it sees the view change.

Even then, the problem of 'locating' the cluster still exists in the case
that there are no clients connected to tell anyone about it.
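
The stranded-client case can be written down in a few lines. This is only an
illustration with made-up server names; reachable_servers is a hypothetical
helper, and it assumes a client can reconnect only to servers that are both
in its last known list and in the current view.

```python
def reachable_servers(known_servers, current_view):
    """Servers the client still knows about that are part of the current view."""
    return sorted(set(known_servers) & set(current_view))


known = ["A", "B", "C"]  # the list the client held when it lost its connection

# One view change while reconnecting: the client can still find B or C.
print(reachable_servers(known, ["B", "C", "D"]))

# A second view change commits before it reconnects: nothing it knows is left.
print(reachable_servers(known, ["D", "E", "F"]))
```

An empty result is the case I'm worried about: the client has no address
left to try, whatever the watch would have told it.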

Henry


-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

[jira] Updated: (ZOOKEEPER-758) zkpython segfaults on invalid acl with missing key

[
https://issues.apache.org/jira/browse/ZOOKEEPER-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Henry Robinson updated ZOOKEEPER-758:
-

Attachment: ZOOKEEPER-758.patch

Kapil -

Thanks for the patch! Unfortunately it didn't apply cleanly against trunk
because I think you had added 'test_acl_validity' to acl_test.py which was not
included in the diff.

I'm attaching a patch that applies cleanly to trunk - no code changes from your
patch.

Thanks,

Henry

zkpython segfaults on invalid acl with missing key
--

Key: ZOOKEEPER-758
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-758
Project: Zookeeper
Issue Type: Bug
Components: contrib-bindings
Affects Versions: 3.3.0, 3.4.0
Environment: ubuntu lucid (10.04)
Reporter: Kapil Thangavelu
Attachments: invalid-acl-fix-and-test.diff, ZOOKEEPER-758.patch

Currently when setting an acl, there is a minimal parse to ensure that it's a
list of dicts; however, if one of the dicts is missing a required key, the
subsequent usage doesn't check for it and will segfault. For example, using
an acl of [{schema:id, id:world, permissions:PERM_ALL}] will segfault if
used, because the scheme key is missing (it's been purposefully typo'd to
schema in the example).
I've expanded the check_acl macro to include verifying that all keys are
present and added some unit tests against trunk in the attachments.
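
The kind of check described above can be sketched in Python as follows. This
is not the check_acl macro from the patch, only an illustration of the idea;
the required key names ("scheme", "id", "perms") and the function name
validate_acl are assumptions made for this example.

```python
REQUIRED_ACL_KEYS = ("scheme", "id", "perms")  # assumed key names for this sketch


def validate_acl(acl, required=REQUIRED_ACL_KEYS):
    """Raise ValueError unless `acl` is a list of dicts that each have every required key."""
    if not isinstance(acl, list):
        raise ValueError("acl must be a list of dicts")
    for index, entry in enumerate(acl):
        if not isinstance(entry, dict):
            raise ValueError("acl entry %d is not a dict" % index)
        missing = [key for key in required if key not in entry]
        if missing:
            raise ValueError(
                "acl entry %d is missing required key(s): %s" % (index, ", ".join(missing))
            )
    return True


# 'schema' is a deliberate typo for 'scheme', mirroring the report above.
bad_acl = [{"schema": "world", "id": "anyone", "perms": 31}]
try:
    validate_acl(bad_acl)
except ValueError as err:
    print("rejected:", err)
```

Failing with an exception at the binding boundary is the point: the caller
gets an error message instead of a crash further down.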


[jira] Updated: (ZOOKEEPER-758) zkpython segfaults on invalid acl with missing key


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-758:
-

  Status: Patch Available  (was: Open)
Hadoop Flags: [Reviewed]

I have reviewed this, and it looks good. Thanks Kapil!

 zkpython segfaults on invalid acl with missing key
 --

 Key: ZOOKEEPER-758
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-758
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0, 3.4.0
 Environment: ubuntu lucid (10.04)
Reporter: Kapil Thangavelu
 Attachments: invalid-acl-fix-and-test.diff, ZOOKEEPER-758.patch



[jira] Updated: (ZOOKEEPER-758) zkpython segfaults on invalid acl with missing key


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-758:
-

Attachment: ZOOKEEPER-758.patch

forgot --no-prefix.

 zkpython segfaults on invalid acl with missing key
 --

 Key: ZOOKEEPER-758
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-758
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0, 3.4.0
 Environment: ubuntu lucid (10.04)
Reporter: Kapil Thangavelu
 Attachments: invalid-acl-fix-and-test.diff, ZOOKEEPER-758.patch, 
 ZOOKEEPER-758.patch


[jira] Updated: (ZOOKEEPER-758) zkpython segfaults on invalid acl with missing key


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-758:
-

Status: Patch Available  (was: Open)

 zkpython segfaults on invalid acl with missing key
 --

 Key: ZOOKEEPER-758
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-758
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0, 3.4.0
 Environment: ubuntu lucid (10.04)
Reporter: Kapil Thangavelu
 Attachments: invalid-acl-fix-and-test.diff, ZOOKEEPER-758.patch, 
 ZOOKEEPER-758.patch


[jira] Updated: (ZOOKEEPER-758) zkpython segfaults on invalid acl with missing key


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-758:
-

Status: Open  (was: Patch Available)

 zkpython segfaults on invalid acl with missing key
 --

 Key: ZOOKEEPER-758
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-758
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0, 3.4.0
 Environment: ubuntu lucid (10.04)
Reporter: Kapil Thangavelu
 Attachments: invalid-acl-fix-and-test.diff, ZOOKEEPER-758.patch, 
 ZOOKEEPER-758.patch


[jira] Updated: (ZOOKEEPER-758) zkpython segfaults on invalid acl with missing key


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-758:
-

   Status: Resolved  (was: Patch Available)
Fix Version/s: 3.3.1, 3.4.0
   Resolution: Fixed

I just committed this. Thanks Kapil!

 zkpython segfaults on invalid acl with missing key
 --

 Key: ZOOKEEPER-758
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-758
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0, 3.4.0
 Environment: ubuntu lucid (10.04)
Reporter: Kapil Thangavelu
 Fix For: 3.3.1, 3.4.0

 Attachments: invalid-acl-fix-and-test.diff, ZOOKEEPER-758.patch, 
 ZOOKEEPER-758.patch


[jira] Updated: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

[
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Henry Robinson updated ZOOKEEPER-690:
-

Attachment: ZOOKEEPER-690.patch

I have found what I hope is the problem.

Because QuorumPeers duplicate their 'LearnerType' in two places, there's the
possibility that the two copies may get out of sync. This is what was happening
here - it was a test bug. Although the Observers knew that they were Observers,
the other nodes did not. This affected the leader election protocol, as the
other nodes did not know to reject an Observer.
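
To make the failure mode concrete, here is a small Python sketch of two copies
of the same fact drifting apart. It is only an analogy, not the ZooKeeper code:
the class names ServerEntry and Peer and the function may_be_elected are
hypothetical stand-ins invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class LearnerType(Enum):
    PARTICIPANT = "participant"
    OBSERVER = "observer"


@dataclass
class ServerEntry:  # hypothetical stand-in for an entry in the shared peer map
    sid: int
    learner_type: LearnerType = LearnerType.PARTICIPANT


@dataclass
class Peer:  # hypothetical stand-in for a quorum peer
    sid: int
    learner_type: LearnerType  # second copy of the same fact


def may_be_elected(peer_map, candidate_sid):
    """Other nodes decide from the shared map, not from the peer's own field."""
    return peer_map[candidate_sid].learner_type is LearnerType.PARTICIPANT


# Test-style setup that updates only one of the two copies.
peer_map = {1: ServerEntry(1), 2: ServerEntry(2), 3: ServerEntry(3)}
observer = Peer(sid=3, learner_type=LearnerType.OBSERVER)

# The map still says PARTICIPANT, so the observer is wrongly accepted.
stale = may_be_elected(peer_map, observer.sid)

# Once both copies agree, the observer is rejected as intended.
peer_map[3].learner_type = LearnerType.OBSERVER
fixed = may_be_elected(peer_map, observer.sid)
```

Here the peer itself knows it is an observer, but anything that consults the
map sees a participant until the second copy is updated as well.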

I feel like we should refactor the QuorumPeer.QuorumServer code so as not to
duplicate information, but for the time being I think this patch will work.

I have also taken the opportunity to standardise the naming of 'learnertype'
throughout the code (in some places it was called 'peertype', adding to the
confusion).

Tests pass on my machine, but I can't guarantee that the problem is fixed as I
could never recreate the error.

Thanks to Flavio for catching the broken invariant!

AsyncTestHammer test fails on hudson.
-

Key: ZOOKEEPER-690
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
Project: Zookeeper
Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
Fix For: 3.3.1, 3.4.0

Attachments: jstack-201004201053.txt, nohup-201004201053.txt,
TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log,
ZOOKEEPER-690.patch

[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862351#action_12862351
 ] 

Henry Robinson commented on ZOOKEEPER-690:
--

Alan - can you try this patch to see if it fixes things? 

Thanks, 

Henry


 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, nohup-201004201053.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch


 The Hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
 There is a huge set of CancelledKeyExceptions in the logs. Still going
 through the logs to find out the reason for the failure.

[jira] Updated: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-690:
-

Status: Patch Available  (was: Open)

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, nohup-201004201053.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch


[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862424#action_12862424
 ] 

Henry Robinson commented on ZOOKEEPER-690:
--

This map is, I think, shared between the quorumpeers for the purposes of the 
test (and in general there aren't two quorumpeers sharing this data structure 
when running normally). 

But! The error here is that I'm dumb (and that Java's type-checking leaves a 
little to be desired). I've written quorumPeers.containsValue up there, but 
actually it should be quorumPeers.containsKey. New patch on the way, let's see 
if that fixes it.
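
For anyone wondering how that mix-up gets past the compiler: Java's
Map.containsKey and Map.containsValue both accept a plain Object, so passing a
key to containsValue still compiles. The Python sketch below shows the same
mistake by analogy. The map contents and the function names known_by_value and
known_by_key are made up for illustration and are not taken from the patch.

```python
# Hypothetical peer map keyed by server id; the contents are illustrative only.
quorum_peers = {1: "server-a", 2: "server-b", 3: "server-c"}


def known_by_value(sid):
    # Analogue of Map.containsValue(sid): compares sid against the values,
    # so a valid server id is never found.
    return sid in quorum_peers.values()


def known_by_key(sid):
    # Analogue of Map.containsKey(sid): the intended membership test.
    return sid in quorum_peers
```

Looking up server id 2 with known_by_value reports it as unknown, while
known_by_key finds it, which is the difference between the two patches.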

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, jstack-201004291409.txt, 
 nohup-201004201053.txt, nohup-201004291409.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch


[jira] Updated: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-690:
-

Attachment: ZOOKEEPER-690.patch

Alan - would you mind trying this new patch? Thanks for your patience. I 
suspect that something might still be a bit flaky with these tests (not the 
code, but the tests), but I hope this will fix this particular problem. 

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, jstack-201004291409.txt, 
 nohup-201004201053.txt, nohup-201004291409.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch, ZOOKEEPER-690.patch


[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

[
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862482#action_12862482
]

Henry Robinson commented on ZOOKEEPER-690:
--

Ben -

Agreed. I see this as the same as setMyid(...) - it sets an immutable value and
should only be called once. I'd prefer if these parameters were 'final' in
QuorumPeer and set in the constructor, but that's not the way that
runFromConfig (the only place outside of tests that these methods are called)
is written. Then we could get rid of setLearnerType, for sure.

The real error here, I think, is duplicating the learnertype between QuorumPeer
and QuorumServer. If we are going to have the list of QuorumServers, then
getLearnerType should look up the learner type in the peer map. Same for the
serverid, perhaps, and we should just save a reference to the QuorumServer that
represents our QuorumPeer.
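
A rough sketch of that shape, in Python rather than Java and with invented
names (QuorumServerEntry and PeerView are hypothetical, not classes from the
codebase): the peer keeps a single reference to its own map entry and reads
both the server id and the learner type through it.

```python
from dataclasses import dataclass
from enum import Enum


class LearnerType(Enum):
    PARTICIPANT = "participant"
    OBSERVER = "observer"


@dataclass(frozen=True)
class QuorumServerEntry:  # hypothetical, immutable peer-map entry
    sid: int
    learner_type: LearnerType


class PeerView:  # hypothetical stand-in for a quorum peer
    def __init__(self, my_sid, peer_map):
        if my_sid not in peer_map:
            raise KeyError(f"server id {my_sid} is not in the peer map")
        self._peer_map = peer_map
        # Keep one reference to our own entry instead of copying its fields.
        self._me = peer_map[my_sid]

    @property
    def sid(self):
        return self._me.sid

    @property
    def learner_type(self):
        return self._me.learner_type
```

Because the values are only ever read through the map entry, there is no
second copy to forget about and no setter left to call twice.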

AsyncTestHammer test fails on hudson.
-

Key: ZOOKEEPER-690
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
Project: Zookeeper
Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
Fix For: 3.3.1, 3.4.0

Attachments: jstack-201004201053.txt, jstack-201004291409.txt,
jstack-201004291527.txt, nohup-201004201053.txt, nohup-201004291409.txt,
nohup-201004291527.txt, TEST-org.apache.zookeeper.test.AsyncHammerTest.txt,
zoo.log, ZOOKEEPER-690.patch, ZOOKEEPER-690.patch, ZOOKEEPER-690.patch

[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861865#action_12861865
 ] 

Henry Robinson commented on ZOOKEEPER-690:
--

Progress update - possibly to do with a bug in FLE allowing an Observer to be 
elected. We're looking into this now.

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, nohup-201004201053.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log


[jira] Updated: (ZOOKEEPER-749) OSGi metadata not included in binary only jar


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-749:
-

Hadoop Flags: [Reviewed]

+1, patch looks good to me. Tests failing was a quirk of Hudson, as this patch 
doesn't test code. ant bin-jar works correctly. 

 OSGi metadata not included in binary only jar
 -

 Key: ZOOKEEPER-749
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-749
 Project: Zookeeper
  Issue Type: Bug
  Components: build
Affects Versions: 3.3.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.3.1, 3.4.0

 Attachments: ZOOKEEPER-749.patch


 See this JIRA/comment for background:
 https://issues.apache.org/jira/browse/ZOOKEEPER-425?focusedCommentId=12859697&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12859697
 Basically, the issue is that OSGi metadata is included in the legacy jar
 (zookeeper-version.jar) but not in the binary-only
 jar (zookeeper-version-bin.jar), which is eventually deployed to the Maven
 repo.

[jira] Updated: (ZOOKEEPER-749) OSGi metadata not included in binary only jar


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-749:
-

Status: Resolved  (was: Patch Available)
Resolution: Fixed

I just committed this. Thanks Patrick!

 OSGi metadata not included in binary only jar
 -

 Key: ZOOKEEPER-749
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-749
 Project: Zookeeper
  Issue Type: Bug
  Components: build
Affects Versions: 3.3.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.3.1, 3.4.0

 Attachments: ZOOKEEPER-749.patch


[jira] Resolved: (ZOOKEEPER-750) move maven artifacts into dist-maven subdir of the release (package target)