[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails

2015-04-20 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503846#comment-14503846
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1506:
---

It doesn't. I got a consistent repro by first firewalling the participant with 
id 0, to force that code path.

I'll try reverting the patch entirely and see if that helps.

> Re-try DNS hostname -> IP resolution if node connection fails
> -
>
> Key: ZOOKEEPER-1506
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1506
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.5
> Environment: Ubuntu 11.04 64-bit
>Reporter: Mike Heffner
>Assignee: Michi Mutsuzaki
>Priority: Critical
>  Labels: patch
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-1506-fix.patch, ZOOKEEPER-1506.patch, 
> ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, 
> ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, 
> zk-dns-caching-refresh.patch
>
>
> In our zoo.cfg we use hostnames to identify the ZK servers that are part of 
> an ensemble. These hostnames are configured with a low (<= 60s) TTL and the 
> IP address they map to can and does change. Our procedure for 
> replacing/upgrading a ZK node is to boot an entirely new instance and remap 
> the hostname to the new instance's IP address. Our expectation is that when 
> the original ZK node is terminated/shutdown, the remaining nodes in the 
> ensemble would reconnect to the new instance.
> However, what we are noticing is that the remaining ZK nodes do not attempt 
> to re-resolve the hostname->IP mapping for the new server. Once the original 
> ZK node is terminated, the existing servers continue to attempt contacting it 
> at the old IP address. It would be great if the ZK servers could try to 
> re-resolve the hostname when attempting to connect to a lost ZK server, 
> instead of caching the lookup indefinitely. Currently we must do a rolling 
> restart of the ZK ensemble after swapping a node -- which at three nodes 
> means we periodically lose quorum.
> The exact method we are following is to boot new instances in EC2 and attach 
> one of a set of three Elastic IP addresses. External to EC2 this IP address 
> remains the same and maps to whatever instance it is attached to. Internal to 
> EC2, the elastic IP hostname has a TTL of about 45-60 seconds and is remapped 
> to the internal (10.x.y.z) address of the instance it is attached to. 
> Therefore, in our case we would like ZK to pick up the new 10.x.y.z address 
> that the elastic IP hostname gets mapped to and reconnect appropriately.
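
The request above amounts to resolving the hostname again on reconnect rather than reusing the address object created at startup. A minimal sketch of that idea follows. The class and method names are hypothetical and this is not the ZOOKEEPER-1506 patch; it only assumes the standard java.net.InetSocketAddress API. Note that the JVM keeps its own lookup cache (see the networkaddress.cache.ttl security property), so a fresh lookup may still return a cached answer.

```java
import java.net.InetSocketAddress;

// Hypothetical helper, not ZooKeeper code: rebuilds an address so that the
// hostname is looked up again instead of reusing the old InetAddress.
class ReResolve {
    static InetSocketAddress reResolve(InetSocketAddress old) {
        // getHostString() returns the configured hostname (or IP literal)
        // without a reverse lookup; the constructor then attempts a new
        // forward resolution of that string.
        return new InetSocketAddress(old.getHostString(), old.getPort());
    }

    public static void main(String[] args) {
        // "localhost" and port 2888 are placeholder values for illustration.
        InetSocketAddress configured = new InetSocketAddress("localhost", 2888);
        InetSocketAddress fresh = reResolve(configured);
        System.out.println(fresh.getHostString() + ":" + fresh.getPort()
                + " unresolved=" + fresh.isUnresolved());
    }
}
```

A caller would invoke something like reResolve() only after a connection attempt fails, so the common path pays no extra lookup cost.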



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails

2015-04-20 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504007#comment-14504007
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1506:
---

I'll go ahead and close this again [~michim].



[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails

2015-04-20 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504004#comment-14504004
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1506:
---

So I created a build with ZOOKEEPER-1506 removed and I still get the problem.

It's probably due to the getHostName() calls that you pointed out. These calls can 
actually trigger reverse lookups according to:

http://download.java.net/jdk7/archive/b123/docs/api/java/net/InetSocketAddress.html#getHostName%28%29

However, these calls were introduced by ZOOKEEPER-107 (according to 
git-blame). I think we should avoid them, though let's do that in another ticket.

In conclusion, if you have a bad resolver or bogus reverse lookups (as is the 
case in my test scenario): you'll have issues because of these calls.
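
To illustrate the difference: for an address built from a raw IP, getHostName() may ask the resolver for a reverse (PTR) record, while getHostString() (available since Java 7) returns the IP literal without any lookup. The sketch below is only an illustration with a hypothetical class name and placeholder address; it is not code from QuorumCnxManager.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical demo class, not ZooKeeper code.
class HostStringDemo {
    // Formats "host:port" without asking the resolver anything.
    static String formatWithoutLookup(InetSocketAddress addr) {
        return addr.getHostString() + ":" + addr.getPort();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Built from raw bytes, so no hostname is attached to the address.
        InetAddress ip = InetAddress.getByAddress(new byte[] {10, 0, 0, 1});
        InetSocketAddress addr = new InetSocketAddress(ip, 3888);

        System.out.println(formatWithoutLookup(addr)); // prints 10.0.0.1:3888

        // Calling addr.getHostName() here instead could block on a reverse
        // lookup if the resolver is slow or misconfigured.
    }
}
```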



[jira] [Created] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-20 Thread Raul Gutierrez Segales (JIRA)
Raul Gutierrez Segales created ZOOKEEPER-2171:
-

 Summary: avoid reverse lookups in QuorumCnxManager
 Key: ZOOKEEPER-2171
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
 Project: ZooKeeper
  Issue Type: Bug
  Components: quorum
Reporter: Raul Gutierrez Segales
Assignee: Raul Gutierrez Segales


Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
getHostName() calls in QCM. Besides the overhead, these can cause problems when 
mixed with failing/mis-configured DNS servers.

It would be nice to reduce them, if that doesn't affect operational 
correctness. 





[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-20 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504019#comment-14504019
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

For background/reference see:

https://issues.apache.org/jira/browse/ZOOKEEPER-1666
https://issues.apache.org/jira/browse/ZOOKEEPER-1506



[jira] [Commented] (ZOOKEEPER-2163) Introduce new ZNode type: container

2015-04-27 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516139#comment-14516139
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2163:
---

I want to do another — more detailed — pass but I think it generally looks 
good. Overloading the ephemeralOwner field is something that probably needs to 
be documented carefully and we need to think about how to make container znodes 
discoverable (i.e.: how can operators know which znodes are container ones?). 

Thoughts?

cc: [~michim], [~phunt], [~fpj], [~hdeng], [~rakeshr]

> Introduce new ZNode type: container
> ---
>
> Key: ZOOKEEPER-2163
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2163
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: c client, java client, server
>Affects Versions: 3.5.0
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Attachments: zookeeper-2163.patch
>
>
> BACKGROUND
> 
> A recurring problem for ZooKeeper users is garbage collection of parent 
> nodes. Many recipes (e.g. locks, leaders, etc.) call for the creation of a 
> parent node under which participants create sequential nodes. When the 
> participant is done, it deletes its node. In practice, the ZooKeeper tree 
> begins to fill up with orphaned parent nodes that are no longer needed. The 
> ZooKeeper APIs don’t provide a way to clean these. Over time, ZooKeeper can 
> become unstable due to the number of these nodes.
> CURRENT SOLUTIONS
> ===
> Apache Curator has a workaround solution for this by providing the Reaper 
> class which runs in the background looking for orphaned parent nodes and 
> deleting them. This isn’t ideal and it would be better if ZooKeeper supported 
> this directly.
> PROPOSAL
> =
> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow EPHEMERAL nodes 
> to contain child nodes. This is not optimal as EPHEMERALs are tied to a 
> session and the general use case of parent nodes is for PERSISTENT nodes. 
> This proposal adds a new node type, CONTAINER. A CONTAINER node is the same 
> as a PERSISTENT node with the additional property that when its last child is 
> deleted, it is deleted (and CONTAINER nodes recursively up the tree are 
> deleted if empty).
> CANONICAL USAGE
> 
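> {code}
> while (true) { // or some reasonable limit
>     try {
>         zk.create(path, ...);
>         break;
>     } catch (KeeperException.NoNodeException e) {
>         // the parent container is missing (e.g. it was cleaned up): recreate it, then retry
>         try {
>             zk.createContainer(containerPath, ...);
>         } catch (KeeperException.NodeExistsException ignore) {
>             // another client created the container first; just retry
>         }
>     }
> }
> {code}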
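
As a rough illustration of the semantics described in the proposal (a CONTAINER node is removed once its last child is deleted, recursively up the tree), here is a toy in-memory model. Everything in it (ContainerModel, Type, create, delete, exists) is hypothetical and follows the wording of the proposal literally; it is not the ZooKeeper API and says nothing about how or when a server would actually perform the cleanup.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory model of the proposed semantics; NOT the ZooKeeper API.
class ContainerModel {
    enum Type { PERSISTENT, CONTAINER }

    private final Map<String, Type> nodes = new HashMap<>();

    ContainerModel() {
        nodes.put("/", Type.PERSISTENT); // root always exists
    }

    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    boolean hasChildren(String path) {
        String prefix = path.equals("/") ? "/" : path + "/";
        for (String p : nodes.keySet()) {
            if (!p.equals(path) && p.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    void create(String path, Type type) {
        if (!exists(parentOf(path))) {
            throw new IllegalStateException("no parent for " + path);
        }
        if (nodes.putIfAbsent(path, type) != null) {
            throw new IllegalStateException("node exists: " + path);
        }
    }

    void delete(String path) {
        if (hasChildren(path)) {
            throw new IllegalStateException("not empty: " + path);
        }
        if (nodes.remove(path) == null) {
            throw new IllegalStateException("no node: " + path);
        }
        // Walk up the tree: a CONTAINER parent left without children goes too.
        String parent = parentOf(path);
        if (nodes.get(parent) == Type.CONTAINER && !hasChildren(parent)) {
            delete(parent);
        }
    }

    public static void main(String[] args) {
        ContainerModel zk = new ContainerModel();
        zk.create("/locks", Type.CONTAINER);
        zk.create("/locks/a", Type.PERSISTENT);
        zk.create("/locks/b", Type.PERSISTENT);

        zk.delete("/locks/a");
        System.out.println(zk.exists("/locks")); // true: one child remains

        zk.delete("/locks/b");
        System.out.println(zk.exists("/locks")); // false: last child was deleted
    }
}
```

In this model a client that loses the race (its parent container vanished between two operations) would see the "no parent" failure, which is the case the retry loop in the canonical usage is meant to handle.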





[jira] [Commented] (ZOOKEEPER-2163) Introduce new ZNode type: container

2015-04-28 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517429#comment-14517429
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2163:
---

Oh - right. Also, not sure if this is new:

{noformat}
.
 [exec] [junit] 2015-04-17 17:36:23,750 [myid:] - ERROR 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:11235)(secure=disabled):QuorumPeer@1394]
 - writeToDisk == true but configFilename == null
.
{noformat}

Though the code that logs that error doesn't actually do any proper error 
handling in QuorumPeer#setQuorumVerifier. I'll follow up with [~shralex] on 
another ticket. 



[jira] [Commented] (ZOOKEEPER-2163) Introduce new ZNode type: container

2015-04-28 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517453#comment-14517453
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2163:
---

(fwiw, following up on the reconfig messages labeled as error (that probably 
should just be warnings) here: ZOOKEEPER-2176)



[jira] [Created] (ZOOKEEPER-2176) unclear error message

2015-04-28 Thread Raul Gutierrez Segales (JIRA)
Raul Gutierrez Segales created ZOOKEEPER-2176:
-

 Summary: unclear error message
 Key: ZOOKEEPER-2176
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2176
 Project: ZooKeeper
  Issue Type: Improvement
  Components: quorum
Affects Versions: 3.5.0, 3.5.1, 3.5.2
Reporter: Raul Gutierrez Segales


Hi [~shralex],

Looking at the CI output of ZOOKEEPER-2163 I see this:

{noformat}
 [exec] [junit] 2015-04-17 17:36:23,750 [myid:] - ERROR 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:11235)(secure=disabled):QuorumPeer@1394]
 - writeToDisk == true but configFilename == null
{noformat}

Though looking at QuorumPeer#setQuorumVerifier I see:

{noformat}
if (configFilename != null) {
    try {
        String dynamicConfigFilename = makeDynamicConfigFilename(
                qv.getVersion());
        QuorumPeerConfig.writeDynamicConfig(
                dynamicConfigFilename, qv, false);
        QuorumPeerConfig.editStaticConfig(configFilename,
                dynamicConfigFilename,
                needEraseClientInfoFromStaticConfig());
    } catch (IOException e) {
        LOG.error("Error closing file: ", e.getMessage());
    }
} else {
    LOG.error("writeToDisk == true but configFilename == null");
}
{noformat}

there's no proper error handling so I guess maybe we should just make it a 
warning? Thoughts?






[jira] [Commented] (ZOOKEEPER-2163) Introduce new ZNode type: container

2015-04-28 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518016#comment-14518016
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2163:
---

But that can be done/merged separately, right?

> Introduce new ZNode type: container
> ---
>
> Key: ZOOKEEPER-2163
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2163
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: c client, java client, server
>Affects Versions: 3.5.0
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Attachments: zookeeper-2163.3.patch
>
>


[jira] [Updated] (ZOOKEEPER-2176) unclear error message should be info or warn

2015-04-28 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2176:
--
Summary: unclear error message should be info or warn  (was: unclear error 
message)

> unclear error message should be info or warn
> 
>
> Key: ZOOKEEPER-2176
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2176
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: quorum
>Affects Versions: 3.5.0, 3.5.1, 3.5.2
>Reporter: Raul Gutierrez Segales
>


[jira] [Updated] (ZOOKEEPER-2176) unclear error message should be info or warn

2015-04-28 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2176:
--
Attachment: ZOOKEEPER-2176.patch

change log level from error to info. 

> unclear error message should be info or warn
> 
>
> Key: ZOOKEEPER-2176
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2176
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: quorum
>Affects Versions: 3.5.0, 3.5.1, 3.5.2
>Reporter: Raul Gutierrez Segales
> Attachments: ZOOKEEPER-2176.patch
>
>


[jira] [Commented] (ZOOKEEPER-2176) unclear error message should be info or warn

2015-04-28 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518526#comment-14518526
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2176:
---

cc: [~michim], [~hdeng]

> unclear error message should be info or warn
> 
>
> Key: ZOOKEEPER-2176
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2176
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: quorum
>Affects Versions: 3.5.0, 3.5.1, 3.5.2
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Attachments: ZOOKEEPER-2176.patch
>
>


[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-28 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518534#comment-14518534
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

Yeah, so git grep says:

{noformat}
src/java/main/org/apache/zookeeper/ClientCnxn.java: "(" + addr.getHostName() + ":" + addr.getPort() + ")"));
src/java/main/org/apache/zookeeper/ClientCnxn.java: principalUserName+"/"+addr.getHostName());
src/java/main/org/apache/zookeeper/ClientCnxn.java: sock = new Socket(addr.getHostName(), addr.getPort());
src/java/main/org/apache/zookeeper/ClientCnxn.java: + addr.getHostName() + ":" + addr.getPort());
src/java/main/org/apache/zookeeper/client/StaticHostProvider.java: address.getHostName();
src/java/main/org/apache/zookeeper/client/StaticHostProvider.java: .getHostName().equals(myServer.getHostName( {
src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java: final String serviceHostname = serviceKerberosName.getHostName();
src/java/main/org/apache/zookeeper/common/HostNameUtils.java: socketAddress.getHostName();
src/java/main/org/apache/zookeeper/server/auth/KerberosName.java: public String getHostName() {
src/java/main/org/apache/zookeeper/server/auth/SaslServerCallbackHandler.java: userNameBuilder.append("/").append(kerberosName.getHostN
src/java/main/org/apache/zookeeper/server/auth/SaslServerCallbackHandler.java: return !isSystemPropertyTrue(SYSPROP_REMOVE_HOST) && kerberosNam
src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java: String addr = self.getElectionAddress().getHostName() + ":" + self.
src/java/main/org/apache/zookeeper/server/quorum/RemotePeerBean.java: return peer.addr.getHostName()+":"+peer.addr.getPort();
src/java/main/org/apache/zookeeper/server/util/ConfigUtils.java: sb.append(qs.clientAddr.getHostName() + ":" + qs.clientAddr.getPort());
src/java/systest/org/apache/zookeeper/test/system/QuorumPeerInstance.java: String report = clientAddr.getHostName() + ':' + clientAddr.getP
src/java/systest/org/apache/zookeeper/test/system/QuorumPeerInstance.java: ',' + quorumLeaderAddr.getHostName() + ':' + quorumLeaderAddr.ge
src/java/test/org/apache/zookeeper/test/CnxManagerTest.java: String addr = otherAddr.getHostName()+ ":" + otherAddr.getPort();
src/java/test/org/apache/zookeeper/test/ConnectStringParserTest.java: Assert.assertEquals("10.10.10.1", parser.getServerAddresses().get(0).getH
src/java/test/org/apache/zookeeper/test/ConnectStringParserTest.java: Assert.assertEquals("10.10.10.2", parser.getServerAddresses().get(1).getH
src/java/test/org/apache/zookeeper/test/ConnectStringParserTest.java: Assert.assertEquals("10.10.10.1", parser.getServerAddresses().get(0).getH
src/java/test/org/apache/zookeeper/test/ConnectStringParserTest.java: Assert.assertEquals("10.10.10.2", parser.getServerAddresses().get(1).getH
src/java/test/org/apache/zookeeper/test/ReconfigTest.java: Assert.assertEquals(qs.clientAddr.getHostName(), "0.0.0.0");
src/java/test/org/apache/zookeeper/test/ReconfigTest.java: qs.addr.getHostName() + ":"
src/java/test/org/apache/zookeeper/test/StaticHostProviderTest.java: String hostname = next.getHostName();
src/recipes/election/src/java/org/apache/zookeeper/recipes/leader/LeaderElectionSupport.java: return leaderOffers.get(0).getHostName();
src/recipes/election/src/java/org/apache/zookeeper/recipes/leader/LeaderElectionSupport.java: public String getHostName() {
src/recipes/election/src/java/org/apache/zookeeper/recipes/leader/LeaderOffer.java: public String getHostName() {
{noformat}

Though I think it's only in the QCM where it's especially annoying, because a 
failed DNS lookup could prevent a peer from joining the cluster after a 
partition. I feel less strongly about the other cases, though we could change 
them in the same patch too.


> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 



--
This message 

[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-28 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518542#comment-14518542
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

Why is it that we can remove 
src/java/main/org/apache/zookeeper/common/HostNameUtils.java? Is Java 6 not 
supported anymore?

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-28 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: ZOOKEEPER-2171.patch

This only fixes the calls in QCM. There are many other places where this 
happens (as mentioned by [~fournc]), though doing it in the QCM is especially 
bad because it could prevent a quorum from forming if DNS is unavailable during 
that time.

Also, it does not remove 
src/java/main/org/apache/zookeeper/common/HostNameUtils.java so that we can 
keep commits that break Java 6 self-contained. 

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-29 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519639#comment-14519639
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

Ah - great! I'll update the patch to use getHostString everywhere and drop the 
utility method for java6.

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-29 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: ZOOKEEPER-2171.patch

This replaces all getHostName calls with getHostString (which does not perform 
reverse lookups).

In addition to that, it removes HostNameUtils since Java 6 isn't supported 
anymore (ZOOKEEPER-1963).

cc: [~michim], [~rakeshr]
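
For illustration only, a minimal sketch of the behaviour the patch relies on. The class name HostStringSketch and the format helper are hypothetical and are not taken from the patch; InetSocketAddress.getHostString() is the Java 7+ method named above, which returns the stored hostname or the address literal without attempting a reverse lookup.

```java
import java.net.InetSocketAddress;

// Hypothetical helper for illustration; not part of the ZOOKEEPER-2171 patch.
final class HostStringSketch {
    // Formats "host:port" using getHostString(), which never does a reverse lookup.
    static String format(InetSocketAddress addr) {
        return addr.getHostString() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // Built from an IP literal, so the address carries no hostname.
        InetSocketAddress addr = new InetSocketAddress("10.10.10.1", 2888);
        System.out.println(format(addr)); // prints "10.10.10.1:2888"
    }
}
```

With getHostName() the same call could block on a reverse DNS query for the literal, which is the case this issue is trying to avoid in the QCM.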


> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-29 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520571#comment-14520571
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

Hmm, ant test-core-java passes locally for me but we get this in CI:

{noformat}
Error Message

Address already in use

Stacktrace

java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:344)
at sun.nio.ch.Net.bind(Net.java:336)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:199)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:687)
at 
org.apache.zookeeper.server.ServerCnxnFactory.configure(ServerCnxnFactory.java:75)
at 
org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:152)
at 
org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:145)
at 
org.apache.zookeeper.test.ACLCountTest.testAclCount(ACLCountTest.java:73)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
{noformat}

I'll have it run one more time. 

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521630#comment-14521630
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

done - thanks [~rakeshr]!

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521796#comment-14521796
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

Hmmm, failed again, though at a different place:

{noformat}
Error Message

waiting for server 2 being up

Stacktrace

junit.framework.AssertionFailedError: waiting for server 2 being up
at 
org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
{noformat}

Could this be a bad CI box? Any ideas? cc: [~michim], [~rakeshr]

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521986#comment-14521986
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

Oh well, passes for me locally (again):

{noformat}
...

BUILD SUCCESSFUL
Total time: 47 minutes 44 seconds
{noformat}

So it does sound more like flaky CI :-(

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522086#comment-14522086
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

That was test-core-java, ant test (the full suite) passes too:

{noformat}
...

BUILD SUCCESSFUL
Total time: 52 minutes 9 seconds

{noformat}

I'll resubmit the patch, cancel & submit and trigger the build again. Is there 
an easier way to re-kick the build?

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: ZOOKEEPER-2171.patch

trying to restart the build 

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Commented] (ZOOKEEPER-2176) unclear error message should be info or warn

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522294#comment-14522294
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2176:
---

Well, changing a logging statement triggered a build failure. Clearly there's 
something up w/ CI. 

> unclear error message should be info or warn
> 
>
> Key: ZOOKEEPER-2176
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2176
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: quorum
>Affects Versions: 3.5.0, 3.5.1, 3.5.2
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Attachments: ZOOKEEPER-2176.patch
>
>
> Hi [~shralex],
> Looking at the CI output of ZOOKEEPER-2163 I see this:
> {noformat}
>  [exec] [junit] 2015-04-17 17:36:23,750 [myid:] - ERROR 
> [QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:11235)(secure=disabled):QuorumPeer@1394]
>  - writeToDisk == true but configFilename == null
> {noformat}
> Though looking at QuorumPeer#setQuorumVerifier I see:
> {noformat}
> if (configFilename != null) {
> try {
> String dynamicConfigFilename = makeDynamicConfigFilename(
> qv.getVersion());
> QuorumPeerConfig.writeDynamicConfig(
> dynamicConfigFilename, qv, false);
> QuorumPeerConfig.editStaticConfig(configFilename,
> dynamicConfigFilename,
> needEraseClientInfoFromStaticConfig());
> } catch (IOException e) {
> LOG.error("Error closing file: ", e.getMessage());
> }
> } else {
> LOG.error("writeToDisk == true but configFilename == null");
> }
> {noformat}
> there's no proper error handling so I guess maybe we should just make it a 
> warning? Thoughts?





[jira] [Commented] (ZOOKEEPER-1866) ClientBase#createClient is failing frequently

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522315#comment-14522315
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1866:
---

Hi [~abranzyck], [~fpj], & [~rakeshr]:

ZOOKEEPER-1872 has been merged for the 3.4 branch - is this still happening? 
I'd like to get things going for the 3.4.7 release. Thanks!

> ClientBase#createClient is failing frequently
> -
>
> Key: ZOOKEEPER-1866
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1866
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: tests
>Affects Versions: 3.4.5
>Reporter: Rakesh R
>Assignee: Germán Blanco
> Fix For: 3.4.7
>
> Attachments: ZOOKEEPER-1866.patch
>
>
> The following failure pattern has been observed many times in the Windows build. 
> After creating the zookeeper client, the respective connection bean is not 
> available in the jmx beans and is failing the tests.
> {code}
> [junit] 2014-01-22 08:58:22,625 [myid:] - INFO  [main:ZKTestCase$1@65] - 
> FAILED testInvalidVersion
> [junit] junit.framework.AssertionFailedError: expected 
> [0x143b92b0333] expected:<1> but was:<0>
> [junit]   at junit.framework.Assert.fail(Assert.java:47)
> [junit]   at junit.framework.Assert.failNotEquals(Assert.java:283)
> [junit]   at junit.framework.Assert.assertEquals(Assert.java:64)
> [junit]   at junit.framework.Assert.assertEquals(Assert.java:195)
> [junit]   at org.apache.zookeeper.test.JMXEnv.ensureAll(JMXEnv.java:124)
> [junit]   at 
> org.apache.zookeeper.test.ClientBase.createClient(ClientBase.java:191)
> [junit]   at 
> org.apache.zookeeper.test.ClientBase.createClient(ClientBase.java:171)
> [junit]   at 
> org.apache.zookeeper.test.ClientBase.createClient(ClientBase.java:156)
> [junit]   at 
> org.apache.zookeeper.test.ClientBase.createClient(ClientBase.java:149)
> [junit]   at 
> org.apache.zookeeper.test.MultiTransactionTest.setUp(MultiTransactionTest.java:60)
> {code}





[jira] [Commented] (ZOOKEEPER-1868) Server not coming back up in QuorumZxidSyncTest

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522330#comment-14522330
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1868:
---

Hi [~fpj],

is this still happening for you? A few things have been merged to the 3.4 
branch, so maybe it went away. 

> Server not coming back up in QuorumZxidSyncTest
> ---
>
> Key: ZOOKEEPER-1868
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1868
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Flavio Junqueira
> Fix For: 3.4.7
>
> Attachments: QuorumZxidSyncTest-output.txt
>
>
> We got this stack trace:
> {noformat}
> [junit] 2014-01-27 09:14:08,481 [myid:] - INFO  [main:ZKTestCase$1@65] - 
> FAILED testLateLogs
> [junit] java.lang.AssertionError: waiting for server up
> [junit]   at org.junit.Assert.fail(Assert.java:91)
> [junit]   at org.junit.Assert.assertTrue(Assert.java:43)
> [junit]   at 
> org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:188)
> [junit]   at 
> org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:113)
> [junit]   at 
> org.apache.zookeeper.test.QuorumZxidSyncTest.testLateLogs(QuorumZxidSyncTest.java:116)
> {noformat}
> which occurs here, when we stop the servers and restart them.
> {noformat}
> qb.shutdownServers();
> qb.startServers();
> {noformat}





[jira] [Commented] (ZOOKEEPER-2033) zookeeper follower fails to start after a restart immediately following a new epoch

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522342#comment-14522342
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2033:
---

Patch generally lgtm. Could you take a look [~fpj]? I'll try to restart CI as 
well. Thanks!

> zookeeper follower fails to start after a restart immediately following a new 
> epoch
> ---
>
> Key: ZOOKEEPER-2033
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2033
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.6
>Reporter: Asad Saeed
>Assignee: Asad Saeed
> Fix For: 3.4.7
>
> Attachments: ZOOKEEPER-2033-3.4.patch, ZOOKEEPER-2033.patch
>
>
> The following issue was seen when adding a new node to a zookeeper cluster.
> Reproduction steps
> 1. Create a 2 node ensemble. Write some keys.
> 2. Add another node to the ensemble by modifying the config. Restart the 
> entire cluster.
> 3. Restart the new node before writing any new keys.
> What occurs is that the new node gets a SNAP from the newly elected leader, 
> since it is too far behind. The zxid for this snapshot is from the new epoch 
> but that is not in the committed log cache.
> On restart of this new node, the follower sends the new epoch zxid. The 
> leader looks at its maxCommitted logs, and sees that it is not the newest 
> epoch, and therefore sends a TRUNC.
> The follower sees the TRUNC but it only has a snapshot, so it cannot truncate!





[jira] [Commented] (ZOOKEEPER-1402) Upload Zookeeper package to Maven Central

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522391#comment-14522391
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1402:
---

Potentially slightly related: 
https://issues.apache.org/jira/browse/ZOOKEEPER-2177. 

That being said, this is not a blocker for 3.4.7 per se but something we need 
to do afterwards. 

> Upload Zookeeper package to Maven Central
> -
>
> Key: ZOOKEEPER-1402
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1402
> Project: ZooKeeper
>  Issue Type: Improvement
>Affects Versions: 3.3.4
>Reporter: Igor Lazebny
>Assignee: Flavio Junqueira
>Priority: Minor
> Fix For: 3.4.7
>
>
> It would be great to make Zookeeper package available in Maven Central as 
> other Apache projects do (Camel, CXF, ActiveMQ, Karaf, etc).
> That would simplify usage of this package in maven builds.





[jira] [Commented] (ZOOKEEPER-1853) zkCli.sh can't issue a CREATE command containing spaces in the data

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522531#comment-14522531
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1853:
---

Thanks for the patch [~ryanlamore]! Could you generate one against the 3.4 
branch, though? The one you uploaded seems to be against trunk.

Other than that, lgtm +1 



> zkCli.sh can't issue a CREATE command containing spaces in the data
> ---
>
> Key: ZOOKEEPER-1853
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1853
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.6, 3.5.0
>Reporter: sekine coulibaly
>Assignee: Ryan Lamore
>Priority: Minor
>  Labels: patch
> Fix For: 3.4.7, 3.5.2
>
> Attachments: ZOOKEEPER-1853.patch, ZOOKEEPER-1853.patch, 
> ZOOKEEPER-1853.patch, ZkSpaceMan.java
>
>
> Execute the following command in zkCli.sh :
> create /contacts/1  {"country":"CA","name":"De La Salle"}
> The result is that only {"id":1,"fullname":"De is stored.
> The expected result is to have the full JSON payload stored.
> The CREATE command seems to be cropped after the first space of the data 
> payload. When issuing a create command, all arguments other than -s and -e 
> shall be treated as the actual data.
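
The behaviour requested in the description could be sketched roughly as follows. This is only an illustration of the reporter's proposal, not the attached patch: the class CreateArgsSketch and the method dataFrom are hypothetical names, and the sketch assumes the command line has already been split on whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "everything that is not -s or -e is data".
final class CreateArgsSketch {
    // tokens[0] is the command name; the first non-flag token after it is the path.
    static String dataFrom(String[] tokens) {
        List<String> parts = new ArrayList<>();
        boolean pathSeen = false;
        for (int i = 1; i < tokens.length; i++) {
            String t = tokens[i];
            if (t.equals("-s") || t.equals("-e")) {
                continue; // flag, not data
            }
            if (!pathSeen) {
                pathSeen = true; // skip the path itself
                continue;
            }
            parts.add(t);
        }
        // Re-join with single spaces; runs of whitespace in the input are not preserved.
        return String.join(" ", parts);
    }
}
```

Two limitations of this sketch: repeated spaces inside the payload collapse to one, and a payload token that is literally -s or -e would be dropped. Quote-aware tokenizing would avoid both.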





[jira] [Assigned] (ZOOKEEPER-1884) zkCli silently ignores commands with missing parameters

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales reassigned ZOOKEEPER-1884:
-

Assignee: Raul Gutierrez Segales  (was: Flavio Junqueira)

> zkCli silently ignores commands with missing parameters
> ---
>
> Key: ZOOKEEPER-1884
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1884
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.6
>Reporter: Flavio Junqueira
>Assignee: Raul Gutierrez Segales
>Priority: Minor
> Fix For: 3.4.7
>
>
> Apparently, we have fixed this in trunk, but not in the 3.4 branch. When we 
> pass only the path to create, the command is not executed because it expects 
> an additional parameter and there is no error message because the create 
> command exists.





[jira] [Updated] (ZOOKEEPER-1884) zkCli silently ignores commands with missing parameters

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-1884:
--
Summary: zkCli silently ignores commands with missing parameters  (was: 
zkCli silently ignores a create when only the path is given)

> zkCli silently ignores commands with missing parameters
> ---
>
> Key: ZOOKEEPER-1884
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1884
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.6
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
>Priority: Minor
> Fix For: 3.4.7
>
>
> Apparently, we have fixed this in trunk, but not in the 3.4 branch. When we 
> pass only the path to create, the command is not executed because it expects 
> an additional parameter and there is no error message because the create 
> command exists.





[jira] [Updated] (ZOOKEEPER-1884) zkCli silently ignores commands with missing parameters

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-1884:
--
Attachment: ZOOKEEPER-1884.patch

More generally, zkCli was not issuing warnings when the command existed *but* 
the number of parameters was wrong. This patch changes things so that usage() 
is called if there's no match (regardless of the command being known or not).

cc: [~michim], [~hdeng], [~fpj]
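
As a rough sketch of the idea (not the code in the patch): treat "unknown command" and "known command with too few parameters" the same way, so that both end in usage(). The class CliAritySketch, its command table, and the returned strings are hypothetical and only serve the illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical dispatcher; the table below is illustrative, not zkCli's real one.
final class CliAritySketch {
    // Minimum number of tokens, command name included, for each known command.
    private static final Map<String, Integer> MIN_TOKENS = new HashMap<>();
    static {
        MIN_TOKENS.put("create", 3); // create path data
        MIN_TOKENS.put("get", 2);    // get path
        MIN_TOKENS.put("ls", 2);     // ls path
    }

    // True only if the command is known and has enough parameters.
    static boolean accepts(String[] tokens) {
        if (tokens.length == 0) {
            return false;
        }
        Integer min = MIN_TOKENS.get(tokens[0]);
        return min != null && tokens.length >= min;
    }

    // Returns "usage" whenever there is no match, whether or not the command is known.
    static String process(String line) {
        String[] tokens = line.trim().split("\\s+");
        return accepts(tokens) ? "run " + tokens[0] : "usage";
    }
}
```

With this shape, "create /foo" yields "usage" instead of being silently ignored.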

> zkCli silently ignores commands with missing parameters
> ---
>
> Key: ZOOKEEPER-1884
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1884
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.6
>Reporter: Flavio Junqueira
>Assignee: Raul Gutierrez Segales
>Priority: Minor
> Fix For: 3.4.7
>
> Attachments: ZOOKEEPER-1884.patch
>
>
> Apparently, we have fixed this in trunk, but not in the 3.4 branch. When we 
> pass only the path to create, the command is not executed because it expects 
> an additional parameter and there is no error message because the create 
> command exists.





[jira] [Commented] (ZOOKEEPER-2174) JUnit4ZKTestRunner logs test failure for all exceptions even if the test method is annotated with an expected exception.

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522562#comment-14522562
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2174:
---

Thanks for the patch [~cnauroth]! Could you generate one for the 3.4 branch as 
well please?

[~rakeshr]: shall I commit the available one to trunk && 3.5?

> JUnit4ZKTestRunner logs test failure for all exceptions even if the test 
> method is annotated with an expected exception.
> 
>
> Key: ZOOKEEPER-2174
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2174
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Fix For: 3.4.7, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2174.001.patch, ZOOKEEPER-2174.002.patch, 
> ZOOKEEPER-2174.003.patch, ZOOKEEPER-2174.004.patch
>
>
> {{JUnit4ZKTestRunner}} wraps JUnit test method execution, and if any 
> exception is thrown, it logs a message stating that the test failed.  
> However, some ZooKeeper tests are annotated with {{@Test(expected=...)}} to 
> indicate that an exception is the expected result, and thus the test passes.  
> The runner should be aware of expected exceptions and only log if an 
> unexpected exception occurs.
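
The decision described above could look roughly like the following. This is a sketch under assumptions, not the attached patch: ExpectedExceptionSketch and shouldLogFailure are hypothetical names, and a null expected type stands in for a test that declares no expected exception.

```java
// Hypothetical helper; a real runner would read the expected type from the @Test annotation.
final class ExpectedExceptionSketch {
    // Returns true when the thrown exception should be logged as a test failure.
    static boolean shouldLogFailure(Throwable thrown, Class<? extends Throwable> expected) {
        // No expectation declared: every exception is a failure.
        if (expected == null) {
            return true;
        }
        // An instance of the expected type (or a subclass) means the test passed.
        return !expected.isInstance(thrown);
    }
}
```

For example, a test expecting IOException that throws FileNotFoundException would not be logged as failed, while one that throws IllegalStateException would.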





[jira] [Commented] (ZOOKEEPER-1274) Support child watches to be displayed with 4 letter zookeeper commands (i.e. wchs, wchp and wchc)

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522574#comment-14522574
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1274:
---

Hi [~cnauroth] - it would be great if you could rebase the patch on trunk. 
Also, as Patrick mentioned, it would be nice to extend the command exposed 
through the Jetty server so that it has the same functionality. Thanks!

> Support child watches to be displayed with 4 letter zookeeper commands (i.e. 
> wchs, wchp and wchc)
> -
>
> Key: ZOOKEEPER-1274
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1274
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
> Environment: Zookeeper Server
>Reporter: amith
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.2, 3.6.0
>
> Attachments: 
> 0001-ZOOKEEPER-1274.-Display-child-watches-info-in-watch-.patch, 
> ZOOKEEPER-1274.patch
>
>
> Currently only data watchers (created by exists() and getData()) are 
> displayed by the wchs, wchp and wchc 4 letter commands. 
> It would be useful to get the information related to child watchers 
> (getChildren()) also with 4 letter words.





[jira] [Commented] (ZOOKEEPER-2174) JUnit4ZKTestRunner logs test failure for all exceptions even if the test method is annotated with an expected exception.

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522815#comment-14522815
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2174:
---

Thanks [~cnauroth]!

[~rakeshr], [~hdeng], [~michim]: can I get a +1 for the 3.4 patch as well, and 
then I'll push to all branches - thanks!

> JUnit4ZKTestRunner logs test failure for all exceptions even if the test 
> method is annotated with an expected exception.
> 
>
> Key: ZOOKEEPER-2174
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2174
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Fix For: 3.4.7, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2174-branch-3.4.004.patch, 
> ZOOKEEPER-2174.001.patch, ZOOKEEPER-2174.002.patch, ZOOKEEPER-2174.003.patch, 
> ZOOKEEPER-2174.004.patch
>
>
> {{JUnit4ZKTestRunner}} wraps JUnit test method execution, and if any 
> exception is thrown, it logs a message stating that the test failed.  
> However, some ZooKeeper tests are annotated with {{@Test(expected=...)}} to 
> indicate that an exception is the expected result, and thus the test passes.  
> The runner should be aware of expected exceptions and only log if an 
> unexpected exception occurs.





[jira] [Updated] (ZOOKEEPER-1927) zkServer.sh fails to read dataDir (and others) from zoo.cfg on Solaris 10 (grep issue, manifests as FAILED TO WRITE PID).

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-1927:
--
Fix Version/s: 3.4.7

> zkServer.sh fails to read dataDir (and others) from zoo.cfg on Solaris 10 
> (grep issue, manifests as FAILED TO WRITE PID).  
> ---
>
> Key: ZOOKEEPER-1927
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1927
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.4.6
> Environment: Solaris 5.10 
>Reporter: Ed Schmed
>Assignee: Chris Nauroth
> Fix For: 3.4.7
>
> Attachments: ZOOKEEPER-1927.001.patch
>
>
> Fails to write PID file with a permissions error, because the startup script 
> fails to read the dataDir variable from zoo.cfg, and then tries to use the 
> drive root ( / ) as the data dir.
> Tracked the problem down to line 84 of zkServer.sh:
> ZOO_DATADIR="$(grep "^[[:space:]]*dataDir" "$ZOOCFG" | sed -e 's/.*=//')"
> If I run just that line and point it right at the config file, ZOO_DATADIR is 
> empty.
> If I remove [[:space:]]* from the grep:
> ZOO_DATADIR="$(grep "^dataDir" "$ZOOCFG" | sed -e 's/.*=//')"
> Then it works fine. (If I also make the same change on lines 164 and 169.)
> My regex skills are pretty bad, so I'm afraid to comment on why [[:space:]]* 
> needs to be in there?





[jira] [Commented] (ZOOKEEPER-1927) zkServer.sh fails to read dataDir (and others) from zoo.cfg on Solaris 10 (grep issue, manifests as FAILED TO WRITE PID).

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522817#comment-14522817
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1927:
---

+1, thanks [~cnauroth].

[~michim], [~rakeshr], [~hdeng]: it'd be nice to have this for 3.4.7, I'll 
merge once I get enough +1s. 

> zkServer.sh fails to read dataDir (and others) from zoo.cfg on Solaris 10 
> (grep issue, manifests as FAILED TO WRITE PID).  
> ---
>
> Key: ZOOKEEPER-1927
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1927
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.4.6
> Environment: Solaris 5.10 
>Reporter: Ed Schmed
>Assignee: Chris Nauroth
> Fix For: 3.4.7
>
> Attachments: ZOOKEEPER-1927.001.patch
>
>
> Fails to write PID file with a permissions error, because the startup script 
> fails to read the dataDir variable from zoo.cfg, and then tries to use the 
> drive root ( / ) as the data dir.
> Tracked the problem down to line 84 of zkServer.sh:
> ZOO_DATADIR="$(grep "^[[:space:]]*dataDir" "$ZOOCFG" | sed -e 's/.*=//')"
> If I run just that line and point it right at the config file, ZOO_DATADIR is 
> empty.
> If I remove [[:space:]]* from the grep:
> ZOO_DATADIR="$(grep "^dataDir" "$ZOOCFG" | sed -e 's/.*=//')"
> Then it works fine. (If I also make the same change on lines 164 and 169.)
> My regex skills are pretty bad, so I'm afraid to comment on why [[:space:]]* 
> needs to be in there?





[jira] [Commented] (ZOOKEEPER-1077) C client lib doesn't build on Solaris

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522820#comment-14522820
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1077:
---

Thanks [~cnauroth] - could you backport this to the 3.4 branch as well? It 
would be nice to have this for the upcoming 3.4.7 release.

> C client lib doesn't build on Solaris
> -
>
> Key: ZOOKEEPER-1077
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1077
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, c client
>Affects Versions: 3.3.4
> Environment: uname -a: SunOS [redacted] 5.10 Generic_142910-17 i86pc 
> i386 i86pc
> GNU toolchain (gcc 3.4.3, GNU Make etc.)
>Reporter: Tadeusz Andrzej Kadłubowski
>Assignee: Chris Nauroth
>Priority: Critical
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-1077.001.patch, zookeeper.patch
>
>
> Hello,
> Some minor trouble with building ZooKeeper C client library on 
> Sun^H^H^HOracle Solaris 5.10.
> 1. You need to link against "-lnsl -lsocket"
> 2. ctime_r needs a buffer size. The signature is: "char *ctime_r(const time_t 
> *clock, char *buf, int buflen)"
> 3. In zk_log.c you need to manually cast pid_t to int (-Werror can be 
> cumbersome ;) )
> 4. getpwuid_r() returns a pointer to struct passwd on Solaris; on Linux that 
> pointer is delivered through the last parameter.
> Solaris signature: struct passwd *getpwuid_r(uid_t  uid,  struct  passwd  
> *pwd, char *buffer, int  buflen); 
> Linux signature: int getpwuid_r(uid_t uid, struct passwd *pwd, char *buf, 
> size_t buflen, struct passwd **result);





[jira] [Commented] (ZOOKEEPER-1077) C client lib doesn't build on Solaris

2015-04-30 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522824#comment-14522824
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1077:
---

A few comments about the patch:

* "cast the pid to long for use by the snprintf format string." - do we want 
to do this on all platforms?
* why is:

{noformat}
#define MSG_NOSIGNAL SO_NOSIGPIPE
{noformat}

dropped for Apple?

> C client lib doesn't build on Solaris
> -
>
> Key: ZOOKEEPER-1077
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1077
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, c client
>Affects Versions: 3.3.4
> Environment: uname -a: SunOS [redacted] 5.10 Generic_142910-17 i86pc 
> i386 i86pc
> GNU toolchain (gcc 3.4.3, GNU Make etc.)
>Reporter: Tadeusz Andrzej Kadłubowski
>Assignee: Chris Nauroth
>Priority: Critical
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-1077.001.patch, zookeeper.patch
>
>
> Hello,
> Some minor trouble with building ZooKeeper C client library on 
> Sun^H^H^HOracle Solaris 5.10.
> 1. You need to link against "-lnsl -lsocket"
> 2. ctime_r needs a buffer size. The signature is: "char *ctime_r(const time_t 
> *clock, char *buf, int buflen)"
> 3. In zk_log.c you need to manually cast pid_t to int (-Werror can be 
> cumbersome ;) )
> 4. getpwuid_r() returns a pointer to struct passwd on Solaris; on Linux that 
> pointer is delivered through the last parameter.
> Solaris signature: struct passwd *getpwuid_r(uid_t  uid,  struct  passwd  
> *pwd, char *buffer, int  buflen); 
> Linux signature: int getpwuid_r(uid_t uid, struct passwd *pwd, char *buf, 
> size_t buflen, struct passwd **result);





[jira] [Reopened] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails

2015-05-01 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales reopened ZOOKEEPER-1506:
---

Looks like we closed this one by accident, there hasn't been a backport to 3.4 
yet. I'll prepare a patch.

cc: [~michim]

> Re-try DNS hostname -> IP resolution if node connection fails
> -
>
> Key: ZOOKEEPER-1506
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1506
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.5
> Environment: Ubuntu 11.04 64-bit
>Reporter: Mike Heffner
>Assignee: Michi Mutsuzaki
>Priority: Critical
>  Labels: patch
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-1506-fix.patch, ZOOKEEPER-1506.patch, 
> ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, 
> ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, 
> zk-dns-caching-refresh.patch
>
>
> In our zoo.cfg we use hostnames to identify the ZK servers that are part of 
> an ensemble. These hostnames are configured with a low (<= 60s) TTL and the 
> IP address they map to can and does change. Our procedure for 
> replacing/upgrading a ZK node is to boot an entirely new instance and remap 
> the hostname to the new instance's IP address. Our expectation is that when 
> the original ZK node is terminated/shutdown, the remaining nodes in the 
> ensemble would reconnect to the new instance.
> However, what we are noticing is that the remaining ZK nodes do not attempt 
> to re-resolve the hostname->IP mapping for the new server. Once the original 
> ZK node is terminated, the existing servers continue to attempt contacting it 
> at the old IP address. It would be great if the ZK servers could try to 
> re-resolve the hostname when attempting to connect to a lost ZK server, 
> instead of caching the lookup indefinitely. Currently we must do a rolling 
> restart of the ZK ensemble after swapping a node -- which at three nodes 
> means we periodically lose quorum.
> The exact method we are following is to boot new instances in EC2 and attach 
> one of a set of three Elastic IP addresses. External to EC2 this IP address 
> remains the same and maps to whatever instance it is attached to. Internal to 
> EC2, the elastic IP hostname has a TTL of about 45-60 seconds and is remapped 
> to the internal (10.x.y.z) address of the instance it is attached to. 
> Therefore, in our case we would like ZK to pick up the new 10.x.y.z address 
> that the elastic IP hostname gets mapped to and reconnect appropriately.
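
A minimal Java sketch of the behaviour requested above, under stated assumptions; it is not the code from the attached patches. An `InetSocketAddress` built from a host name resolves once at construction and keeps that IP, so picking up a remapped host name means building a fresh instance. `AddressRefresher` and `refresh` are hypothetical names.

```java
import java.net.InetSocketAddress;

final class AddressRefresher {
    private AddressRefresher() {}

    /**
     * Builds a new address from the originally configured host string and port.
     * The constructor performs a forward lookup, so the result reflects the
     * current hostname -> IP mapping rather than the one cached in 'stale'.
     */
    static InetSocketAddress refresh(InetSocketAddress stale) {
        // getHostString() returns the configured name (or literal IP) without a reverse lookup.
        return new InetSocketAddress(stale.getHostString(), stale.getPort());
    }
}
```

A caller would invoke something like `refresh(...)` after a failed connection attempt and retry against the returned address. The JVM also caches lookups (see the `networkaddress.cache.ttl` security property), so a low DNS TTL alone may not be enough.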





[jira] [Commented] (ZOOKEEPER-1626) Zookeeper C client should be tolerant of clock adjustments

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525912#comment-14525912
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1626:
---

Thanks [~cnauroth]! Hmm, do we want to backport this to 3.4?

> Zookeeper C client should be tolerant of clock adjustments 
> ---
>
> Key: ZOOKEEPER-1626
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1626
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: c client
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-1366.001.patch, ZOOKEEPER-1366.002.patch, 
> ZOOKEEPER-1366.003.patch, ZOOKEEPER-1366.004.patch, ZOOKEEPER-1366.006.patch, 
> ZOOKEEPER-1366.007.patch, ZOOKEEPER-1626.patch
>
>
> The Zookeeper C client should use monotonic time when available, in order to 
> be more tolerant of time adjustments.





[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525928#comment-14525928
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

Thanks for the review [~rakeshr]. Though the getHostName calls in the 
LeaderElection recipe are not InetSocketAddress#getHostName calls:

https://github.com/apache/zookeeper/blob/trunk/src/recipes/election/src/java/org/apache/zookeeper/recipes/leader/LeaderOffer.java#L63

Yeah, I'll clean up the calls in those tests as well. 

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 
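
A short Java sketch of one way to avoid the reverse lookup when a host is only needed for logging or comparison; it is an illustration, not the change made in the attached patches. `InetSocketAddress#getHostName()` may trigger a reverse DNS lookup when the address was created from a literal IP, while `getHostString()` (Java 7+) returns the host name or the literal address without one. `PeerAddressFormat` and `describe` are hypothetical names.

```java
import java.net.InetSocketAddress;

final class PeerAddressFormat {
    private PeerAddressFormat() {}

    /** Formats "host:port" without asking DNS for a name. */
    static String describe(InetSocketAddress addr) {
        // getHostString() never performs a reverse lookup, unlike getHostName().
        return addr.getHostString() + ":" + addr.getPort();
    }
}
```

For an address built from `"127.0.0.1"` this yields the literal IP and port; for one built from a host name it yields that name as given.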





[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: ZOOKEEPER-2171.patch

Replace the remaining getHostName calls in the tests.

cc: [~michim], [~rakeshr]

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Updated] (ZOOKEEPER-1077) C client lib doesn't build on Solaris

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-1077:
--
Fix Version/s: 3.4.7

> C client lib doesn't build on Solaris
> -
>
> Key: ZOOKEEPER-1077
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1077
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, c client
>Affects Versions: 3.3.4
> Environment: uname -a: SunOS [redacted] 5.10 Generic_142910-17 i86pc 
> i386 i86pc
> GNU toolchain (gcc 3.4.3, GNU Make etc.)
>Reporter: Tadeusz Andrzej Kadłubowski
>Assignee: Chris Nauroth
>Priority: Critical
> Fix For: 3.4.7, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-1077-branch-3.4.002.patch, 
> ZOOKEEPER-1077.001.patch, ZOOKEEPER-1077.002.patch, zookeeper.patch
>
>
> Hello,
> Some minor trouble with building ZooKeeper C client library on 
> Sun^H^H^HOracle Solaris 5.10.
> 1. You need to link against "-lnsl -lsocket"
> 2. ctime_r needs a buffer size. The signature is: "char *ctime_r(const time_t 
> *clock, char *buf, int buflen)"
> 3. In zk_log.c you need to manually cast pid_t to int (-Werror can be 
> cumbersome ;) )
> 4. getpwuid_r() returns a pointer to struct passwd on Solaris; on Linux that 
> pointer is delivered through the last parameter.
> Solaris signature: struct passwd *getpwuid_r(uid_t  uid,  struct  passwd  
> *pwd, char *buffer, int  buflen); 
> Linux signature: int getpwuid_r(uid_t uid, struct passwd *pwd, char *buf, 
> size_t buflen, struct passwd **result);





[jira] [Commented] (ZOOKEEPER-1077) C client lib doesn't build on Solaris

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525956#comment-14525956
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1077:
---

Thanks for clarifying - makes sense. 

> C client lib doesn't build on Solaris
> -
>
> Key: ZOOKEEPER-1077
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1077
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, c client
>Affects Versions: 3.3.4
> Environment: uname -a: SunOS [redacted] 5.10 Generic_142910-17 i86pc 
> i386 i86pc
> GNU toolchain (gcc 3.4.3, GNU Make etc.)
>Reporter: Tadeusz Andrzej Kadłubowski
>Assignee: Chris Nauroth
>Priority: Critical
> Fix For: 3.4.7, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-1077-branch-3.4.002.patch, 
> ZOOKEEPER-1077.001.patch, ZOOKEEPER-1077.002.patch, zookeeper.patch
>
>
> Hello,
> Some minor trouble with building ZooKeeper C client library on 
> Sun^H^H^HOracle Solaris 5.10.
> 1. You need to link against "-lnsl -lsocket"
> 2. ctime_r needs a buffer size. The signature is: "char *ctime_r(const time_t 
> *clock, char *buf, int buflen)"
> 3. In zk_log.c you need to manually cast pid_t to int (-Werror can be 
> cumbersome ;) )
> 4. getpwuid_r() returns a pointer to struct passwd on Solaris; on Linux that 
> pointer is delivered through the last parameter.
> Solaris signature: struct passwd *getpwuid_r(uid_t  uid,  struct  passwd  
> *pwd, char *buffer, int  buflen); 
> Linux signature: int getpwuid_r(uid_t uid, struct passwd *pwd, char *buf, 
> size_t buflen, struct passwd **result);





[jira] [Commented] (ZOOKEEPER-1077) C client lib doesn't build on Solaris

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525957#comment-14525957
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1077:
---

+1, lgtm. 

[~michim], [~rakeshr], [~fpj]: could I get another review/+1, please? I'll push to 
all branches afterwards. Thanks!

> C client lib doesn't build on Solaris
> -
>
> Key: ZOOKEEPER-1077
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1077
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, c client
>Affects Versions: 3.3.4
> Environment: uname -a: SunOS [redacted] 5.10 Generic_142910-17 i86pc 
> i386 i86pc
> GNU toolchain (gcc 3.4.3, GNU Make etc.)
>Reporter: Tadeusz Andrzej Kadłubowski
>Assignee: Chris Nauroth
>Priority: Critical
> Fix For: 3.4.7, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-1077-branch-3.4.002.patch, 
> ZOOKEEPER-1077.001.patch, ZOOKEEPER-1077.002.patch, zookeeper.patch
>
>
> Hello,
> Some minor trouble with building ZooKeeper C client library on 
> Sun^H^H^HOracle Solaris 5.10.
> 1. You need to link against "-lnsl -lsocket"
> 2. ctime_r needs a buffer size. The signature is: "char *ctime_r(const time_t 
> *clock, char *buf, int buflen)"
> 3. In zk_log.c you need to manually cast pid_t to int (-Werror can be 
> cumbersome ;) )
> 4. getpwuid_r() returns a pointer to struct passwd on Solaris; on Linux that 
> pointer is delivered through the last parameter.
> Solaris signature: struct passwd *getpwuid_r(uid_t  uid,  struct  passwd  
> *pwd, char *buffer, int  buflen); 
> Linux signature: int getpwuid_r(uid_t uid, struct passwd *pwd, char *buf, 
> size_t buflen, struct passwd **result);





[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526090#comment-14526090
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2171:
---

The failure happens in a test that's not related to the updates in the latest 
patch (i.e., the patch changed some tests, but not 
ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig, which is the 
one that failed).

I'll re-kick the process. 

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: ZOOKEEPER-2171.patch

Uploading the same patch to trigger the tests.

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch, 
> ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Commented] (ZOOKEEPER-2124) Allow Zookeeper version string to have underscore '_'

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526236#comment-14526236
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2124:
---

+1, lgtm.

[~cnauroth]: are you planning on backporting this for 3.4 (so that we can 
include it with 3.4.7)? Thanks!

> Allow Zookeeper version string to have underscore '_'
> -
>
> Key: ZOOKEEPER-2124
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2124
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.6
>Reporter: Jerry He
>Assignee: Chris Nauroth
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2124.001.patch
>
>
> Using Bigtop or other RPM build for Zookeeper, there is a problem with using 
> the hyphen '-' character in the version string:
> {noformat}
> [bigdata@bdvs1166 bigtop]$ gradle zookeeper-rpm
> :buildSrc:compileJava UP-TO-DATE
> :buildSrc:compileGroovy UP-TO-DATE
> :buildSrc:processResources UP-TO-DATE
> :buildSrc:classes UP-TO-DATE
> :buildSrc:jar UP-TO-DATE
> :buildSrc:assemble UP-TO-DATE
> :buildSrc:compileTestJava UP-TO-DATE
> :buildSrc:compileTestGroovy UP-TO-DATE
> :buildSrc:processTestResources UP-TO-DATE
> :buildSrc:testClasses UP-TO-DATE
> :buildSrc:test UP-TO-DATE
> :buildSrc:check UP-TO-DATE
> :buildSrc:build UP-TO-DATE
> :zookeeper_vardefines
> :zookeeper-download
> :zookeeper-tar
> Copy /home/bigdata/bigtop/dl/zookeeper-3.4.6-IBM-1.tar.gz to 
> /home/bigdata/bigtop/build/zookeeper/tar/zookeeper-3.4.6-IBM-1.tar.gz
> :zookeeper-srpm
> error: line 64: Illegal char '-' in: Version: 3.4.6-IBM-1
> :zookeeper-srpm FAILED
> FAILURE: Build failed with an exception.
> * Where:
> Script '/home/bigdata/bigtop/packages.gradle' line: 462
> * What went wrong:
> Execution failed for task ':zookeeper-srpm'.
> > Process 'command 'rpmbuild'' finished with non-zero exit value 1
> * Try:
> Run with --stacktrace option to get the stack trace. Run with --info or 
> --debug option to get more log output.
> BUILD FAILED
> {noformat}
> Also, according to the 
> [rpm-maven-plugin|http://mojo.codehaus.org/rpm-maven-plugin/ident-params.html]
>  documentation:
> {noformat}
> version
> The version number to use for the RPM package. By default, this is the 
> project version. This value cannot contain a dash (-) due to contraints in 
> the RPM file naming convention. Any specified value will be truncated at the 
> first dash
> release
> The release number of the RPM.
> Beginning with release 2.0-beta-2, this is an optional parameter. By default, 
> the release will be generated from the modifier portion of the project 
> version using the following rules:
> If no modifier exists, the release will be 1.
> If the modifier ends with SNAPSHOT, the timestamp (in UTC) of the build will 
> be appended to end.
> All instances of '-' in the modifier will be replaced with '_'.
> If a modifier exists and does not end with SNAPSHOT, "_1" will be appended to 
> end.
> {noformat}
> We should allow underscore '_' as part of the version string. e.g. 
> 3.4.6_abc_1
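
A minimal Java sketch of a version check that accepts either separator before the qualifier. The regular expression is an assumption made for illustration, not the pattern from ZOOKEEPER-2124.001.patch, and `VersionStringCheck` / `isValid` are hypothetical names.

```java
import java.util.regex.Pattern;

final class VersionStringCheck {
    // Assumed shape: major.minor.micro, then an optional qualifier introduced by '-' or '_'.
    private static final Pattern VERSION =
        Pattern.compile("^(\\d+)\\.(\\d+)\\.(\\d+)([-_](.+))?$");

    private VersionStringCheck() {}

    /** Returns true when the string matches the assumed version shape. */
    static boolean isValid(String version) {
        return VERSION.matcher(version).matches();
    }
}
```

Under this assumed pattern both `3.4.6-IBM-1` and `3.4.6_abc_1` are accepted, so an RPM build could use the underscore form and avoid the "Illegal char '-'" error shown above.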





[jira] [Reopened] (ZOOKEEPER-2080) ReconfigRecoveryTest fails intermittently

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales reopened ZOOKEEPER-2080:
---
  Assignee: Raul Gutierrez Segales

This just failed for me:

https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2671/testReport/

[~shralex]: maybe you have some ideas?

> ReconfigRecoveryTest fails intermittently
> -
>
> Key: ZOOKEEPER-2080
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2080
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Raul Gutierrez Segales
>Priority: Minor
>
> I got the following test failure on MacBook with trunk code:
> {code}
> Testcase: testCurrentObserverIsParticipantInNewConfig took 93.628 sec
>   FAILED
> waiting for server 2 being up
> junit.framework.AssertionFailedError: waiting for server 2 being up
>   at 
> org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
>   at 
> org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {code}





[jira] [Commented] (ZOOKEEPER-2080) ReconfigRecoveryTest fails intermittently

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526245#comment-14526245
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2080:
---

It passes pretty consistently for me locally:

{noformat}
~/src/zookeeper-svn (ZOOKEEPER-2171) ✔ ant -Dtestcase=ReconfigRecoveryTest 
test-core-java^


BUILD SUCCESSFUL
Total time: 3 minutes 28 seconds
{noformat}

So I am tempted to propose a patch that raises the wait timeout for 'waiting 
for server 2 being up' and call it a day, unless someone has a proposal to 
refactor this test and make it more robust altogether. Thoughts?
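
For reference, a generic Java sketch of the kind of bounded wait being discussed, where the timeout is the only knob that would change. This is not the helper the test actually uses; `WaitUtil` and `waitFor` are hypothetical names, and the 10 ms poll interval is an arbitrary choice.

```java
import java.util.function.BooleanSupplier;

final class WaitUtil {
    private WaitUtil() {}

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs) {
        // nanoTime is monotonic, so wall-clock adjustments do not affect the deadline.
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

Raising the timeout only helps if the server does come up eventually on slow CI machines; if it never does, the failure is simply reported later.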

> ReconfigRecoveryTest fails intermittently
> -
>
> Key: ZOOKEEPER-2080
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2080
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Raul Gutierrez Segales
>Priority: Minor
>
> I got the following test failure on MacBook with trunk code:
> {code}
> Testcase: testCurrentObserverIsParticipantInNewConfig took 93.628 sec
>   FAILED
> waiting for server 2 being up
> junit.framework.AssertionFailedError: waiting for server 2 being up
>   at 
> org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
>   at 
> org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {code}





[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: (was: ZOOKEEPER-2171.patch)

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: (was: ZOOKEEPER-2171.patch)

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: (was: ZOOKEEPER-2171.patch)

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: (was: ZOOKEEPER-2171.patch)

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2171:
--
Attachment: ZOOKEEPER-2171.patch

Another try, given that 
ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig fails in CI 
but passes locally for me. 

> avoid reverse lookups in QuorumCnxManager
> -
>
> Key: ZOOKEEPER-2171
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2171.patch, ZOOKEEPER-2171.patch
>
>
> Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of 
> getHostName() calls in QCM. Besides the overhead, these can cause problems 
> when mixed with failing/mis-configured DNS servers.
> It would be nice to reduce them, if that doesn't affect operational 
> correctness. 





[jira] [Commented] (ZOOKEEPER-2080) ReconfigRecoveryTest fails intermittently

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526266#comment-14526266
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2080:
---

Hmmm, interesting. Maybe there is a straggler thread generating those. I'll 
compare with a successful build. 

> ReconfigRecoveryTest fails intermittently
> -
>
> Key: ZOOKEEPER-2080
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2080
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Raul Gutierrez Segales
>Priority: Minor
>
> I got the following test failure on MacBook with trunk code:
> {code}
> Testcase: testCurrentObserverIsParticipantInNewConfig took 93.628 sec
>   FAILED
> waiting for server 2 being up
> junit.framework.AssertionFailedError: waiting for server 2 being up
>   at org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
>   at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {code}





[jira] [Commented] (ZOOKEEPER-2080) ReconfigRecoveryTest fails intermittently

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526269#comment-14526269
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2080:
---

Actually, it's expected, since we call waitForServerUp in almost all tests, and 
that calls the 'stat' four-letter command:

https://github.com/apache/zookeeper/blob/trunk/src/java/test/org/apache/zookeeper/test/ClientBase.java#L242



> ReconfigRecoveryTest fails intermittently
> -
>
> Key: ZOOKEEPER-2080
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2080
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Raul Gutierrez Segales
>Priority: Minor
>
> I got the following test failure on MacBook with trunk code:
> {code}
> Testcase: testCurrentObserverIsParticipantInNewConfig took 93.628 sec
>   FAILED
> waiting for server 2 being up
> junit.framework.AssertionFailedError: waiting for server 2 being up
>   at org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
>   at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {code}





[jira] [Commented] (ZOOKEEPER-2124) Allow Zookeeper version string to have underscore '_'

2015-05-03 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526311#comment-14526311
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2124:
---

Merged:

trunk: http://svn.apache.org/viewvc?view=revision&revision=1677529
branch-3.5: http://svn.apache.org/viewvc?view=revision&revision=1677530

Leaving open until we get a patch for 3.4. Thanks [~cnauroth]!


> Allow Zookeeper version string to have underscore '_'
> -
>
> Key: ZOOKEEPER-2124
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2124
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.6
>Reporter: Jerry He
>Assignee: Chris Nauroth
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2124.001.patch
>
>
> Using Bigtop or other RPM build for Zookeeper, there is a problem with using 
> the hyphen '-' character in the version string:
> {noformat}
> [bigdata@bdvs1166 bigtop]$ gradle zookeeper-rpm
> :buildSrc:compileJava UP-TO-DATE
> :buildSrc:compileGroovy UP-TO-DATE
> :buildSrc:processResources UP-TO-DATE
> :buildSrc:classes UP-TO-DATE
> :buildSrc:jar UP-TO-DATE
> :buildSrc:assemble UP-TO-DATE
> :buildSrc:compileTestJava UP-TO-DATE
> :buildSrc:compileTestGroovy UP-TO-DATE
> :buildSrc:processTestResources UP-TO-DATE
> :buildSrc:testClasses UP-TO-DATE
> :buildSrc:test UP-TO-DATE
> :buildSrc:check UP-TO-DATE
> :buildSrc:build UP-TO-DATE
> :zookeeper_vardefines
> :zookeeper-download
> :zookeeper-tar
> Copy /home/bigdata/bigtop/dl/zookeeper-3.4.6-IBM-1.tar.gz to 
> /home/bigdata/bigtop/build/zookeeper/tar/zookeeper-3.4.6-IBM-1.tar.gz
> :zookeeper-srpm
> error: line 64: Illegal char '-' in: Version: 3.4.6-IBM-1
> :zookeeper-srpm FAILED
> FAILURE: Build failed with an exception.
> * Where:
> Script '/home/bigdata/bigtop/packages.gradle' line: 462
> * What went wrong:
> Execution failed for task ':zookeeper-srpm'.
> > Process 'command 'rpmbuild'' finished with non-zero exit value 1
> * Try:
> Run with --stacktrace option to get the stack trace. Run with --info or 
> --debug option to get more log output.
> BUILD FAILED
> {noformat}
> Also, according to the 
> [rpm-maven-plugin|http://mojo.codehaus.org/rpm-maven-plugin/ident-params.html]
>  documentation:
> {noformat}
> version
> The version number to use for the RPM package. By default, this is the 
> project version. This value cannot contain a dash (-) due to constraints in 
> the RPM file naming convention. Any specified value will be truncated at the 
> first dash
> release
> The release number of the RPM.
> Beginning with release 2.0-beta-2, this is an optional parameter. By default, 
> the release will be generated from the modifier portion of the project 
> version using the following rules:
> If no modifier exists, the release will be 1.
> If the modifier ends with SNAPSHOT, the timestamp (in UTC) of the build will 
> be appended to end.
> All instances of '-' in the modifier will be replaced with '_'.
> If a modifier exists and does not end with SNAPSHOT, "_1" will be appended to 
> end.
> {noformat}
> We should allow underscore '_' as part of the version string. e.g. 
> 3.4.6_abc_1





[jira] [Commented] (ZOOKEEPER-2163) Introduce new ZNode type: container

2015-05-07 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532847#comment-14532847
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2163:
---

If this works, it would reduce a lot of boilerplate, and it actually makes sense. 
Great idea, [~rakeshr]. 

> Introduce new ZNode type: container
> ---
>
> Key: ZOOKEEPER-2163
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2163
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: c client, java client, server
>Affects Versions: 3.5.0
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.6.0
>
> Attachments: zookeeper-2163.3.patch
>
>
> BACKGROUND
> 
> A recurring problem for ZooKeeper users is garbage collection of parent 
> nodes. Many recipes (e.g. locks, leaders, etc.) call for the creation of a 
> parent node under which participants create sequential nodes. When the 
> participant is done, it deletes its node. In practice, the ZooKeeper tree 
> begins to fill up with orphaned parent nodes that are no longer needed. The 
> ZooKeeper APIs don’t provide a way to clean these. Over time, ZooKeeper can 
> become unstable due to the number of these nodes.
> CURRENT SOLUTIONS
> ===
> Apache Curator has a workaround solution for this by providing the Reaper 
> class which runs in the background looking for orphaned parent nodes and 
> deleting them. This isn’t ideal and it would be better if ZooKeeper supported 
> this directly.
> PROPOSAL
> =
> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow EPHEMERAL nodes 
> to contain child nodes. This is not optimum as EPHEMERALs are tied to a 
> session and the general use case of parent nodes is for PERSISTENT nodes. 
> This proposal adds a new node type, CONTAINER. A CONTAINER node is the same 
> as a PERSISTENT node with the additional property that when its last child is 
> deleted, it is deleted (and CONTAINER nodes recursively up the tree are 
> deleted if empty).
> CANONICAL USAGE
> 
> {code}
> while (true) { // or some reasonable limit
>     try {
>         zk.create(path, ...);
>         break;
>     } catch (KeeperException.NoNodeException e) {
>         try {
>             zk.createContainer(containerPath, ...);
>         } catch (KeeperException.NodeExistsException ignore) {
>         }
>     }
> }
> {code}
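
The CONTAINER semantics in the proposal above can be modelled with a small in-memory sketch. It is only an illustration of the described rule (a CONTAINER is removed once its last child is deleted, recursively up the tree). It is not ZooKeeper code, and all class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory tree; models the proposed rule only, not the server implementation.
final class ContainerTreeSketch {
    enum Kind { PERSISTENT, CONTAINER }

    private final Map<String, Kind> nodes = new HashMap<>();

    ContainerTreeSketch() {
        nodes.put("/", Kind.PERSISTENT);
    }

    private static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i == 0 ? "/" : path.substring(0, i);
    }

    private boolean hasChildren(String path) {
        String prefix = path.equals("/") ? "/" : path + "/";
        for (String p : nodes.keySet()) {
            if (!p.equals(path) && p.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    void create(String path, Kind kind) {
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node exists: " + path);
        }
        if (!nodes.containsKey(parentOf(path))) {
            throw new IllegalStateException("no parent for: " + path);
        }
        nodes.put(path, kind);
    }

    void delete(String path) {
        if (path.equals("/") || !nodes.containsKey(path)) {
            throw new IllegalStateException("cannot delete: " + path);
        }
        if (hasChildren(path)) {
            throw new IllegalStateException("not empty: " + path);
        }
        nodes.remove(path);
        // Walk up the tree and reap every CONTAINER that has just become empty.
        String parent = parentOf(path);
        while (!parent.equals("/")
                && nodes.get(parent) == Kind.CONTAINER
                && !hasChildren(parent)) {
            nodes.remove(parent);
            parent = parentOf(parent);
        }
    }
}
```

With this model, deleting the last lock node under a CONTAINER parent removes the parent as well, while a PERSISTENT parent is left in place.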





[jira] [Commented] (ZOOKEEPER-2163) Introduce new ZNode type: container

2015-05-07 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533759#comment-14533759
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2163:
---

Presumably, we don't need the CreateContainerRequest definition in 
zookeeper.jute anymore? 

> Introduce new ZNode type: container
> ---
>
> Key: ZOOKEEPER-2163
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2163
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: c client, java client, server
>Affects Versions: 3.5.0
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.6.0
>
> Attachments: zookeeper-2163.3.patch, zookeeper-2163.5.patch
>
>
> BACKGROUND
> 
> A recurring problem for ZooKeeper users is garbage collection of parent 
> nodes. Many recipes (e.g. locks, leaders, etc.) call for the creation of a 
> parent node under which participants create sequential nodes. When the 
> participant is done, it deletes its node. In practice, the ZooKeeper tree 
> begins to fill up with orphaned parent nodes that are no longer needed. The 
> ZooKeeper APIs don’t provide a way to clean these. Over time, ZooKeeper can 
> become unstable due to the number of these nodes.
> CURRENT SOLUTIONS
> ===
> Apache Curator has a workaround solution for this by providing the Reaper 
> class which runs in the background looking for orphaned parent nodes and 
> deleting them. This isn’t ideal and it would be better if ZooKeeper supported 
> this directly.
> PROPOSAL
> =
> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow EPHEMERAL nodes 
> to contain child nodes. This is not optimum as EPHEMERALs are tied to a 
> session and the general use case of parent nodes is for PERSISTENT nodes. 
> This proposal adds a new node type, CONTAINER. A CONTAINER node is the same 
> as a PERSISTENT node with the additional property that when its last child is 
> deleted, it is deleted (and CONTAINER nodes recursively up the tree are 
> deleted if empty).
> CANONICAL USAGE
> 
> {code}
> while (true) { // or some reasonable limit
>     try {
>         zk.create(path, ...);
>         break;
>     } catch (KeeperException.NoNodeException e) {
>         try {
>             zk.createContainer(containerPath, ...);
>         } catch (KeeperException.NodeExistsException ignore) {
>         }
>     }
> }
> {code}





[jira] [Commented] (ZOOKEEPER-602) log all exceptions not caught by ZK threads

2015-05-08 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535021#comment-14535021
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-602:
--

I am, initially, -1 on the backport. It's too big, and it would distract from 
pushing forward with making 3.5.0 stable enough. More thoughts would be 
nice though :-)

> log all exceptions not caught by ZK threads
> ---
>
> Key: ZOOKEEPER-602
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-602
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client, server
>Affects Versions: 3.2.1
>Reporter: Patrick Hunt
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, 
> ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, 
> ZOOKEEPER-602.patch, ZOOKEEPER-602.patch
>
>
> the java code should add a ThreadGroup exception handler that logs at ERROR 
> level any uncaught exceptions thrown by Thread run methods.
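
A minimal sketch of the idea in the description, assuming a ThreadGroup subclass that overrides uncaughtException. The class name is hypothetical, and the in-memory list stands in for a real ERROR-level log call. It does not reflect what the attached patches actually do.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical thread group that records uncaught exceptions from its threads.
final class LoggingThreadGroup extends ThreadGroup {
    // Stand-in for LOG.error(...); a real server would use its logging framework.
    final List<String> errors = new CopyOnWriteArrayList<>();

    LoggingThreadGroup(String name) {
        super(name);
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        errors.add(t.getName() + ": " + e);
    }

    // Runs the task on a thread in this group and waits for it to finish.
    void runAndWait(String threadName, Runnable task) {
        Thread t = new Thread(this, task, threadName);
        t.start();
        try {
            t.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Any thread created in the group, and without its own handler, reports through uncaughtException when its run method throws.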





[jira] [Created] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection

2015-05-08 Thread Raul Gutierrez Segales (JIRA)
Raul Gutierrez Segales created ZOOKEEPER-2186:
-

 Summary: QuorumCnxManager#receiveConnection
 Key: ZOOKEEPER-2186
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Reporter: Raul Gutierrez Segales
Assignee: Raul Gutierrez Segales
 Fix For: 3.4.7, 3.5.1, 3.6.0


This will allocate an arbitrarily large byte buffer (and try to read it!):

{code}
public boolean receiveConnection(Socket sock) {
    Long sid = null;
    ...
    sid = din.readLong();
    // next comes the #bytes in the remainder of the message
    int num_remaining_bytes = din.readInt();
    byte[] b = new byte[num_remaining_bytes];
    // remove the remainder of the message from din
    int num_read = din.read(b);
{code}

This will crash the QuorumCnxManager thread, so the cluster will keep going but 
future elections might fail to converge (ditto for leaving/joining members). 

Patch coming up in a bit.
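
For illustration, one way to guard the allocation is to validate the length before allocating and to use readFully instead of read. This is a sketch under assumptions, not the patch itself: the class name, the helper names, and the MAX_BUFFER value are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical sketch of a bounded read; not the code from the attached patch.
final class BoundedInitialRead {
    // Assumed upper bound for the remainder of the initial message.
    static final int MAX_BUFFER = 2048;

    static byte[] readRemainder(DataInputStream din) throws IOException {
        int remaining = din.readInt();
        // Reject lengths that are non-positive or larger than the assumed cap
        // before any buffer is allocated.
        if (remaining <= 0 || remaining > MAX_BUFFER) {
            throw new IOException("Unreasonable buffer length: " + remaining);
        }
        byte[] b = new byte[remaining];
        // read(b) may return fewer bytes than requested; readFully either fills
        // the buffer or throws EOFException.
        din.readFully(b);
        return b;
    }

    // Convenience wrapper: true if the wire bytes parse, false on any IOException.
    static boolean accepts(byte[] wire) {
        try (DataInputStream din = new DataInputStream(new ByteArrayInputStream(wire))) {
            readRemainder(din);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

With a check like this, random input fails with an IOException that the caller can handle by closing the socket, rather than with an oversized allocation.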





[jira] [Updated] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection

2015-05-08 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2186:
--
Affects Version/s: 3.4.6
   3.5.0

> QuorumCnxManager#receiveConnection
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Updated] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-08 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2186:
--
Summary: QuorumCnxManager#receiveConnection may crash with random input  
(was: QuorumCnxManager#receiveConnection)

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Created] (ZOOKEEPER-2187) remove duplicated code between CreateRequest{,2}

2015-05-08 Thread Raul Gutierrez Segales (JIRA)
Raul Gutierrez Segales created ZOOKEEPER-2187:
-

 Summary: remove duplicated code between CreateRequest{,2}
 Key: ZOOKEEPER-2187
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2187
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Reporter: Raul Gutierrez Segales
Assignee: Raul Gutierrez Segales
Priority: Minor
 Fix For: 3.5.2, 3.6.0


To avoid cargo culting and to reduce duplicated code, we can merge most of 
CreateRequest & CreateRequest2 given that only the Response object is actually 
different.

This will improve readability of the code plus make it less confusing for 
people adding new opcodes in the future (i.e.: copying a request definition vs 
reusing what's already there, etc.). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2187) remove duplicated code between CreateRequest{,2}

2015-05-08 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2187:
--
Attachment: ZOOKEEPER-2187.patch

cc: [~hdeng], [~rakeshr], [~randgalt]

> remove duplicated code between CreateRequest{,2}
> 
>
> Key: ZOOKEEPER-2187
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2187
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
>Priority: Minor
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2187.patch
>
>
> To avoid cargo culting and to reduce duplicated code, we can merge most of 
> CreateRequest & CreateRequest2 given that only the Response object is 
> actually different.
> This will improve readability of the code plus make it less confusing for 
> people adding new opcodes in the future (i.e.: copying a request definition 
> vs reusing what's already there, etc.). 





[jira] [Updated] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-08 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2186:
--
Attachment: ZOOKEEPER-2186.patch

cc: [~hdeng], [~phunt]

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Commented] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-09 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536966#comment-14536966
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2186:
---

[~michim]: yes, I'd say so. 

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Updated] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-10 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2186:
--
Attachment: ZOOKEEPER-2186.patch

Handle old servers (i.e., those that don't send the protocolVersion but instead 
just send their sid).



> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Updated] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-10 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2186:
--
Attachment: (was: ZOOKEEPER-2186.patch)

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Updated] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-10 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2186:
--
Attachment: ZOOKEEPER-2186.patch

[~michim]: I think this should be much cleaner. I encapsulated the parsing code 
into the InitialMessage class so we can easily test it independently of the 
QuorumCnxManager socket handling (and opening fewer sockets in our tests 
is usually a good thing!).

I'll add the tests tomorrow.

cc: [~shralex], since this is somehow related to reconfig. 

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Commented] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-11 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538301#comment-14538301
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2186:
---

[~hdeng]: git-review is broken for me today, mind reviewing in the PR:

https://github.com/apache/zookeeper/pull/30

? Thanks!

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Updated] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-13 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2186:
--
Attachment: ZOOKEEPER-2186.patch

Add tests and fix the maxBuffer constant.

cc: [~hdeng], [~michim], [~shralex]

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Updated] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-13 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2186:
--
Attachment: ZOOKEEPER-2186.patch

Document what the InitialMessage is meant to do. 

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch, 
> ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Updated] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2015-05-14 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2186:
--
Attachment: ZOOKEEPER-2186-v3.4.patch

Here's the patch [~michim]. Note that it's much smaller because there is almost 
no parsing going on in 3.4 (i.e.: sending hostname:port is a 3.5/reconfig 
thing). 

Also note that the boundary checks for what has to be read are a bit different 
too, i.e.: 0 remaining bytes is fine because old servers might not send 
anything else besides their sid (afaik). 

cc: [~shralex], [~hdeng]
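
The boundary check described above can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the class name BoundedInitialMessage, the method readRemainder, and the MAX_BUFFER value of 2048 are assumptions made for the example. It rejects a negative or oversized length before allocating, accepts 0 remaining bytes, and uses readFully so that a truncated message surfaces as an EOFException instead of a silent short read.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper; the names and the limit are illustrative, not ZooKeeper's.
final class BoundedInitialMessage {
    // Assumed upper bound for the remainder of the message.
    static final int MAX_BUFFER = 2048;

    // Reads the sid, then the length-prefixed remainder, validating the
    // length before any allocation happens.
    static byte[] readRemainder(DataInputStream din) throws IOException {
        long sid = din.readLong(); // server id; unused in this sketch
        int remaining = din.readInt();
        if (remaining < 0 || remaining > MAX_BUFFER) {
            throw new IOException("Unreasonable buffer length: " + remaining);
        }
        // 0 is accepted: old servers may send nothing besides their sid.
        byte[] b = new byte[remaining];
        din.readFully(b); // throws EOFException if the stream ends early
        return b;
    }
}
```

Throwing an IOException lets the caller close the offending socket and carry on, rather than letting an allocation failure take the listener thread down.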

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186-v3.4.patch, ZOOKEEPER-2186.patch, 
> ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.





[jira] [Commented] (ZOOKEEPER-2183) Change test port assignments to improve uniqueness of ports for multiple concurrent test processes on the same host.

2015-05-14 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543426#comment-14543426
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2183:
---

+1 — thanks [~cnauroth]!

> Change test port assignments to improve uniqueness of ports for multiple 
> concurrent test processes on the same host.
> 
>
> Key: ZOOKEEPER-2183
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2183
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: tests
>Affects Versions: 3.5.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2183.001.patch, ZOOKEEPER-2183.002.patch, 
> ZOOKEEPER-2183.003.patch, ZOOKEEPER-2183.004.patch, ZOOKEEPER-2183.005.patch, 
> threads-change.patch
>
>
> Tests use {{PortAssignment#unique}} for assignment of the ports to bind 
> during tests.  Currently, this method works by using a monotonically 
> increasing counter from a static starting point.  Generally, this is 
> sufficient to achieve uniqueness within a single JVM process, but it does not 
> achieve uniqueness across multiple processes on the same host.  This can 
> cause tests to get bind errors if there are multiple pre-commit jobs running 
> concurrently on the same Jenkins host.  This also prevents running tests in 
> parallel to improve the speed of pre-commit runs.





[jira] [Commented] (ZOOKEEPER-2163) Introduce new ZNode type: container

2015-05-14 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543471#comment-14543471
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2163:
---

(mentioned this to Jordan over IRC, commenting for others):

i am happy with how much cleaner/simpler this is looking though i raised two 
more issues and a couple of nits in the last rb. 

> Introduce new ZNode type: container
> ---
>
> Key: ZOOKEEPER-2163
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2163
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: c client, java client, server
>Affects Versions: 3.5.0
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.6.0
>
> Attachments: zookeeper-2163.3.patch, zookeeper-2163.5.patch, 
> zookeeper-2163.6.patch
>
>
> BACKGROUND
> 
> A recurring problem for ZooKeeper users is garbage collection of parent 
> nodes. Many recipes (e.g. locks, leaders, etc.) call for the creation of a 
> parent node under which participants create sequential nodes. When the 
> participant is done, it deletes its node. In practice, the ZooKeeper tree 
> begins to fill up with orphaned parent nodes that are no longer needed. The 
> ZooKeeper APIs don’t provide a way to clean these. Over time, ZooKeeper can 
> become unstable due to the number of these nodes.
> CURRENT SOLUTIONS
> ===
> Apache Curator has a workaround solution for this by providing the Reaper 
> class which runs in the background looking for orphaned parent nodes and 
> deleting them. This isn’t ideal and it would be better if ZooKeeper supported 
> this directly.
> PROPOSAL
> =
> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow EPHEMERAL nodes 
> to contain child nodes. This is not optimum as EPHEMERALs are tied to a 
> session and the general use case of parent nodes is for PERSISTENT nodes. 
> This proposal adds a new node type, CONTAINER. A CONTAINER node is the same 
> as a PERSISTENT node with the additional property that when its last child is 
> deleted, it is deleted (and CONTAINER nodes recursively up the tree are 
> deleted if empty).
> CANONICAL USAGE
> 
> {code}
> while (true) { // or some reasonable limit
>     try {
>         zk.create(path, ...);
>         break;
>     } catch (KeeperException.NoNodeException e) {
>         try {
>             zk.createContainer(containerPath, ...);
>         } catch (KeeperException.NodeExistsException ignore) {
>         }
>     }
> }
> {code}





[jira] [Commented] (ZOOKEEPER-2190) In StandaloneDisabledTest, testReconfig() shouldn't take leaving servers as joining servers

2015-05-14 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544239#comment-14544239
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2190:
---

+1, i can merge this once it's good to go [~hdeng].

> In StandaloneDisabledTest, testReconfig() shouldn't take leaving servers as 
> joining servers
> ---
>
> Key: ZOOKEEPER-2190
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2190
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
>Reporter: Hongchao Deng
>Assignee: Hongchao Deng
> Attachments: ZOOKEEPER-2190.patch
>
>






[jira] [Updated] (ZOOKEEPER-2190) In StandaloneDisabledTest, testReconfig() shouldn't take leaving servers as joining servers

2015-05-14 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2190:
--
Fix Version/s: 3.6.0
   3.5.2

> In StandaloneDisabledTest, testReconfig() shouldn't take leaving servers as 
> joining servers
> ---
>
> Key: ZOOKEEPER-2190
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2190
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
>Reporter: Hongchao Deng
>Assignee: Hongchao Deng
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2190.patch
>
>






[jira] [Commented] (ZOOKEEPER-2126) Improve exit log messsage of EventThread and SendThread by adding SessionId

2015-05-14 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544897#comment-14544897
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2126:
---

+1.

> Improve exit log messsage of EventThread and SendThread by adding SessionId
> ---
>
> Key: ZOOKEEPER-2126
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2126
> Project: ZooKeeper
>  Issue Type: Improvement
>Affects Versions: 3.6.0
>Reporter: zhihai xu
>Assignee: surendra singh lilhore
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2126.patch, ZOOKEEPER-2126_1.patch, 
> ZOOKEEPER-2126_2.patch
>
>
> We saw the following out-of-order log when closing a ZooKeeper client session.
> {code}
> 2015-02-16 06:01:12,985 INFO org.apache.zookeeper.ZooKeeper: Session: 
> 0x24b8df4044005d4 closed
> .
> 2015-02-16 06:01:12,995 INFO org.apache.zookeeper.ClientCnxn: EventThread 
> shut down
> {code}
> These logs are very confusing if a new ZooKeeper client session is created 
> between the two log lines. We may think the new ZooKeeper client session shut 
> down its EventThread, rather than the old, closed ZooKeeper client session.
> Should we wait for sendThread and eventThread to die in ClientCnxn.close?
> We can add the following code in ClientCnxn.close.
> {code}
> sendThread.join(timeout);
> eventThread.join(timeout);
> {code}
> With this change, we won't interleave the old closed session with the new 
> session. We can also create a new close API to support this, so we won't 
> affect existing code that uses the old close API.
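
To illustrate the proposal quoted above, here is a small self-contained sketch of a close() that waits, with a timeout, for its worker threads to exit before returning. JoiningCloser and its fields are hypothetical stand-ins for the example, not ClientCnxn itself.

```java
// Hypothetical stand-in for a client connection that owns two worker threads.
final class JoiningCloser {
    private final Thread sendThread;
    private final Thread eventThread;
    private volatile boolean closing = false;

    JoiningCloser() {
        Runnable loop = () -> {
            while (!closing) {
                try {
                    Thread.sleep(5);
                } catch (InterruptedException e) {
                    return; // exit promptly if interrupted
                }
            }
        };
        sendThread = new Thread(loop, "send");
        eventThread = new Thread(loop, "event");
        sendThread.start();
        eventThread.start();
    }

    // Signals shutdown, then waits up to timeoutMs for each thread.
    // Returns true only if both threads have actually terminated.
    boolean close(long timeoutMs) throws InterruptedException {
        closing = true;
        sendThread.join(timeoutMs);
        eventThread.join(timeoutMs);
        return !sendThread.isAlive() && !eventThread.isAlive();
    }
}
```

Note that Thread.join(0) waits indefinitely, so a caller that wants a bounded wait has to pass a positive timeout and check isAlive() afterwards, as the return value does here.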





[jira] [Commented] (ZOOKEEPER-602) log all exceptions not caught by ZK threads

2015-05-14 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544907#comment-14544907
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-602:
--

+1, thanks for backporting this [~rakeshr] (i re-read the patch, although i had 
reviewed it for trunk as well).

one note though (applicable to trunk as well):

in ZooKeeperThread#handleException:

{code}
protected void handleException(String thName, Throwable e) {
    LOG.warn("Exception occured from thread {}", thName, e);
}
{code}

i think that should be LOG.error, given it's unhandled. what do you think?

happy to merge this, unless someone else wants to have another look (cc: 
[~phunt], [~hdeng]).


> log all exceptions not caught by ZK threads
> ---
>
> Key: ZOOKEEPER-602
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-602
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client, server
>Affects Versions: 3.2.1
>Reporter: Patrick Hunt
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-602-br3-4.patch, ZOOKEEPER-602.patch, 
> ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, 
> ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, ZOOKEEPER-602.patch
>
>
> the java code should add a ThreadGroup exception handler that logs at ERROR 
> level any uncaught exceptions thrown by Thread run methods.





[jira] [Commented] (ZOOKEEPER-602) log all exceptions not caught by ZK threads

2015-05-16 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546851#comment-14546851
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-602:
--

ah - fair enough. 

> log all exceptions not caught by ZK threads
> ---
>
> Key: ZOOKEEPER-602
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-602
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client, server
>Affects Versions: 3.2.1
>Reporter: Patrick Hunt
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-602-br3-4.patch, ZOOKEEPER-602.patch, 
> ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, 
> ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, ZOOKEEPER-602.patch
>
>
> the java code should add a ThreadGroup exception handler that logs at ERROR 
> level any uncaught exceptions thrown by Thread run methods.





[jira] [Commented] (ZOOKEEPER-602) log all exceptions not caught by ZK threads

2015-05-16 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546926#comment-14546926
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-602:
--

[~fpj], [~hdeng]: could i get another +1 here so i can merge this? thx!

> log all exceptions not caught by ZK threads
> ---
>
> Key: ZOOKEEPER-602
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-602
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client, server
>Affects Versions: 3.2.1
>Reporter: Patrick Hunt
>Assignee: Rakesh R
>Priority: Blocker
> Fix For: 3.4.7, 3.5.0
>
> Attachments: ZOOKEEPER-602-br3-4.patch, ZOOKEEPER-602.patch, 
> ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, 
> ZOOKEEPER-602.patch, ZOOKEEPER-602.patch, ZOOKEEPER-602.patch
>
>
> the java code should add a ThreadGroup exception handler that logs at ERROR 
> level any uncaught exceptions thrown by Thread run methods.





[jira] [Commented] (ZOOKEEPER-1077) C client lib doesn't build on Solaris

2015-05-18 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547640#comment-14547640
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1077:
---

Sorry for the slowness [~cnauroth] - will get this merged today (after another 
+1). 

> C client lib doesn't build on Solaris
> -
>
> Key: ZOOKEEPER-1077
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1077
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, c client
>Affects Versions: 3.3.4
> Environment: uname -a: SunOS [redacted] 5.10 Generic_142910-17 i86pc 
> i386 i86pc
> GNU toolchain (gcc 3.4.3, GNU Make etc.)
>Reporter: Tadeusz Andrzej Kadłubowski
>Assignee: Chris Nauroth
>Priority: Critical
> Fix For: 3.4.7, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-1077-branch-3.4.002.patch, 
> ZOOKEEPER-1077.001.patch, ZOOKEEPER-1077.002.patch, zookeeper.patch
>
>
> Hello,
> Some minor trouble with building ZooKeeper C client library on 
> Sun^H^H^HOracle Solaris 5.10.
> 1. You need to link against "-lnsl -lsocket"
> 2. ctime_r needs a buffer size. The signature is: "char *ctime_r(const time_t 
> *clock, char *buf, int buflen)"
> 3. In zk_log.c you need to manually cast pid_t to int (-Werror can be 
> cumbersome ;) )
> 4. getpwuid_r() returns a pointer to struct passwd, which is passed back 
> through the last parameter on Linux.
> Solaris signature: struct passwd *getpwuid_r(uid_t uid, struct passwd *pwd, 
> char *buffer, int buflen);
> Linux signature: int getpwuid_r(uid_t uid, struct passwd *pwd, char *buf, 
> size_t buflen, struct passwd **result);





[jira] [Commented] (ZOOKEEPER-2156) If JAVA_HOME is not set zk startup and fetching status command execution result misleads user.

2015-05-18 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547724#comment-14547724
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2156:
---

lgtm, except for the indentation in this block:

{code}
-  JAVA=java
+echo "Error: JAVA_HOME is not set and java could not be found in PATH." 1>&2
+exit 1
 fi
{code}

any reason for not indenting that echo statement properly?

> If JAVA_HOME is not set zk startup and fetching status command execution 
> result misleads user.
> --
>
> Key: ZOOKEEPER-2156
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2156
> Project: ZooKeeper
>  Issue Type: Bug
>Reporter: J.Andreina
>Assignee: J.Andreina
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2156.1.patch, ZOOKEEPER-2156.2.patch, 
> ZOOKEEPER-2156.3.patch, ZOOKEEPER-2156.4.patch
>
>
> If JAVA_HOME is not set, the output of the zk startup and status commands 
> misleads the user.
> 1. Even though zk startup has failed because JAVA_HOME is not set, the CLI 
> reports that zk STARTED.
> {noformat}
> #:~/Apr3rd/zookeeper-3.4.6/bin> ./zkServer.sh start
> JMX enabled by default
> Using config: /home/REX/Apr3rd/zookeeper-3.4.6/bin/../conf/zoo.cfg
> Starting zookeeper ... STARTED
> {noformat}
> 2. Fetching zk status when JAVA_HOME is not set reports that the process is 
> not running.
> {noformat}
> #:~/Apr3rd/zookeeper-3.4.6/bin> ./zkServer.sh status
> JMX enabled by default
> Using config: /home/REX/Apr3rd/zookeeper-3.4.6/bin/../conf/zoo.cfg
> Error contacting service. It is probably not running.
> {noformat}





[jira] [Commented] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2015-05-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560408#comment-14560408
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-832:
--

Hmm, given this introduces some risk (i.e.: in a catastrophic failure scenario 
we'd be expiring sessions, which would allow them to reconnect, which might 
mean they'll see invalid data) should this new behavior be opt-in/gated by a 
property?

Another way to handle this would be a purely client-side solution (not sure if this 
was already proposed, haven't checked all the backlog) like Kazoo does it: 
https://github.com/python-zk/kazoo/blob/master/kazoo/client.py#L129. That is, 
have a max_retries param after which the client would give up. By default this 
would be 0 (infinite), so the current behavior would be retained. 
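
A minimal sketch of what such a client-side limit could look like, under the semantics described above (0 means retry forever). BoundedReconnect and connectOnce are hypothetical names for the example; nothing here is part of the ZooKeeper client API or of Kazoo.

```java
import java.util.function.BooleanSupplier;

// Hypothetical client-side retry policy, for illustration only.
final class BoundedReconnect {
    // Calls connectOnce until it succeeds. After the first attempt, at most
    // maxRetries further attempts are made; 0 means retry forever.
    static boolean connect(BooleanSupplier connectOnce, int maxRetries) {
        int retries = 0;
        while (true) {
            if (connectOnce.getAsBoolean()) {
                return true;
            }
            if (maxRetries > 0 && retries >= maxRetries) {
                return false; // give up so the caller can react
            }
            retries++;
        }
    }
}
```

A real client would also wait between attempts; the delay is left out to keep the sketch short. Returning false gives the application a place to decide what to do, for example discard the session and start a new one.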



> Invalid session id causes infinite loop during automatic reconnect
> --
>
> Key: ZOOKEEPER-832
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1, 3.4.5, 3.5.0
> Environment: All
>Reporter: Ryan Holmes
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, 
> ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch
>
>
> Steps to reproduce:
> 1.) Connect to a standalone server using the Java client.
> 2.) Stop the server.
> 3.) Delete the contents of the data directory (i.e. the persisted session 
> data).
> 4.) Start the server.
> The client now automatically tries to reconnect but the server refuses the 
> connection because the session id is invalid. The client and server are now 
> in an infinite loop of attempted and rejected connections. While this 
> situation represents a catastrophic failure and the current behavior is not 
> incorrect, it appears that there is no way to detect this situation on the 
> client and therefore no way to recover.
> The suggested improvement is to send an event to the default watcher 
> indicating that the current state is "session invalid", similar to how the 
> "session expired" state is handled.
> Server log output (repeats indefinitely):
> 2010-08-05 11:48:08,283 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@250] - 
> Accepted socket connection from /127.0.0.1:63292
> 2010-08-05 11:48:08,284 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@751] - Refusing 
> session request for client /127.0.0.1:63292 as it has seen zxid 0x44 our last 
> zxid is 0x0 client must try another server
> 2010-08-05 11:48:08,284 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1434] - Closed 
> socket connection for client /127.0.0.1:63292 (no session established for 
> client)
> Client log output (repeats indefinitely):
> 11:47:17 org.apache.zookeeper.ClientCnxn startConnect INFO line 1000 - 
> Opening socket connection to server localhost/127.0.0.1:2181
> 11:47:17 org.apache.zookeeper.ClientCnxn run WARN line 1120 - Session 
> 0x12a3ae4e893000a for server null, unexpected error, closing socket 
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
> 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1167 - Ignoring 
> exception during shutdown input
> java.nio.channels.ClosedChannelException
>   at 
> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>   at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1164)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)
> 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1174 - Ignoring 
> exception during shutdown output
> java.nio.channels.ClosedChannelException
>   at 
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>   at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1171)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)





[jira] [Commented] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2015-05-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560458#comment-14560458
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-832:
--

Well, most users do set up a handler to create a new session when the 
current/old one expires (e.g. Curator does this; so does Kazoo, etc.).

> Invalid session id causes infinite loop during automatic reconnect
> --
>
> Key: ZOOKEEPER-832
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1, 3.4.5, 3.5.0
> Environment: All
>Reporter: Ryan Holmes
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, 
> ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch
>
>
> Steps to reproduce:
> 1.) Connect to a standalone server using the Java client.
> 2.) Stop the server.
> 3.) Delete the contents of the data directory (i.e. the persisted session 
> data).
> 4.) Start the server.
> The client now automatically tries to reconnect but the server refuses the 
> connection because the session id is invalid. The client and server are now 
> in an infinite loop of attempted and rejected connections. While this 
> situation represents a catastrophic failure and the current behavior is not 
> incorrect, it appears that there is no way to detect this situation on the 
> client and therefore no way to recover.
> The suggested improvement is to send an event to the default watcher 
> indicating that the current state is "session invalid", similar to how the 
> "session expired" state is handled.
> Server log output (repeats indefinitely):
> 2010-08-05 11:48:08,283 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@250] - 
> Accepted socket connection from /127.0.0.1:63292
> 2010-08-05 11:48:08,284 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@751] - Refusing 
> session request for client /127.0.0.1:63292 as it has seen zxid 0x44 our last 
> zxid is 0x0 client must try another server
> 2010-08-05 11:48:08,284 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1434] - Closed 
> socket connection for client /127.0.0.1:63292 (no session established for 
> client)
> Client log output (repeats indefinitely):
> 11:47:17 org.apache.zookeeper.ClientCnxn startConnect INFO line 1000 - 
> Opening socket connection to server localhost/127.0.0.1:2181
> 11:47:17 org.apache.zookeeper.ClientCnxn run WARN line 1120 - Session 
> 0x12a3ae4e893000a for server null, unexpected error, closing socket 
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
> 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1167 - Ignoring 
> exception during shutdown input
> java.nio.channels.ClosedChannelException
>   at 
> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>   at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1164)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)
> 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1174 - Ignoring 
> exception during shutdown output
> java.nio.channels.ClosedChannelException
>   at 
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>   at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1171)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)





[jira] [Commented] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2015-05-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560460#comment-14560460
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-832:
--

If a cluster travels back in time because of server-side intervention by the 
Operator (i.e.: data deletion), it seems that it should require a client-side 
intervention as well to restore things. Or, if we want this to be automatic it 
should be opt-in. 

> Invalid session id causes infinite loop during automatic reconnect
> --
>
> Key: ZOOKEEPER-832
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1, 3.4.5, 3.5.0
> Environment: All
>Reporter: Ryan Holmes
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, 
> ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch
>
>
> Steps to reproduce:
> 1.) Connect to a standalone server using the Java client.
> 2.) Stop the server.
> 3.) Delete the contents of the data directory (i.e. the persisted session 
> data).
> 4.) Start the server.
> The client now automatically tries to reconnect but the server refuses the 
> connection because the session id is invalid. The client and server are now 
> in an infinite loop of attempted and rejected connections. While this 
> situation represents a catastrophic failure and the current behavior is not 
> incorrect, it appears that there is no way to detect this situation on the 
> client and therefore no way to recover.
> The suggested improvement is to send an event to the default watcher 
> indicating that the current state is "session invalid", similar to how the 
> "session expired" state is handled.
> Server log output (repeats indefinitely):
> 2010-08-05 11:48:08,283 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@250] - 
> Accepted socket connection from /127.0.0.1:63292
> 2010-08-05 11:48:08,284 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@751] - Refusing 
> session request for client /127.0.0.1:63292 as it has seen zxid 0x44 our last 
> zxid is 0x0 client must try another server
> 2010-08-05 11:48:08,284 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1434] - Closed 
> socket connection for client /127.0.0.1:63292 (no session established for 
> client)
> Client log output (repeats indefinitely):
> 11:47:17 org.apache.zookeeper.ClientCnxn startConnect INFO line 1000 - 
> Opening socket connection to server localhost/127.0.0.1:2181
> 11:47:17 org.apache.zookeeper.ClientCnxn run WARN line 1120 - Session 
> 0x12a3ae4e893000a for server null, unexpected error, closing socket 
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
> 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1167 - Ignoring 
> exception during shutdown input
> java.nio.channels.ClosedChannelException
>   at 
> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>   at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1164)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)
> 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1174 - Ignoring 
> exception during shutdown output
> java.nio.channels.ClosedChannelException
>   at 
> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>   at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1171)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)
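
To make the loop in the quoted report concrete, here is a minimal sketch of 
the comparison implied by the server log line ("has seen zxid 0x44 our last 
zxid is 0x0"). The class and method names are hypothetical and this is not 
ZooKeeper's actual code; it only assumes that the server refuses a session 
request whenever the client's last seen zxid is ahead of the server's own. 
Because neither value changes between attempts, every retry is refused.

```java
// Simplified, hypothetical model of the check seen in the server log.
// Not ZooKeeper's real implementation; names are made up for illustration.
class ZxidCheckSketch {

    /** True when the server would refuse the session request. */
    static boolean refuses(long clientLastZxidSeen, long serverLastZxid) {
        return clientLastZxidSeen > serverLastZxid;
    }

    /**
     * Counts refused reconnect attempts, capped at maxAttempts so the
     * sketch terminates. The real client has no such cap.
     */
    static int refusedAttempts(long clientLastZxidSeen, long serverLastZxid,
                               int maxAttempts) {
        int refused = 0;
        for (int i = 0; i < maxAttempts; i++) {
            if (!refuses(clientLastZxidSeen, serverLastZxid)) {
                break; // server accepts, loop ends
            }
            refused++; // nothing changed, so the next attempt fails too
        }
        return refused;
    }

    public static void main(String[] args) {
        // Values taken from the log above: client at 0x44, server reset to 0x0.
        System.out.println(refuses(0x44L, 0x0L));
        System.out.println(refusedAttempts(0x44L, 0x0L, 5));
    }
}
```

With the values from the log, all five capped attempts are refused, which is 
the behavior the reporter describes as an infinite loop.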





[jira] [Assigned] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2015-05-26 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales reassigned ZOOKEEPER-832:


Assignee: Raul Gutierrez Segales  (was: Germán Blanco)

> Invalid session id causes infinite loop during automatic reconnect
> --
>
> Key: ZOOKEEPER-832
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1, 3.4.5, 3.5.0
> Environment: All
>Reporter: Ryan Holmes
>Assignee: Raul Gutierrez Segales
>Priority: Blocker
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, 
> ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch
>
>


[jira] [Updated] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2015-05-26 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-832:
-
Assignee: Germán Blanco  (was: Raul Gutierrez Segales)

> Invalid session id causes infinite loop during automatic reconnect
> --
>
> Key: ZOOKEEPER-832
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1, 3.4.5, 3.5.0
> Environment: All
>Reporter: Ryan Holmes
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, 
> ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch
>
>


[jira] [Commented] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2015-05-28 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563054#comment-14563054
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-832:
--

Yeah — that's kind of what I was thinking. So we would have an option like:

{code}
zookeeper.killSessionsWithBadZxid
{code}

which would activate the code in the patch. If that's disabled (the default), 
then things stay as is. And in the future we can turn it on by default, if it 
makes sense to people.

What do you think, Germán? Also, thanks for working on this!
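
For illustration, an opt-in switch like this could be read as a JVM system 
property. This is only a sketch: the property name comes from the comment 
above, but the class, the method names, and the two outcome labels are 
hypothetical, and how the patch would actually wire the flag in is not 
something this thread states.

```java
// Hypothetical sketch of an opt-in flag that defaults to off.
// Only the property name comes from the thread; everything else is assumed.
class KillSessionsFlagSketch {

    static final String PROP = "zookeeper.killSessionsWithBadZxid";

    /** Boolean.getBoolean returns false when the property is unset. */
    static boolean enabled() {
        return Boolean.getBoolean(PROP);
    }

    /**
     * Chooses what to do with a client whose zxid is ahead of the server's.
     * "REFUSE" models today's behavior; "KILL_SESSION" models the patch.
     */
    static String onBadZxid() {
        return enabled() ? "KILL_SESSION" : "REFUSE";
    }

    public static void main(String[] args) {
        System.out.println(PROP + " enabled: " + enabled());
        System.out.println("action: " + onBadZxid());
    }
}
```

Starting the JVM with -Dzookeeper.killSessionsWithBadZxid=true would select 
the new behavior in this sketch; leaving it unset keeps things as they are.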

> Invalid session id causes infinite loop during automatic reconnect
> --
>
> Key: ZOOKEEPER-832
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.5, 3.5.0
> Environment: All
>Reporter: Ryan Holmes
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.4.7, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, 
> ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch
>
>


[jira] [Commented] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2015-05-28 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563059#comment-14563059
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-832:
--

What's your intuition on why the C test is failing? Is it this snippet?

{code}
void setServers(const string new_hosts)
{
    int rc = zoo_set_servers(zh, new_hosts.c_str());
    CPPUNIT_ASSERT_EQUAL((int)ZOK, rc);
}
{code}

Happy to help with the C code. 

> Invalid session id causes infinite loop during automatic reconnect
> --
>
> Key: ZOOKEEPER-832
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.5, 3.5.0
> Environment: All
>Reporter: Ryan Holmes
>Assignee: Germán Blanco
>Priority: Blocker
> Fix For: 3.4.7, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, 
> ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch
>
>

