[jira] [Commented] (ZOOKEEPER-2347) Deadlock shutting down zookeeper

2016-01-11 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093401#comment-15093401
 ] 

Rakesh R commented on ZOOKEEPER-2347:
-

Thanks [~rgs] for the reviews.

bq. One question though, why use NettyServerCnxnFactory for the test instead of 
the NIO one (which is much more used)?
No specific reason. The test scenario has no relation to either Netty or NIO.

bq. Also, how can we validate if the HBase tests now pass?
Some time back Ted updated the HBase test status in JIRA; please see the 
[comments|https://issues.apache.org/jira/browse/ZOOKEEPER-2347?focusedCommentId=15063086&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15063086].
 Thanks [~yuzhih...@gmail.com] for the test results.



[jira] [Commented] (ZOOKEEPER-2347) Deadlock shutting down zookeeper

2016-01-11 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093366#comment-15093366
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2347:
---

It LGTM - thanks [~rakeshr] and [~fpj]. One question though, why use 
NettyServerCnxnFactory for the test instead of the NIO one (which is much more 
used)?

[~cnauroth]: mind taking a look as well?

Also, how can we validate if the HBase tests now pass?

> Deadlock shutting down zookeeper
> 
>
> Key: ZOOKEEPER-2347
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2347
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.7
>Reporter: Ted Yu
>Assignee: Rakesh R
>Priority: Blocker
> Fix For: 3.4.8
>
> Attachments: ZOOKEEPER-2347-br-3.4.patch, 
> ZOOKEEPER-2347-br-3.4.patch, ZOOKEEPER-2347-br-3.4.patch, 
> ZOOKEEPER-2347-br-3.4.patch, testSplitLogManager.stack
>
>
> HBase recently upgraded to zookeeper 3.4.7
> In one of the tests, TestSplitLogManager, there is reproducible hang at the 
> end of the test.
> Below is snippet from stack trace related to zookeeper:
> {code}
> "main-EventThread" daemon prio=5 tid=0x7fd27488a800 nid=0x6f1f waiting on 
> condition [0x00011834b000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007c5b8d3a0> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> "main-SendThread(localhost:59510)" daemon prio=5 tid=0x7fd274eb4000 
> nid=0x9513 waiting on condition [0x000118042000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:101)
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:997)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060)
> "SyncThread:0" prio=5 tid=0x7fd274d02000 nid=0x730f waiting for monitor 
> entry [0x0001170ac000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.zookeeper.server.ZooKeeperServer.decInProcess(ZooKeeperServer.java:512)
>   - waiting to lock <0x0007c5b62128> (a 
> org.apache.zookeeper.server.ZooKeeperServer)
>   at 
> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:144)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
> "main-EventThread" daemon prio=5 tid=0x7fd2753a3800 nid=0x711b waiting on 
> condition [0x000117a3]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007c9b106b8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> "main" prio=5 tid=0x7fd27600 nid=0x1903 in Object.wait() 
> [0x000108aa1000]
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   - waiting on <0x0007c5b66400> (a 
> org.apache.zookeeper.server.SyncRequestProcessor)
>   at java.lang.Thread.join(Thread.java:1281)
>   - locked <0x0007c5b66400> (a 
> org.apache.zookeeper.server.SyncRequestProcessor)
>   at java.lang.Thread.join(Thread.java:1355)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.shutdown(SyncRequestProcessor.java:213)
>   at 
> org.apache.zookeeper.server.PrepRequestProcessor.shutdown(PrepRequestProcessor.java:770)
>   at 
> org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:478)
>   - locked <0x0007c5b62128> (a 
> org.apache.zookeeper.server.ZooKeeperServer)
>   at 
> org.apache.zookeeper.server.NIOServerCnxnFactory.shutdown(NIOServerCnxnFactory.java:266)
>   at 
> org.apache.hadoop.hbase.zookeeper.MiniZooKeeperCluster.shutdown(MiniZooKeeperCluster.java:301)
> {code}
> Note the address (0x0007c5b66400) in the last hunk which seems to 
> indicate some form of deadlock.
> Accord
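
The lock pattern visible in the trace above ("main" holds the ZooKeeperServer monitor while joining the sync thread, and "SyncThread:0" waits for that same monitor in decInProcess) can be reproduced in isolation. The following is a minimal sketch, not ZooKeeper code and not the attached patch; the class, method, and thread names are hypothetical, and a join timeout is used so the demonstration terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;

/** Illustration only: not ZooKeeper code and not the patch attached to this issue. */
final class ShutdownJoinSketch {
    // Stands in for the ZooKeeperServer monitor seen in the trace.
    private final Object serverLock = new Object();

    /** Starts a worker that needs serverLock once, as decInProcess does in the trace. */
    private Thread startWorker(CountDownLatch go) {
        Thread worker = new Thread(() -> {
            try {
                go.await();
            } catch (InterruptedException e) {
                return;
            }
            synchronized (serverLock) {
                // A decInProcess-like step would run here.
            }
        }, "SyncThread-sketch");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    /** Joins the worker while holding serverLock; returns whether it finished in time. */
    boolean joinWhileHoldingLock(long timeoutMs) {
        CountDownLatch go = new CountDownLatch(1);
        Thread worker = startWorker(go);
        try {
            synchronized (serverLock) {
                go.countDown();
                // timeoutMs must be > 0: join(0) waits forever, which is the hang in the trace.
                worker.join(timeoutMs);
                return !worker.isAlive();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Releases serverLock before joining; returns whether the worker finished in time. */
    boolean joinAfterReleasingLock(long timeoutMs) {
        CountDownLatch go = new CountDownLatch(1);
        Thread worker = startWorker(go);
        try {
            synchronized (serverLock) {
                go.countDown(); // only a state change happens under the lock
            }
            worker.join(timeoutMs);
            return !worker.isAlive();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ShutdownJoinSketch s = new ShutdownJoinSketch();
        System.out.println("join under lock finished: " + s.joinWhileHoldingLock(200));
        System.out.println("join after release finished: " + s.joinAfterReleasingLock(2000));
    }
}
```

In the first method the worker cannot make progress until the caller leaves the synchronized block, so an untimed join there would never return. Whether the actual fix narrows the lock scope in this way is not stated in this thread.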

[jira] [Commented] (ZOOKEEPER-2353) QuorumCnxManager protocol needs to be upgradable with-in a specific Version

2016-01-11 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093353#comment-15093353
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2353:
---

Changing the serialization mechanism should probably be decoupled from this. 
We'll probably have to support Jute forever, so we can start with that and 
then explore using protobuf for server-to-server messages.

> QuorumCnxManager protocol needs to be upgradable with-in a specific Version
> ---
>
> Key: ZOOKEEPER-2353
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2353
> Project: ZooKeeper
>  Issue Type: Improvement
>Affects Versions: 3.4.7, 3.5.1
>Reporter: Powell Molleti
>
> Currently 3.5.X sends its hdr as follows:
> {code:title=QuorumCnxManager.java|borderStyle=solid}
> dout.writeLong(PROTOCOL_VERSION);
> dout.writeLong(self.getId());
> String addr = self.getElectionAddress().getHostString() + ":" + 
> self.getElectionAddress().getPort();
> byte[] addr_bytes = addr.getBytes();
> dout.writeInt(addr_bytes.length);
> dout.write(addr_bytes);
> dout.flush();
> {code}
> Since it writes the length of the host and port byte string, there is no simple way to 
> append new fields to this hdr anymore. I.e., the rx side has to consider all 
> bytes after the sid for host and port parsing, which is what it does here:
> [QuorumCnxManager.InitialMessage.parse(): http://bit.ly/1Q0znpW]
> {code:title=QuorumCnxManager.java|borderStyle=solid}
> sid = din.readLong();
> int remaining = din.readInt();
> if (remaining <= 0 || remaining > maxBuffer) {
>     throw new InitialMessageException(
>             "Unreasonable buffer length: %s", remaining);
> }
> byte[] b = new byte[remaining];
> int num_read = din.read(b);
> if (num_read != remaining) {
>     throw new InitialMessageException(
>             "Read only %s bytes out of %s sent by server %s",
>             num_read, remaining, sid);
> }
> // FIXME: IPv6 is not supported. Using something like Guava's HostAndPort
> // parser would be good.
> String addr = new String(b);
> String[] host_port = addr.split(":");
> {code}
> This has been captured in the discussion here: ZOOKEEPER-2186.
> Though it is possible to circumvent this problem by various means, the request 
> here is to design messages with a hdr such that there is no need to bump the 
> version number or hack certain fields (i.e., figure out whether it is the length of 
> the host/port string or the length of a different message, etc., in the above case).
> This is the idea here as captured in ZOOKEEPER-2186.
> {code:java}
> dout.writeLong(PROTOCOL_VERSION);
> String addr = self.getElectionAddress().getHostString() + ":" + 
> self.getElectionAddress().getPort();
> byte[] addr_bytes = addr.getBytes();
> // After version write the total length of msg sent by sender.
> dout.writeInt(Long.BYTES + addr_bytes.length);   
> // Write sid afterwards
> dout.writeLong(self.getId());
> // Write length of host/port string   
> dout.writeInt(addr_bytes.length);
> // Write host/port string   
> dout.write(addr_bytes); 
> {code}
> Since the total length of the message and the length of each variable field are 
> both present, it is quite easy to provide backward compatibility w.r.t. parsing 
> of the message. 
> Older code will read the part of the message it knows and ignore the rest. 
> Newer revisions that want to keep things compatible will only append to the 
> hdr and not change the meaning of current fields.
> I am guessing this was the original intent w.r.t. the introduction of the protocol 
> version here: ZOOKEEPER-1633
> Since 3.4.x code does not parse this and 3.5.x is still in alpha mode, perhaps 
> it is possible to consider this change now?
> Also, I would like to propose carefully considering the option of using 
> protobufs for the next protocol version bump. This will prevent issues like 
> this in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2353) QuorumCnxManager protocol needs to be upgradable with-in a specific Version

2016-01-11 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093302#comment-15093302
 ] 

Alexander Shraer commented on ZOOKEEPER-2353:
-

You're absolutely right, it's a hack. My intention was to get something working 
for the purpose of ZOOKEEPER-107 and get back to it in a separate JIRA, but I 
never got to it, sorry... 

I think using protobufs (or similar) here and elsewhere is a great idea. 
Currently ZooKeeper uses Jute for client-server messages, but apparently 
the intention was also to replace it at some point; see ZOOKEEPER-102. One 
concern may be the impact of such serialization libraries on ZooKeeper 
performance, which needs to be evaluated. There were also 
backward compatibility concerns raised in ZOOKEEPER-102.






[jira] [Commented] (ZOOKEEPER-2353) QuorumCnxManager protocol needs to be upgradable with-in a specific Version

2016-01-11 Thread Akihiro Suda (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093050#comment-15093050
 ] 

Akihiro Suda commented on ZOOKEEPER-2353:
-

ZOOKEEPER-1931 uses protobuf, so I added the link to this issue.
https://github.com/zk1931/jzab/







[jira] [Commented] (ZOOKEEPER-1045) Quorum Peer mutual authentication

2016-01-11 Thread Powell Molleti (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092670#comment-15092670
 ] 

Powell Molleti commented on ZOOKEEPER-1045:
---

Is there much user demand for Kerberos support for inter-ZK channels? Will ZK 
always have to get a token from the KDC first before authenticating a peer? I am 
not very familiar with the SASL Java API; can you shed some light on the 
system-level process? Does this provide encryption of the data traffic using the 
shared secret key?



> Quorum Peer mutual authentication
> -
>
> Key: ZOOKEEPER-1045
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1045
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>Reporter: Eugene Koontz
>Assignee: Rakesh R
> Attachments: ZOOKEEPER-1045-00.patch, ZOOKEEPER-1045-Rolling Upgrade 
> Design Proposal.pdf
>
>
> ZOOKEEPER-938 addresses mutual authentication between clients and servers. 
> This bug, on the other hand, is for authentication among quorum peers. 
> Hopefully much of the work done on SASL integration with Zookeeper for 
> ZOOKEEPER-938 can be used as a foundation for this enhancement.





[jira] [Commented] (ZOOKEEPER-2186) QuorumCnxManager#receiveConnection may crash with random input

2016-01-11 Thread Powell Molleti (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092564#comment-15092564
 ] 

Powell Molleti commented on ZOOKEEPER-2186:
---

ZOOKEEPER-2353

> QuorumCnxManager#receiveConnection may crash with random input
> --
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186-v3.4.patch, ZOOKEEPER-2186.patch, 
> ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.
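
The problem described above (a length taken straight from the wire and used to size a buffer) is usually handled by bounding the length before allocating. The following is a minimal sketch under stated assumptions, not the attached patch: the class and method names and the 2048-byte bound in main are hypothetical. It also uses DataInputStream.readFully, since a single read(byte[]) call is allowed to return fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustration only: a bounded, length-prefixed read. Not the ZooKeeper patch. */
final class BoundedReadSketch {

    /**
     * Reads a 4-byte length followed by that many bytes.
     * Rejects lengths that are non-positive or larger than maxBuffer
     * before any buffer is allocated.
     */
    static byte[] readLengthPrefixed(DataInputStream din, int maxBuffer) throws IOException {
        int remaining = din.readInt();
        if (remaining <= 0 || remaining > maxBuffer) {
            throw new IOException("Unreasonable buffer length: " + remaining);
        }
        byte[] b = new byte[remaining];
        // readFully either fills the whole array or throws EOFException.
        din.readFully(b);
        return b;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "host1:3888".getBytes(StandardCharsets.US_ASCII);
        byte[] wire = ByteBuffer.allocate(Integer.BYTES + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
        byte[] got = readLengthPrefixed(
                new DataInputStream(new ByteArrayInputStream(wire)), 2048);
        System.out.println(new String(got, StandardCharsets.US_ASCII));
    }
}
```

A sender that announces Integer.MAX_VALUE bytes is rejected with an exception the caller can handle, instead of triggering a huge allocation.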





[jira] [Created] (ZOOKEEPER-2353) QuorumCnxManager protocol needs to be upgradable with-in a specific Version

2016-01-11 Thread Powell Molleti (JIRA)
Powell Molleti created ZOOKEEPER-2353:
-

 Summary: QuorumCnxManager protocol needs to be upgradable with-in 
a specific Version
 Key: ZOOKEEPER-2353
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2353
 Project: ZooKeeper
  Issue Type: Improvement
Affects Versions: 3.5.1, 3.4.7
Reporter: Powell Molleti


Currently 3.5.X sends its hdr as follows:

{code:title=QuorumCnxManager.java|borderStyle=solid}
dout.writeLong(PROTOCOL_VERSION);
dout.writeLong(self.getId());
String addr = self.getElectionAddress().getHostString() + ":" + 
self.getElectionAddress().getPort();
byte[] addr_bytes = addr.getBytes();
dout.writeInt(addr_bytes.length);
dout.write(addr_bytes);
dout.flush();
{code}

Since it writes the length of the host and port byte string, there is no simple way to 
append new fields to this hdr anymore. I.e., the rx side has to consider all 
bytes after the sid for host and port parsing, which is what it does here:

[QuorumCnxManager.InitialMessage.parse(): http://bit.ly/1Q0znpW]
{code:title=QuorumCnxManager.java|borderStyle=solid}
sid = din.readLong();

int remaining = din.readInt();
if (remaining <= 0 || remaining > maxBuffer) {
    throw new InitialMessageException(
            "Unreasonable buffer length: %s", remaining);
}

byte[] b = new byte[remaining];
int num_read = din.read(b);

if (num_read != remaining) {
    throw new InitialMessageException(
            "Read only %s bytes out of %s sent by server %s",
            num_read, remaining, sid);
}

// FIXME: IPv6 is not supported. Using something like Guava's HostAndPort
// parser would be good.
String addr = new String(b);
String[] host_port = addr.split(":");
{code}
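
On the FIXME in the snippet above: splitting on every ":" breaks for IPv6 literals, which contain colons themselves. One possible approach, shown here as a minimal sketch with hypothetical names (it is neither Guava's HostAndPort nor ZooKeeper code), is to split at the last colon and strip optional square brackets. It assumes the port is always present, as it is in the hdr written by the sender.

```java
/** Illustration only: split "host:port" at the last colon so IPv6 hosts survive. */
final class HostPortSketch {

    /** Returns {host, port}. Assumes a port is always present after the last ':'. */
    static String[] splitHostPort(String addr) {
        int idx = addr.lastIndexOf(':');
        if (idx <= 0 || idx == addr.length() - 1) {
            throw new IllegalArgumentException("Expected host:port but got: " + addr);
        }
        String host = addr.substring(0, idx);
        // Accept the bracketed form "[2001:db8::1]:3888" as well.
        if (host.startsWith("[") && host.endsWith("]")) {
            host = host.substring(1, host.length() - 1);
        }
        // parseInt throws NumberFormatException for a non-numeric port.
        int port = Integer.parseInt(addr.substring(idx + 1));
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return new String[] {host, Integer.toString(port)};
    }

    public static void main(String[] args) {
        for (String a : new String[] {"host1:3888", "::1:3888", "[2001:db8::1]:3888"}) {
            String[] hp = splitHostPort(a);
            System.out.println(hp[0] + " / " + hp[1]);
        }
    }
}
```

An IPv6 literal sent without a port would be misparsed by this sketch, so it is only suitable where the sender always appends one.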

This has been captured in the discussion here: ZOOKEEPER-2186.
Though it is possible to circumvent this problem by various means, the request 
here is to design messages with a hdr such that there is no need to bump the version 
number or hack certain fields (i.e., figure out whether it is the length of the host/port 
string or the length of a different message, etc., in the above case).

This is the idea here as captured in ZOOKEEPER-2186.
{code:java}
dout.writeLong(PROTOCOL_VERSION);

String addr = self.getElectionAddress().getHostString() + ":" + 
self.getElectionAddress().getPort();
byte[] addr_bytes = addr.getBytes();

// After version write the total length of msg sent by sender.
dout.writeInt(Long.BYTES + addr_bytes.length);   
// Write sid afterwards
dout.writeLong(self.getId());
// Write length of host/port string   
dout.writeInt(addr_bytes.length);
// Write host/port string   
dout.write(addr_bytes); 
{code}

Since the total length of the message and the length of each variable field are 
both present, it is quite easy to provide backward compatibility w.r.t. parsing of 
the message. 
Older code will read the part of the message it knows and ignore the rest. Newer 
revisions that want to keep things compatible will only append to the hdr and 
not change the meaning of current fields.
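
The "read what you know, ignore the rest" behaviour can be sketched as follows. This is a minimal illustration under stated assumptions, not a proposed patch: the class name, the Header type, the PROTOCOL_VERSION value, and MAX_BODY are placeholders. Note one deliberate difference from the snippet above: here the total length counts everything after the length field itself (sid, address-length field, address bytes, and any appended fields), which is an assumption about how the total would be defined.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Illustration only: a length-prefixed hdr that tolerates appended fields. */
final class LengthPrefixedHeaderSketch {
    static final long PROTOCOL_VERSION = -65536L; // placeholder value for this sketch
    static final int MAX_BODY = 2048;             // hypothetical sanity bound

    static final class Header {
        final long sid;
        final String hostPort;
        Header(long sid, String hostPort) { this.sid = sid; this.hostPort = hostPort; }
    }

    /** Writes version, total length, sid, address length, address, then any newer fields. */
    static byte[] write(long sid, String hostPort, byte[] extra) {
        try {
            byte[] addr = hostPort.getBytes(StandardCharsets.UTF_8);
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream dout = new DataOutputStream(bos);
            dout.writeLong(PROTOCOL_VERSION);
            // Total counts every byte that follows this int.
            dout.writeInt(Long.BYTES + Integer.BYTES + addr.length + extra.length);
            dout.writeLong(sid);
            dout.writeInt(addr.length);
            dout.write(addr);
            dout.write(extra); // fields a newer revision might append
            dout.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Parses the fields this reader knows and skips whatever a newer sender appended. */
    static Header parse(byte[] wire) {
        try {
            DataInputStream din = new DataInputStream(new ByteArrayInputStream(wire));
            long version = din.readLong();
            if (version != PROTOCOL_VERSION) {
                throw new IllegalArgumentException("Unknown protocol version: " + version);
            }
            int total = din.readInt();
            if (total < Long.BYTES + Integer.BYTES || total > MAX_BODY) {
                throw new IllegalArgumentException("Unreasonable message length: " + total);
            }
            byte[] body = new byte[total];
            din.readFully(body); // consume the whole message, known and unknown parts
            DataInputStream fields = new DataInputStream(new ByteArrayInputStream(body));
            long sid = fields.readLong();
            int addrLen = fields.readInt();
            if (addrLen <= 0 || addrLen > total - Long.BYTES - Integer.BYTES) {
                throw new IllegalArgumentException("Unreasonable address length: " + addrLen);
            }
            byte[] addr = new byte[addrLen];
            fields.readFully(addr);
            // Any bytes left in body belong to fields this reader does not know; ignore them.
            return new Header(sid, new String(addr, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Header h = parse(write(7L, "host1:3888", new byte[] {1, 2, 3}));
        System.out.println(h.sid + " " + h.hostPort);
    }
}
```

Because the reader consumes exactly the announced total, the stream stays aligned for whatever follows even when the sender has appended fields the reader has never heard of.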

I am guessing this was the original intent w.r.t. the introduction of the protocol 
version here: ZOOKEEPER-1633

Since 3.4.x code does not parse this and 3.5.x is still in alpha mode, perhaps 
it is possible to consider this change now?

Also, I would like to propose carefully considering the option of using 
protobufs for the next protocol version bump. This will prevent issues like 
this in the future.







[GitHub] zookeeper pull request: fix typo of java docs in OutputArchive.jav...

2016-01-11 Thread ThomasLau
GitHub user ThomasLau opened a pull request:

https://github.com/apache/zookeeper/pull/51

fix typo of java docs in OutputArchive.java



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ThomasLau/zookeeper trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/51.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #51


commit c434332661aaff7cfb2d959457b18ba7ee4ddb2a
Author: Thomas 
Date:   2016-01-11T09:59:48Z

fix typo of java docs in OutputArchive.java




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---