[
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083655#comment-15083655
]
Powell Molleti commented on ZOOKEEPER-2186:
-------------------------------------------
Markus, I have come across the same issue and decided to implement this by
sending the same notification. I am working on this as part of ZOOKEEPER-901;
see ZOOKEEPER-1045 for some of the related discussion.
Let me know what you think about this idea. I think it has the potential to
solve the user-level keep-alive implementation without the need to send new
bits in the header and/or to introduce a new message for keep-alive.
However, this breaks the current FLE because of this code: http://bit.ly/1PdWY1D
{code:title=FastLeaderElection.java|borderStyle=solid}
// Verify if there is any change in the proposed leader
while((n = recvqueue.poll(finalizeWait,
        TimeUnit.MILLISECONDS)) != null){
    if(totalOrderPredicate(n.leader, n.zxid, n.peerEpoch,
            proposedLeader, proposedZxid, proposedEpoch)){
        recvqueue.put(n);
        break;
    }
}
{code}
I think this while loop is in error. As written, every poll restarts the
finalizeWait timer, so notifications that are resent more often than
finalizeWait can keep the loop from exiting. If I am not mistaken, it should
use a global clock to limit how long it polls, rather than hoping that no one
sends any messages within the finalizeWait time window. I am hoping to
negotiate a change here if the submitted patch is found to be reasonable.
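To make the "global clock" idea concrete, here is a minimal sketch, not the submitted patch. The class name BoundedFinalizeWait, the method drainUntilDeadline, and the isBetter predicate are hypothetical stand-ins (isBetter plays the role of totalOrderPredicate against the current proposal). It computes one deadline up front and only ever polls for the time remaining, so repeated notifications cannot extend the wait.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical helper, not part of FastLeaderElection.
final class BoundedFinalizeWait {

    /**
     * Drains the queue for at most finalizeWaitMs in total.
     * Returns true if a notification that beats the current proposal arrived
     * (it is put back on the queue for the caller), false if the overall
     * deadline passed or the thread was interrupted.
     */
    static <T> boolean drainUntilDeadline(BlockingQueue<T> recvqueue,
                                          long finalizeWaitMs,
                                          Predicate<T> isBetter) {
        // One deadline for the whole loop, instead of a fresh timeout per poll.
        final long deadline =
                System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(finalizeWaitMs);
        try {
            long remaining;
            while ((remaining = deadline - System.nanoTime()) > 0) {
                T n = recvqueue.poll(remaining, TimeUnit.NANOSECONDS);
                if (n == null) {
                    break; // nothing arrived before the deadline
                }
                if (isBetter.test(n)) {
                    recvqueue.put(n); // hand it back, as the original loop does
                    return true;
                }
                // Otherwise drop it (e.g. a repeated keep-alive notification)
                // and keep waiting only for the time that is left.
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        return false;
    }
}
```

With this shape, a peer that resends the same notification every few milliseconds only consumes part of the remaining budget; the loop still ends once finalizeWaitMs has elapsed overall.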
> QuorumCnxManager#receiveConnection may crash with random input
> --------------------------------------------------------------
>
> Key: ZOOKEEPER-2186
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.6, 3.5.0
> Reporter: Raul Gutierrez Segales
> Assignee: Raul Gutierrez Segales
> Fix For: 3.4.7, 3.5.1, 3.6.0
>
> Attachments: ZOOKEEPER-2186-v3.4.patch, ZOOKEEPER-2186.patch,
> ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going
> but future elections might fail to converge (ditto for leaving/joining
> members).
> Patch coming up in a bit.
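For reference, one way to guard the allocation quoted above is to validate the length before allocating and to read with readFully. This is a sketch under assumptions and not the attached patch: the class BoundedMessageRead, the method readRemainder, and the MAX_REMAINING_BYTES cap are hypothetical names and values chosen for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper, not part of QuorumCnxManager.
final class BoundedMessageRead {

    // Assumed upper bound for illustration only; not a value from the source.
    static final int MAX_REMAINING_BYTES = 2048;

    /**
     * Reads a length-prefixed remainder, rejecting negative or oversized
     * lengths before any buffer is allocated.
     */
    static byte[] readRemainder(DataInputStream din) throws IOException {
        int numRemainingBytes = din.readInt();
        if (numRemainingBytes < 0 || numRemainingBytes > MAX_REMAINING_BYTES) {
            throw new IOException("Unreasonable buffer length: " + numRemainingBytes);
        }
        byte[] b = new byte[numRemainingBytes];
        // readFully throws EOFException on a short stream instead of
        // silently returning a partially filled buffer.
        din.readFully(b);
        return b;
    }
}
```

A caller would catch the IOException, close the socket, and return, so that random input drops one connection rather than taking down the thread.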