[ https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083390#comment-15083390 ]
Markus Aalto commented on ZOOKEEPER-2186: ----------------------------------------- As I see it any member having different protocol version implementation than the existing protocol version (= -65536L) would fail to communicate with the member running this patch (older versions still allowed as the patch provides backward compatibility). This is because of the protocol version check that has been added to the parse() method of the InitialMessage class. Additionally, I think the message format for the Initial message should allow older version to skip data that the newer versions would add to the message. This would require writing the full message length after the sid field in the message. Now the parsing code assumes that the remaining data is fully used for host address and port. This is how it is now, but that assumption will make it difficult to add any new fields to the message without breaking the upgrade path. I hope this explanation clarified my concerns. > QuorumCnxManager#receiveConnection may crash with random input > -------------------------------------------------------------- > > Key: ZOOKEEPER-2186 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186 > Project: ZooKeeper > Issue Type: Bug > Components: server > Affects Versions: 3.4.6, 3.5.0 > Reporter: Raul Gutierrez Segales > Assignee: Raul Gutierrez Segales > Fix For: 3.4.7, 3.5.1, 3.6.0 > > Attachments: ZOOKEEPER-2186-v3.4.patch, ZOOKEEPER-2186.patch, > ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch > > > This will allocate an arbitrarily large byte buffer (and try to read it!): > {code} > public boolean receiveConnection(Socket sock) { > Long sid = null; > ... > sid = din.readLong(); > // next comes the #bytes in the remainder of the message > > int num_remaining_bytes = din.readInt(); > byte[] b = new byte[num_remaining_bytes]; > // remove the remainder of the message from din > > int num_read = din.read(b); > {code} > This will crash the QuorumCnxManager thread, so the cluster will keep going > but future elections might fail to converge (ditto for leaving/joining > members). > Patch coming up in a bit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)