lvfangmin opened a new pull request #843: ZOOKEEPER-3296: Explicitly closing 
the sslsocket when it failed handshake to prevent issue where peers cannot join 
quorum
URL: https://github.com/apache/zookeeper/pull/843
 
 
   The quorum connection manager handles connections sequentially, with a 
default listen backlog queue size of 50. During a network loss, socket reads 
time out after syncLimit * tickTime, and almost all of the subsequent connect 
requests in the backlog queue are timed out by the other side before they are 
processed. 
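   
   To put rough numbers on the timing described above, here is a small 
sketch. The backlog of 50 comes from the description; syncLimit = 5 and 
tickTime = 2000 ms are assumed example values, and the class and method names 
are hypothetical, not ZooKeeper code.

```java
// Hypothetical helper: estimates how long a sequential acceptor needs to
// drain a backlog of stale connections that each run into the read timeout.
final class BacklogDrainEstimate {

    /** Read timeout spent on one stale connection: syncLimit * tickTime. */
    static long perConnectionTimeoutMillis(int syncLimit, int tickTimeMillis) {
        return (long) syncLimit * tickTimeMillis;
    }

    /** Worst case: every queued connection has to time out, one after another. */
    static long worstCaseDrainMillis(int backlog, int syncLimit, int tickTimeMillis) {
        return backlog * perConnectionTimeoutMillis(syncLimit, tickTimeMillis);
    }

    public static void main(String[] args) {
        int backlog = 50;    // default listen backlog size from the description
        int syncLimit = 5;   // assumed example value
        int tickTime = 2000; // assumed example value, in milliseconds

        System.out.println("per-connection timeout (ms): "
                + perConnectionTimeoutMillis(syncLimit, tickTime));
        System.out.println("worst-case drain time (ms): "
                + worstCaseDrainMillis(backlog, syncLimit, tickTime));
    }
}
```

   Under these assumptions a request that is not near the head of the queue 
waits far longer than a single syncLimit * tickTime interval, so a peer that 
gives up after that interval has already left by the time it is served.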
   
   Those timed-out learners try to connect to a different server and leave 
their connect requests on the server side without sending a close_notify 
packet. The server drains the queue slowly, waiting out the syncLimit * 
tickTime timeout for each request that has not sent a close_notify packet. 
New connect requests are queued as soon as a spot opens in the listen backlog 
queue, but they time out before the server handles them. As a result, the 
peer can never finish a new connection successfully and fails to join the 
quorum.
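   
   The idea in the PR title, closing the SSL socket explicitly when its 
handshake fails, can be sketched as follows. This is a minimal illustration 
built only on the standard javax.net.ssl API, not the actual change in this 
pull request; HandshakeGuard and handshakeOrClose are hypothetical names, and 
the 2000 ms timeout is an arbitrary example value.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper, not ZooKeeper code.
final class HandshakeGuard {

    /**
     * Runs the TLS handshake with a read timeout. If the handshake fails or
     * times out, the socket is closed explicitly so the remote side is not
     * left holding a half-open connection until its own read timeout fires.
     */
    static boolean handshakeOrClose(SSLSocket socket, int timeoutMillis) {
        try {
            socket.setSoTimeout(timeoutMillis);
            socket.startHandshake();
            return true;
        } catch (IOException handshakeFailure) {
            try {
                socket.close();
            } catch (IOException ignored) {
                // The socket is already unusable; nothing more to do.
            }
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo: a plain (non-TLS) listener that drops the connection at once,
        // so the client-side handshake is expected to fail.
        try (ServerSocket plainServer =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            SSLSocket socket = (SSLSocket) SSLSocketFactory.getDefault()
                    .createSocket(plainServer.getInetAddress(), plainServer.getLocalPort());
            plainServer.accept().close();

            boolean ok = handshakeOrClose(socket, 2000);
            System.out.println("handshake ok: " + ok
                    + ", socket closed: " + socket.isClosed());
        }
    }
}
```

   Closing in the failure path means the other end sees the connection go 
away promptly instead of waiting out the full syncLimit * tickTime read 
timeout for a peer that has already moved on.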
   
   Please check the Jira for more details.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
