Pengwei created KAFKA-5014: ------------------------------ Summary: SSL Channel not ready but tcp is established and the server is hung will not sending metadata Key: KAFKA-5014 URL: https://issues.apache.org/jira/browse/KAFKA-5014 Project: Kafka Issue Type: Bug Affects Versions: 0.10.2.0, 0.9.0.1 Reporter: Pengwei Priority: Minor
In our test env, QA hang one of the connecting broker of the producer, then the producer will be stuck in send method, and throw the exception: fail to update metadata after request timeout. I found the reason as follow: when the producer chose one of the broker to send metadata, it connect to the broker, but the broker is hang, the tcp is connected and Network client marks this broker is connected, but the SSL channel is not ready yet so the channel is not ready. Then the Network client chooses the connected node in the leastLoadedNode every time to send the metadata, but the node's channel is not ready yet. So the producer stuck in getting metadata and will not try another node to request metadata. The client should not stuck only one node is hung -- This message was sent by Atlassian JIRA (v6.3.15#6346)