Luke Chen created KAFKA-13563:
---------------------------------

             Summary: Consumer failure after rolling Broker upgrade
                 Key: KAFKA-13563
                 URL: https://issues.apache.org/jira/browse/KAFKA-13563
             Project: Kafka
          Issue Type: Bug
          Components: clients
            Reporter: Luke Chen
            Assignee: Luke Chen

This failure occurred again during this month's rolling OS security updates to the Brokers (no change to Broker version). I have also been able to reproduce it locally with the following process:

1. Start a 3 Broker cluster with a Topic having Replicas=3.
2. Start a Client with Producer and Consumer communicating over the Topic.
3. Stop the Broker that is acting as the Group Coordinator.
4. Observe successful Rediscovery of new Group Coordinator.
5. Restart the stopped Broker.
6. Stop the Broker that became the new Group Coordinator at step 4.
7. Observe "Rediscovery will be attempted" message but no "Discovered group coordinator" message.

In short, Group Coordinator Rediscovery only works for the first Broker failover, not for any subsequent failover.

I conducted tests using 2.7.1 servers. The issue occurs with 2.7.1 and 2.7.2 Clients. The issue does not occur with 2.5.1 and 2.7.0 Clients. This makes me suspect that https://issues.apache.org/jira/browse/KAFKA-10793 introduced this issue.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
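The check in step 7 of the reproduction can be done by eye, but a small log scan may help when repeating the test across several Client versions. The following is only a minimal sketch: the two phrases are the ones quoted in the report, while the function name, the matching rule (any later discovery resolves all earlier rediscovery attempts), and the sample log lines are assumptions made for illustration, not actual Kafka client output.

```python
# Phrases quoted in the report. Everything else here is an illustrative assumption.
REDISCOVERY = "Rediscovery will be attempted"
DISCOVERED = "Discovered group coordinator"


def unresolved_rediscoveries(log_lines):
    """Return 1-based line numbers of rediscovery attempts that are
    never followed by a coordinator discovery message."""
    pending = []  # rediscovery attempts still waiting for a discovery
    for number, line in enumerate(log_lines, start=1):
        if REDISCOVERY in line:
            pending.append(number)
        elif DISCOVERED in line:
            # Assumption: one discovery resolves every earlier attempt.
            pending.clear()
    return pending


if __name__ == "__main__":
    # Hypothetical log excerpt mirroring steps 3-7 of the reproduction.
    sample = [
        "coordinator unavailable. Rediscovery will be attempted.",
        "Discovered group coordinator broker-2",
        "coordinator unavailable. Rediscovery will be attempted.",
    ]
    print(unresolved_rediscoveries(sample))
```

An empty result would mean every rediscovery attempt was eventually followed by a discovery; a non-empty result points at the attempts that never recovered, which is the symptom described for the second failover.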