[ https://issues.apache.org/jira/browse/KAFKA-10429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navinder Brar updated KAFKA-10429:
----------------------------------
    Summary: Group Coordinator unavailability leads to missing events  (was: Group Coordinator is unavailable leads to missing events)

> Group Coordinator unavailability leads to missing events
> --------------------------------------------------------
>
>                 Key: KAFKA-10429
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10429
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.1.1
>            Reporter: Navinder Brar
>            Priority: Major
>
> We regularly see this message in our logs:
> [2020-08-25 03:24:59,214] INFO [Consumer
> clientId=appId-StreamThread-1-consumer, groupId=dashavatara] Group
> coordinator ip:9092 (id: 1452096777 rack: null) is *unavailable* or invalid,
> will attempt rediscovery
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
>
> After some time the coordinator becomes discoverable again:
> [2020-08-25 03:25:02,218] INFO [Consumer
> clientId=appId-c3d1d186-e487-4993-ae3d-5fed75887e6b-StreamThread-1-consumer,
> groupId=appId] Discovered group coordinator ip:9092 (id: 1452096777 rack:
> null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
>
> My question is why this unavailability doesn't trigger a rebalance in the
> cluster. We have only a few hours of retention on the source Kafka topics,
> and sometimes the unavailability lasts for more than a few hours. Since it
> neither triggers a rebalance nor stops processing on the other nodes (which
> are still connected to the group coordinator), we never learn that something
> has gone wrong, and in the meantime we lose events from our source topics.
>
> Some resolutions are suggested on Stack Overflow, but those configs are
> already set in our Kafka setup:
> default.replication.factor=3
> offsets.topic.replication.factor=3
>
> It would be great to understand why this issue happens, why it doesn't
> trigger a rebalance, and whether there is a known solution for it.
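The report notes that a long coordinator outage goes unnoticed until events have already aged out of the source topics. The following is a minimal sketch of how such gaps could be measured from client logs. It does not explain or fix the missing rebalance. It assumes the log lines have the same shape as the two AbstractCoordinator messages quoted above, each on a single line. The names `unavailability_windows`, `LOST`, and `FOUND` are hypothetical helpers, not part of Kafka.

```python
import re
from datetime import datetime

# Assumed log shape: "[YYYY-MM-DD HH:MM:SS,mmm] ... (id: <n> rack: ...) ..."
TS = r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]"
# "\*?" tolerates the Jira bold markers around "unavailable" in the report.
LOST = re.compile(
    TS + r".*Group coordinator .*\(id: (?P<id>\d+) .*is \*?unavailable\*? or invalid"
)
FOUND = re.compile(TS + r".*Discovered group coordinator .*\(id: (?P<id>\d+) ")


def _parse(ts):
    # %f accepts the three-digit millisecond field after the comma.
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f")


def unavailability_windows(lines):
    """Return (coordinator_id, lost_at, found_at, seconds) per closed gap."""
    pending = {}   # coordinator id -> time it was first reported unavailable
    windows = []
    for line in lines:
        m = LOST.search(line)
        if m:
            # Keep the earliest "unavailable" timestamp for an open gap.
            pending.setdefault(m["id"], _parse(m["ts"]))
            continue
        m = FOUND.search(line)
        if m and m["id"] in pending:
            lost_at = pending.pop(m["id"])
            found_at = _parse(m["ts"])
            seconds = (found_at - lost_at).total_seconds()
            windows.append((m["id"], lost_at, found_at, seconds))
    return windows
```

Feeding the function the lines of a client log and alerting when any window approaches the topic retention period would surface the condition before data is lost. Gaps that are still open at the end of the input are not reported by this sketch.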
-- This message was sent by Atlassian Jira (v8.3.4#803005)