[ 
https://issues.apache.org/jira/browse/KAFKA-10429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navinder Brar updated KAFKA-10429:
----------------------------------
    Summary: Group Coordinator unavailability leads to missing events  (was: 
Group Coordinator is unavailable leads to missing events)

> Group Coordinator unavailability leads to missing events
> --------------------------------------------------------
>
>                 Key: KAFKA-10429
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10429
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.1.1
>            Reporter: Navinder Brar
>            Priority: Major
>
> We regularly see the following message in our logs:
> [2020-08-25 03:24:59,214] INFO [Consumer 
> clientId=appId-StreamThread-1-consumer, groupId=dashavatara] Group 
> coordinator ip:9092 (id: 1452096777 rack: null) is *unavailable* or invalid, 
> will attempt rediscovery 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
>  
> After some time, the coordinator becomes discoverable again:
> [2020-08-25 03:25:02,218] INFO [Consumer 
> clientId=appId-c3d1d186-e487-4993-ae3d-5fed75887e6b-StreamThread-1-consumer, 
> groupId=appId] Discovered group coordinator ip:9092 (id: 1452096777 rack: 
> null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
>  
> My question is why this unavailability does not trigger a rebalance in the 
> cluster. Our source Kafka topics have only a few hours of retention, and 
> the unavailability sometimes lasts for more than a few hours. Because it 
> neither triggers a rebalance nor stops processing on the other nodes (which 
> remain connected to the group coordinator), we do not notice that anything 
> has gone wrong, and in the meantime we lose events from our source topics. 
>  
> Some resolutions are suggested on Stack Overflow, but those configs are 
> already set in our Kafka cluster:
> default.replication.factor=3
> offsets.topic.replication.factor=3
>  
> It would be great to understand why this issue occurs, why it does not 
> trigger a rebalance, and whether there is a known solution for it.
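
Until the cause is understood, one way to notice a long outage before the
retention window expires might be to scan the consumer logs for the two
messages quoted above and measure the time between them. The sketch below is
only an illustration, not part of Kafka: the class and method names are
hypothetical, and it assumes each log entry sits on a single line and starts
with a timestamp in the same format as the excerpts in the description.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: measures how long the group coordinator stayed
// unavailable according to the consumer log lines.
class CoordinatorGapScanner {
    // Assumed timestamp layout, e.g. [2020-08-25 03:24:59,214]
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");
    private static final Pattern PREFIX =
            Pattern.compile("^\\[(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\]");

    // Returns the longest interval between a "will attempt rediscovery"
    // line and the next "Discovered group coordinator" line.
    static Duration longestGap(List<String> lines) {
        Duration longest = Duration.ZERO;
        LocalDateTime lostAt = null; // start of the current outage, if any
        for (String line : lines) {
            Matcher m = PREFIX.matcher(line);
            if (!m.find()) {
                continue; // not a timestamped log entry
            }
            LocalDateTime ts = LocalDateTime.parse(m.group(1), TS);
            if (line.contains("will attempt rediscovery")) {
                if (lostAt == null) {
                    lostAt = ts; // keep the first message of an outage
                }
            } else if (line.contains("Discovered group coordinator") && lostAt != null) {
                Duration gap = Duration.between(lostAt, ts);
                if (gap.compareTo(longest) > 0) {
                    longest = gap;
                }
                lostAt = null; // outage is over
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "[2020-08-25 03:24:59,214] INFO Group coordinator ip:9092 is unavailable or invalid, will attempt rediscovery",
                "[2020-08-25 03:25:02,218] INFO Discovered group coordinator ip:9092");
        System.out.println("Longest coordinator gap: " + longestGap(sample));
    }
}
```

Comparing the returned duration against the topic retention (a few hours in
the report above) would give a rough alert condition. An outage that has not
ended yet is not counted by this sketch.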



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
