[ https://issues.apache.org/jira/browse/KAFKA-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034222#comment-15034222 ]
Ben Stopford commented on KAFKA-2891:
-------------------------------------

[~rsivaram] I found an error in my analysis of KAFKA-2909, which means that JIRA refers to actual data loss. KAFKA-2908 remains a client-side issue. This lends more evidence to your theory that nodes are being killed before data is replicated. I'll be interested to see if this change is stable on EC2.

> Gaps in messages delivered by new consumer after Kafka restart
> --------------------------------------------------------------
>
>                 Key: KAFKA-2891
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2891
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.9.0.0
>            Reporter: Rajini Sivaram
>            Priority: Critical
>
> Replication tests, when run with the new consumer with SSL/SASL, were failing
> very often because messages were not being consumed from some topics after a
> Kafka restart. The fix in KAFKA-2877 has made this a lot better. But I am
> still seeing some failures (less often now) because a small set of messages
> is not received after Kafka restart. This failure looks slightly different
> from the one before the fix for KAFKA-2877 was applied, hence the new defect.
> The test fails because not all acked messages are received by the consumer,
> and the number of missing messages is quite small.
> [~benstopford] Are the upgrade tests working reliably with KAFKA-2877 now?
> Not sure if any of these log entries are important:
> {quote}
> [2015-11-25 14:41:12,342] INFO SyncGroup for group test-consumer-group failed
> due to NOT_COORDINATOR_FOR_GROUP, will find new coordinator and rejoin
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2015-11-25 14:41:12,342] INFO Marking the coordinator 2147483644 dead.
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2015-11-25 14:41:12,958] INFO Attempt to join group test-consumer-group
> failed due to unknown member id, resetting and retrying.
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2015-11-25 14:41:42,437] INFO Fetch offset null is out of range, resetting
> offset (org.apache.kafka.clients.consumer.internals.Fetcher)
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)