[ https://issues.apache.org/jira/browse/KAFKA-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198710#comment-15198710 ]
Jiangjie Qin commented on KAFKA-3409:
-------------------------------------

That makes sense. So it seems that if users see CommitFailedException, there are two possibilities:
1) The consumer hasn't been polling for quite a while, so it has been kicked out of the group.
2) The user happened to commit offsets while the consumer was going through a rebalance and had already passed the PreparingRebalance stage.

In (1) the user should expect the exception, because the problem is that the user did not call poll() frequently enough.

I am not sure whether (2) is a possible scenario in a single-threaded model. It might still be possible: after a user calls poll(), the leftover state could be something like AwaitingSync, and if the user calls commitSync() at that point, it will fail. So does this mean that even a consumer that is still in the group might see CommitFailedException whenever it tries to commit offsets during a rebalance? Do you think we should document some suggestions for how users should handle this exception correctly?

If both cases are possible, I am wondering whether we should throw RebalanceInProgressException to the user instead of CommitFailedException in case (2). Otherwise users might think they are not polling frequently enough, which might not be the case.

> Mirror maker hangs indefinitely due to commit
> ----------------------------------------------
>
>                 Key: KAFKA-3409
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3409
>             Project: Kafka
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 0.9.0.1
>         Environment: Kafka 0.9.0.1
>            Reporter: TAO XIAO
>
> Mirror maker hangs indefinitely upon receiving CommitFailedException. I
> believe this is because CommitFailedException is not caught by mirror maker,
> so mirror maker has no way to recover from it.
> A better approach would be to catch the exception and rejoin the group.
> Here is the stack trace
> [2016-03-15 09:34:36,463] ERROR Error UNKNOWN_MEMBER_ID occurred while committing offsets for group xxxxx (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> [2016-03-15 09:34:36,463] FATAL [mirrormaker-thread-3] Mirror maker thread failure due to (kafka.tools.MirrorMaker$MirrorMakerThread)
> org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed due to group rebalance
>         at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:552)
>         at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:493)
>         at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:665)
>         at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:644)
>         at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
>         at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>         at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>         at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:380)
>         at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:274)
>         at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:320)
>         at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:213)
>         at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:193)
>         at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
>         at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:358)
>         at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:968)
>         at kafka.tools.MirrorMaker$MirrorMakerNewConsumer.commit(MirrorMaker.scala:548)
>         at kafka.tools.MirrorMaker$.commitOffsets(MirrorMaker.scala:340)
>         at kafka.tools.MirrorMaker$MirrorMakerThread.maybeFlushAndCommitOffsets(MirrorMaker.scala:438)
>         at kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:399)
> [2016-03-15 09:34:36,463] INFO [mirrormaker-thread-3] Flushing producer. (kafka.tools.MirrorMaker$MirrorMakerThread)
> [2016-03-15 09:34:36,464] INFO [mirrormaker-thread-3] Committing consumer offsets. (kafka.tools.MirrorMaker$MirrorMakerThread)
> [2016-03-15 09:34:36,477] ERROR Error UNKNOWN_MEMBER_ID occurred while committing offsets for group xxxxx (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)