[ 
https://issues.apache.org/jira/browse/KAFKA-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198710#comment-15198710
 ] 

Jiangjie Qin commented on KAFKA-3409:
-------------------------------------

That makes sense. 

So it seems that if users see CommitFailedException, there are two 
possibilities: 
1) The consumer has not polled for quite a while, so it has been kicked out 
of the group.
2) The user happened to commit offsets while the consumer was going through a 
rebalance that had already passed the PreparingRebalance stage.

In (1) the user should expect the exception, because the problem is that the 
user did not call poll() frequently enough.
I am not sure whether (2) is a possible scenario in a single-threaded model, 
but it might be. After a user calls poll(), the leftover state could be 
something like AwaitingSync, and if the user calls commitSync() at that point, 
it will fail. So does that mean that even if a consumer is still in the group, 
it might still see CommitFailedException as long as it tries to commit offsets 
during a rebalance?
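
To make the two cases concrete, here is a minimal sketch of how a coordinator 
might answer an offset commit depending on membership and group state. It is 
an illustration only, not the broker code: the GroupState and CommitResult 
enums and the handleCommit method are hypothetical names, and the rule that a 
current member is rejected only in AwaitingSync is an assumption taken from 
the description above.

```java
import java.util.Collections;
import java.util.Set;

public class CommitDuringRebalance {
    // Hypothetical group states, named after the stages mentioned in this comment.
    enum GroupState { STABLE, PREPARING_REBALANCE, AWAITING_SYNC }

    // Hypothetical outcomes of an offset commit request.
    enum CommitResult { OK, UNKNOWN_MEMBER, REBALANCE_IN_PROGRESS }

    // Assumed decision rule: non-members are always rejected (case 1);
    // members are rejected only once the group has reached AwaitingSync (case 2).
    static CommitResult handleCommit(GroupState state, Set<String> members, String memberId) {
        if (!members.contains(memberId)) {
            return CommitResult.UNKNOWN_MEMBER;
        }
        if (state == GroupState.AWAITING_SYNC) {
            return CommitResult.REBALANCE_IN_PROGRESS;
        }
        return CommitResult.OK;
    }

    public static void main(String[] args) {
        Set<String> members = Collections.singleton("consumer-1");
        // Case (1): the consumer was removed from the group.
        System.out.println(handleCommit(GroupState.STABLE, members, "consumer-2"));
        // Case (2): the consumer is still a member, but the rebalance is past PreparingRebalance.
        System.out.println(handleCommit(GroupState.AWAITING_SYNC, members, "consumer-1"));
        // A member committing while the group is stable.
        System.out.println(handleCommit(GroupState.STABLE, members, "consumer-1"));
    }
}
```

In this model both rejections are distinct on the coordinator side; the 
question above is whether the client should keep them distinct as well.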

Do you think we should document some suggestions for users on how to handle 
this exception correctly? If both cases are possible, I am wondering whether 
we should throw a RebalanceInProgressException instead of 
CommitFailedException to the user in case (2). Otherwise users might think 
they are not polling frequently enough, which might not be the case.
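
As a rough sketch of what such guidance could look like if the two cases 
surfaced as different exceptions: the code below is not the Kafka client API. 
The two exception classes are local stand-ins so the example compiles without 
the client jar, and Committer, Action, and commitAndClassify are hypothetical 
names introduced only for illustration.

```java
public class CommitErrorHandling {
    // Local stand-ins for the exceptions discussed above (not the Kafka classes).
    static class CommitFailedException extends RuntimeException {}
    static class RebalanceInProgressException extends RuntimeException {}

    // Hypothetical wrapper around the commit call, e.g. around commitSync().
    interface Committer {
        void commit();
    }

    // Hypothetical follow-up actions an application might take.
    enum Action { CONTINUE, RETRY_AFTER_NEXT_POLL, REJOIN_AND_REVIEW_POLL_FREQUENCY }

    static Action commitAndClassify(Committer committer) {
        try {
            committer.commit();
            return Action.CONTINUE;
        } catch (RebalanceInProgressException e) {
            // Case (2): still a group member; let the rebalance finish, then retry.
            return Action.RETRY_AFTER_NEXT_POLL;
        } catch (CommitFailedException e) {
            // Case (1): kicked out of the group; rejoin and check how often poll() is called.
            return Action.REJOIN_AND_REVIEW_POLL_FREQUENCY;
        }
    }

    public static void main(String[] args) {
        System.out.println(commitAndClassify(() -> { }));
        System.out.println(commitAndClassify(() -> { throw new RebalanceInProgressException(); }));
        System.out.println(commitAndClassify(() -> { throw new CommitFailedException(); }));
    }
}
```

With a single exception type for both cases, the two catch branches collapse 
into one and the application cannot tell which remedy applies.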


> Mirror maker hangs indefinitely due to commit 
> ----------------------------------------------
>
>                 Key: KAFKA-3409
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3409
>             Project: Kafka
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 0.9.0.1
>         Environment: Kafka 0.9.0.1
>            Reporter: TAO XIAO
>
> Mirror maker hangs indefinitely upon receiving CommitFailedException. I 
> believe this is due to CommitFailedException not caught by mirror maker and 
> mirror maker has no way to recover from it.
> A better approach will be catching the exception and rejoin the group. Here 
> is the stack trace
> [2016-03-15 09:34:36,463] ERROR Error UNKNOWN_MEMBER_ID occurred while 
> committing offsets for group xxxxx 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> [2016-03-15 09:34:36,463] FATAL [mirrormaker-thread-3] Mirror maker thread 
> failure due to  (kafka.tools.MirrorMaker$MirrorMakerThread)
> org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be 
> completed due to group rebalance
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:552)
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:493)
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:665)
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:644)
>         at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
>         at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>         at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:380)
>         at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:274)
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:320)
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:213)
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:193)
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:358)
>         at 
> org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:968)
>         at 
> kafka.tools.MirrorMaker$MirrorMakerNewConsumer.commit(MirrorMaker.scala:548)
>         at kafka.tools.MirrorMaker$.commitOffsets(MirrorMaker.scala:340)
>         at 
> kafka.tools.MirrorMaker$MirrorMakerThread.maybeFlushAndCommitOffsets(MirrorMaker.scala:438)
>         at 
> kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:399)
> [2016-03-15 09:34:36,463] INFO [mirrormaker-thread-3] Flushing producer. 
> (kafka.tools.MirrorMaker$MirrorMakerThread)
> [2016-03-15 09:34:36,464] INFO [mirrormaker-thread-3] Committing consumer 
> offsets. (kafka.tools.MirrorMaker$MirrorMakerThread)
> [2016-03-15 09:34:36,477] ERROR Error UNKNOWN_MEMBER_ID occurred while 
> committing offsets for group xxxxx 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
