[ https://issues.apache.org/jira/browse/KAFKA-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844103#comment-16844103 ]
Guozhang Wang commented on KAFKA-4600:
--------------------------------------

[~braedon] Thanks for your feedback.

The reason I chose to retain the partition after an unsuccessful revocation is that, in the rebalance protocol, a partition is NOT reassigned until it is clear that no one currently owns it, i.e. that it is reassignable. In the above case, for example, if `consumer.poll` is called again, the consumer will send the join-group request again, claiming that it still owns partition 1, and hence the partition will not be reassigned elsewhere. If a new generation has already been formed before the consumer retries `poll`, then its join-group request or its commit-offset request will be rejected with a fatal error, and the consumer has to clear all of its owned partitions as "lost" and rejoin as a new member.

In either case, users do not need to worry about having consumed some messages unsafely due to a partial revocation: either they still own the partition and no one else will get messages from it, or they have already lost it, and even though they may get some messages before trying to heartbeat or commit offsets, their commits are doomed to fail, similar to the current situation.

Does that make sense to you?

> Consumer proceeds on when ConsumerRebalanceListener fails
> ---------------------------------------------------------
>
>                 Key: KAFKA-4600
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4600
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.10.1.1
>            Reporter: Braedon Vickers
>            Priority: Major
>
> One of the use cases for a ConsumerRebalanceListener is to load state
> necessary for processing a partition when it is assigned. However, when
> ConsumerRebalanceListener.onPartitionsAssigned() fails for some reason (i.e.
> the state isn't loaded), the error is logged and the consumer proceeds on as
> if nothing happened, happily consuming messages from the new partition.
> When the state is relied upon for correct processing, this can be very bad,
> e.g. data loss can occur.
> It would be better if the error was propagated up so it could be dealt with
> normally. At the very least the assignment should fail so the consumer
> doesn't see any messages from the new partitions, and the rebalance can be
> reattempted.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)