[ 
https://issues.apache.org/jira/browse/KAFKA-8421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857147#comment-16857147
 ] 

Richard Yu commented on KAFKA-8421:
-----------------------------------

I think that we could extend this issue to account for a poll() operation which 
did not begin during a rebalance, but extends into it. Basically, we want to 
break the timeout up so that we can "join" the group, so that technically the 
consumer has received its new assignment, but will not receive it (in the form 
of the SyncGroup and JoinGroup response) until its fetch request is complete. 

Edit to previous comment: We could also halt any ongoing fetch requests should 
rebalance timeout hit its limit and return the data we already have. When every 
fetch request has been completed, we would send all JoinGroup responses 
afterwards (in otherwords, while a fetch request for previously owned 
partitions is still live, we will wait on sending any JoinGroup requests to 
avoid two consumers requesting for records from the same partition). 

> Allow consumer.poll() to return data in the middle of rebalance
> ---------------------------------------------------------------
>
>                 Key: KAFKA-8421
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8421
>             Project: Kafka
>          Issue Type: Improvement
>          Components: consumer
>            Reporter: Guozhang Wang
>            Priority: Major
>
> With KIP-429 in place, today when a consumer is about to send join-group 
> request its owned partitions may not be empty, meaning that some of its 
> fetched data can still be returned. Nevertheless, today the logic is strict:
> {code}
>                     if (!updateAssignmentMetadataIfNeeded(timer)) {
>                         return ConsumerRecords.empty();
>                     }
> {code}
> I.e. if the consumer enters a rebalance it always returns no data. 
> As an optimization, we can consider letting consumers to still return 
> messages that still belong to its owned partitions even when it is within a 
> rebalance, because we know it is safe that no one else would claim those 
> partitions in this rebalance yet, and we can still commit offsets if, after 
> this rebalance, the partitions need to be revoked then.
> One thing we need to take care though is the rebalance timeout, i.e. when 
> consumer's processing those records they may not call the next poll() in time 
> (think: Kafka Streams num.iterations mechanism), which may leads to consumer 
> dropping out of the group during rebalance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to