[ 
https://issues.apache.org/jira/browse/KAFKA-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867934#comment-17867934
 ] 

Chia-Ping Tsai commented on KAFKA-17116:
----------------------------------------

{quote}
Currently, we use the ConsumerHeartbeat API to manage the join and leave 
processes of AsyncKafkaConsumer. I propose that we let the consumer generate a 
unique temporary ID to be used for identification by the broker before member 
ID allocation.
{quote}

Yes, we are on the same page. Let's extend this idea:

1. This solution applies to `ConsumerGroupHeartbeatRequest` only, to avoid 
large changes.
2. The unique temporary ID is generated by the (async) consumer.
3. The unique temporary ID is a "tagged" field, so we don't need to bump the 
version.
4. The unique temporary ID is stored in `ModernGroupMember` as an optional 
field.
5. The coordinator uses the unique temporary ID to search for a member only if 
"member id = unknown" and "group leave" and "temporary ID is defined". That 
ensures we don't run the search regularly, and we don't need to store a map for 
<temporaryId> -> <memberId>. Instead, we iterate over all members of the group 
to find the match. That is OK because this scenario is rare (due to the above 
conditions).
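
To make point 5 concrete, here is a minimal sketch of the coordinator-side lookup. It is not the actual coordinator code: `Member`, `TemporaryIdLookup` and `findLeavingMember` are hypothetical names, `Member` only stands in for `ModernGroupMember`, and the sketch assumes that an unknown member ID is sent as an empty string and that "group leave" means epoch -1.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for ModernGroupMember; only the fields needed here.
final class Member {
    final String memberId;
    final Optional<String> temporaryId; // optional: clients may never set it

    Member(String memberId, Optional<String> temporaryId) {
        this.memberId = memberId;
        this.temporaryId = temporaryId;
    }
}

final class TemporaryIdLookup {
    // Assumptions for this sketch: epoch -1 signals "leave",
    // and an empty string means "member id unknown".
    static final int LEAVE_GROUP_EPOCH = -1;

    // Scans the group only when all three conditions from point 5 hold,
    // so no <temporaryId> -> <memberId> map has to be maintained.
    static Optional<Member> findLeavingMember(List<Member> members,
                                              String requestMemberId,
                                              int requestEpoch,
                                              Optional<String> requestTemporaryId) {
        boolean memberIdUnknown = requestMemberId == null || requestMemberId.isEmpty();
        boolean leaving = requestEpoch == LEAVE_GROUP_EPOCH;
        if (!memberIdUnknown || !leaving || !requestTemporaryId.isPresent()) {
            return Optional.empty();
        }
        // Linear scan; acceptable because this path is rare.
        return members.stream()
                .filter(m -> m.temporaryId.equals(requestTemporaryId))
                .findFirst();
    }
}
```

A regular heartbeat (known member ID, or a non-leave epoch) returns early and never pays for the scan.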

With this new unique ID, the consumer can send a HB to leave the group even if 
the response carrying the member ID has not been returned yet. It can send the 
HB with the temporary ID to acknowledge the "leave".
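
On the client side, the idea could look roughly like the sketch below. It is not the `AsyncKafkaConsumer` implementation: `ClosingConsumerSketch`, `LeaveHeartbeat`, `onJoinResponse` and `buildLeaveHeartbeat` are hypothetical names, and generating the temporary ID with `UUID.randomUUID()` is only one possible choice.

```java
import java.util.UUID;

// Hypothetical view of the fields a leave heartbeat would carry.
final class LeaveHeartbeat {
    final String memberId;    // empty while the join response is still in flight
    final int memberEpoch;    // -1 signals "leave" (as in the issue description)
    final String temporaryId; // always set by the client

    LeaveHeartbeat(String memberId, int memberEpoch, String temporaryId) {
        this.memberId = memberId;
        this.memberEpoch = memberEpoch;
        this.temporaryId = temporaryId;
    }
}

final class ClosingConsumerSketch {
    // Generated once by the client, before any member ID has been allocated.
    private final String temporaryId = UUID.randomUUID().toString();
    private String memberId = ""; // unknown until the first HB response arrives

    // Called when the response to the initial join HB is received.
    void onJoinResponse(String assignedMemberId) {
        this.memberId = assignedMemberId;
    }

    // Can be called at any time, including before the join response arrives.
    LeaveHeartbeat buildLeaveHeartbeat() {
        return new LeaveHeartbeat(memberId, -1, temporaryId);
    }
}
```

If `close()` runs before the join response arrives, the leave HB still carries the temporary ID, which gives the coordinator something to match against.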

> New consumer may not send effective leave group if member ID received after 
> close 
> ----------------------------------------------------------------------------------
>
>                 Key: KAFKA-17116
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17116
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>    Affects Versions: 3.8.0
>            Reporter: Lianet Magrans
>            Assignee: TengYao Chi
>            Priority: Major
>              Labels: kip-848-client-support
>             Fix For: 3.9.0
>
>
> If the new consumer is closed after sending a HB to join, but before 
> receiving the response to it, it will send a leave group request without a 
> member ID (which will simply fail with UNKNOWN_MEMBER_ID). As a result, the 
> broker will have a newly registered member for which it will never receive a 
> leave request.
>  # consumer.subscribe -> sends HB to join, transitions to JOINING
>  # consumer.close -> will transition to LEAVING and send HB with epoch -1 
> (without waiting for in-flight requests)
>  # consumer receives response to initial HB, containing the assigned member 
> ID. It will simply ignore it because it's not in the group anymore 
> (UNSUBSCRIBED)
> Note that the expected behavior with the current logic, and its main 
> downsides, are:
>  # If the member received partitions on the first HB, those partitions won't 
> be re-assigned (the broker is waiting for the closed consumer to reconcile 
> them) until the rebalance timeout expires. 
>  # Even if no partitions were assigned to it, the member will remain in the 
> group from the broker's point of view (but not from the client's POV). The 
> member will eventually be kicked out for not sending HBs, but only when its 
> session timeout expires.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
