hudeqi created KAFKA-16543:
------------------------------

             Summary: There may be ambiguous deletions in the 
`cleanupGroupMetadata` when the generation of the group is less than or equal 
to 0
                 Key: KAFKA-16543
                 URL: https://issues.apache.org/jira/browse/KAFKA-16543
             Project: Kafka
          Issue Type: Bug
          Components: group-coordinator
    Affects Versions: 3.6.2
            Reporter: hudeqi
            Assignee: hudeqi


In the `cleanupGroupMetadata` method, tombstone messages are written to delete 
the group's MetadataKey only when the group is in the Dead state and the 
generation is greater than 0. The comment indicates: 'We avoid writing the 
tombstone when the generationId is 0, since this group is only using Kafka for 
offset storage.' This means that the metadata of groups that only use Kafka for 
offset storage should not be deleted. However, there are cases where, for 
example, Flink commits offsets with a generationId equal to -1. If 
ApiKeys.DELETE_GROUPS is called to delete such a group, its group metadata will 
never be deleted. Yet the preceding logic in the method has already cleaned up 
the commitKey entries by writing tombstone messages for removedOffsets. 
Therefore, the actual manifestation is: the group no longer exists (since the 
offsets have been cleaned up, there is no possibility of adding the group back 
to the `groupMetadataCache` unless offsets are committed again with the same 
group name), but the corresponding group metadata still exists in 
__consumer_offsets. As a result, deleting the group does not completely clean 
up its related information.
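
To make the mismatch concrete, here is a minimal sketch of the decision 
described above. It is a simplified model, not the real coordinator code: the 
class `CurrentCleanupModel`, the method `tombstoneKeys`, and the string key 
format are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the cleanup decision described in this issue.
// It does not reproduce the real cleanupGroupMetadata implementation.
class CurrentCleanupModel {
    enum State { STABLE, EMPTY, DEAD }

    // Returns the keys that would receive a tombstone in __consumer_offsets.
    // The "offset:"/"group:" key strings are an assumption made for readability.
    static List<String> tombstoneKeys(String groupId, State state, int generationId,
                                      List<String> removedOffsets) {
        List<String> keys = new ArrayList<>();
        // Offset commit keys are tombstoned for every removed offset.
        for (String partition : removedOffsets) {
            keys.add("offset:" + groupId + ":" + partition);
        }
        // The group metadata key is tombstoned only for a Dead group
        // whose generation is greater than 0.
        if (state == State.DEAD && generationId > 0) {
            keys.add("group:" + groupId);
        }
        return keys;
    }
}
```

With a generationId of -1 and a Dead group, this model emits tombstones for the 
offset keys but none for the group metadata key, which is the leftover record 
this issue describes.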

The group's state is set to Dead only in the following three situations:
1. The group information is unloaded.
2. The group is deleted by ApiKeys.DELETE_GROUPS.
3. All offsets of the group have expired or been removed.

Therefore, since the group is already in the Dead state and has been removed 
from the `groupMetadataCache`, why not directly clean up all of the group's 
information, even if the group is only used for storing offsets?
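
One way to read this suggestion is to drop the generation check from the 
tombstone condition. The sketch below is only an illustration of that idea 
under stated assumptions; `ProposedCleanupModel` and 
`shouldTombstoneGroupMetadata` are hypothetical names, and this is not a 
reviewed or committed fix.

```java
// Hypothetical model of the suggested condition; not actual Kafka code.
class ProposedCleanupModel {
    // The group metadata key is tombstoned whenever the group is Dead and has
    // been removed from the groupMetadataCache. The generationId parameter is
    // kept only to show that it no longer influences the decision.
    static boolean shouldTombstoneGroupMetadata(boolean groupIsDead,
                                                boolean removedFromCache,
                                                int generationId) {
        return groupIsDead && removedFromCache;
    }
}
```

Under this condition, a Dead group with generationId -1 or 0 would have its 
metadata tombstoned along with its offsets.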



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
