Re: [PR] KAFKA-16543:There may be ambiguous deletions in the `cleanupGroupMetadata` when the generation of the group is less than or equal to 0 [kafka]
hudeqi closed pull request #15706: KAFKA-16543:There may be ambiguous deletions in the `cleanupGroupMetadata` when the generation of the group is less than or equal to 0 URL: https://github.com/apache/kafka/pull/15706 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16543:There may be ambiguous deletions in the `cleanupGroupMetadata` when the generation of the group is less than or equal to 0 [kafka]
hudeqi commented on PR #15706:
URL: https://github.com/apache/kafka/pull/15706#issuecomment-2058152299

> Hi @hudeqi. Thanks for the patch. I would like to better understand it. My first question is how would Flink commit offsets with a generationId equal to -1? The generation of the group is only managed by the group. It is not possible to alter it from an external system. The -1 passed in the offset commit request is only used for validation purposes.
>
> The reason why we don't write a tombstone in this case is because the group was never materialized in the log if it stayed at generation 0. I am not sure it is a worthwhile optimization though.

@dajac Thank you for your review.

Answer to the first question: Flink only uses Kafka to commit and store offsets, and its group is not managed by Kafka. By default, the generation value it sends with each commit is always -1. Since the generation changes only when members are managed by Kafka, Flink's generation remains -1 and never changes.

Answer to the second question: Even when Kafka is used only to store and commit offsets, the group is still initialized in the in-memory `groupMetadataCache`. In the log, the group's metadata and offsetMetadata are also written to `__consumer_offsets`. Therefore, for data consistency, all of them should be cleaned up when the group is purged.
Re: [PR] KAFKA-16543:There may be ambiguous deletions in the `cleanupGroupMetadata` when the generation of the group is less than or equal to 0 [kafka]
dajac commented on PR #15706:
URL: https://github.com/apache/kafka/pull/15706#issuecomment-2051869550

Hi @hudeqi. Thanks for the patch. I would like to better understand it. My first question is how would Flink commit offsets with a generationId equal to -1? The generation of the group is only managed by the group. It is not possible to alter it from an external system. The -1 passed in the offset commit request is only used for validation purposes.

The reason why we don't write a tombstone in this case is that the group was never materialized in the log if it stayed at generation 0. I am not sure it is a worthwhile optimization though.
[PR] KAFKA-16543:There may be ambiguous deletions in the `cleanupGroupMetadata` when the generation of the group is less than or equal to 0 [kafka]
hudeqi opened a new pull request, #15706:
URL: https://github.com/apache/kafka/pull/15706

In the `cleanupGroupMetadata` method, a tombstone message is written to delete the group's MetadataKey only when the group is in the Dead state and the generation is greater than 0. The comment indicates: 'We avoid writing the tombstone when the generationId is 0, since this group is only using Kafka for offset storage.' This means that groups that only use Kafka for offset storage should not be deleted.

However, there is a situation where, for example, Flink commits offsets with a generationId equal to -1. If ApiKeys.DELETE_GROUPS is called to delete this group, Flink's group metadata will never be deleted. Yet the logic above has already cleaned up the commitKey by writing tombstone messages for `removedOffsets`. Therefore, the actual manifestation is: the group no longer exists (since the offsets have been cleaned up, the group cannot be added back to the `groupMetadataCache` unless offsets are committed again with the same group name), but the corresponding group metadata information still exists in `__consumer_offsets`. This leads to the problem that deleting the group does not completely clean up its related information.

The group's state is set to Dead only in the following three situations:
1. The group information is unloaded
2. The group is deleted by ApiKeys.DELETE_GROUPS
3. All offsets of the group have expired or been removed

Therefore, since the group is already in the Dead state and has been removed from the `groupMetadataCache`, why not directly clean up all the information of the group, even if it is only used for storing offsets?
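
To make the condition under discussion concrete, here is a minimal sketch of the tombstone decision as the PR description states it. It is not the broker's code: `CleanupSketch`, `Group`, `tombstoneKeys`, the `requirePositiveGeneration` flag, and the `offset:`/`group:` key strings are hypothetical names invented for illustration. It models only the rule "Dead and generation > 0" against the proposed rule "Dead", and takes no position on whether a group metadata record is ever actually written to the log for such a group, which is the point the review questions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the decision described in the PR text.
// None of these types are Kafka classes.
class CleanupSketch {
    enum GroupState { EMPTY, STABLE, DEAD }

    static final class Group {
        final String groupId;
        final GroupState state;
        final int generationId;
        final List<String> removedOffsets; // partitions whose offsets were removed

        Group(String groupId, GroupState state, int generationId, List<String> removedOffsets) {
            this.groupId = groupId;
            this.state = state;
            this.generationId = generationId;
            this.removedOffsets = removedOffsets;
        }
    }

    // Returns the keys that would receive a tombstone.
    // requirePositiveGeneration = true  -> rule described as current behavior
    // requirePositiveGeneration = false -> rule proposed in the PR
    static List<String> tombstoneKeys(Group group, boolean requirePositiveGeneration) {
        List<String> keys = new ArrayList<>();
        // Offset tombstones are written for every removed offset in both variants.
        for (String partition : group.removedOffsets) {
            keys.add("offset:" + group.groupId + ":" + partition);
        }
        boolean dead = group.state == GroupState.DEAD;
        boolean generationOk = !requirePositiveGeneration || group.generationId > 0;
        if (dead && generationOk) {
            keys.add("group:" + group.groupId);
        }
        return keys;
    }

    public static void main(String[] args) {
        // A group used only for offset storage, at the generation value named in the PR.
        Group offsetOnly = new Group("flink-job", GroupState.DEAD, -1, Arrays.asList("topic-0"));
        System.out.println("generation > 0 required: " + tombstoneKeys(offsetOnly, true));
        System.out.println("Dead state only:         " + tombstoneKeys(offsetOnly, false));
    }
}
```

Under these assumptions, the two rules differ only for Dead groups whose generation is 0 or negative; for any group with a positive generation they produce the same set of tombstones.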