Hello Kafka Community,

I would like to start a discussion on KIP-1266, which proposes adding a new compacted remote log metadata topic for tiered storage, to limit the number of messages that must be iterated to build the remote metadata state.

KIP link: KIP-1266 Bounding The Number Of RemoteLogMetadata Messages via Compacted RemoteLogMetadata Topic
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1266%3A+Bounding+The+Number+Of+RemoteLogMetadata+Messages+via+Compacted+Topic>

Background: The current Tiered Storage implementation uses a __remote_log_metadata topic with infinite retention and delete-based cleanup policy, causing unbounded growth, slow broker bootstrap, no mechanism to clean up expired segment metadata, and inefficient re-reading from offset 0 during leadership changes.

Proposal: A dual-topic approach that introduces a new __remote_log_metadata_compacted topic using log compaction with deterministic offset-based keys, while preserving the existing topic for audit history; this allows brokers to build their metadata cache exclusively from the compacted topic, enables cleanup of expired segment metadata through tombstones, and includes a migration strategy to populate the new topic during upgrade, delivering bounded metadata growth and faster broker startup while maintaining backward compatibility.

More details are in the attached KIP link. Looking forward to your thoughts.

Thank you for your time!

Best,
Lijun Tong
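
For readers less familiar with log compaction, here is a rough, simplified sketch of why a compacted topic with per-segment keys and tombstones bounds the number of records a broker has to replay. It is a toy in-memory model, not code from the KIP: the key format produced by `segment_key`, the `compact` and `build_state` helpers, and the topic name "orders" are all hypothetical, and real Kafka compaction additionally keeps tombstones for delete.retention.ms and does not compact the active segment. Only the segment state names are taken from the existing tiered storage lifecycle.

```python
from typing import Optional

# A record is (key, value); a value of None models a tombstone.
Record = tuple[str, Optional[str]]


def segment_key(topic_partition: str, base_offset: int) -> str:
    """Hypothetical deterministic, offset-based key (not the KIP's actual format)."""
    return f"{topic_partition}:{base_offset}"


def compact(log: list[Record]) -> list[Record]:
    """Keep only the latest record per key and drop tombstoned keys entirely."""
    latest: dict[str, Optional[str]] = {}
    for key, value in log:
        latest.pop(key, None)  # re-insert so order reflects the last write
        latest[key] = value
    return [(k, v) for k, v in latest.items() if v is not None]


def build_state(log: list[Record]) -> dict[str, str]:
    """Replay a log from the beginning into a key -> latest-state cache."""
    state: dict[str, str] = {}
    for key, value in log:
        if value is None:
            state.pop(key, None)  # tombstone: forget the expired segment
        else:
            state[key] = value
    return state


tp = "orders-0"
events: list[Record] = [
    (segment_key(tp, 0), "COPY_SEGMENT_STARTED"),
    (segment_key(tp, 0), "COPY_SEGMENT_FINISHED"),
    (segment_key(tp, 100), "COPY_SEGMENT_STARTED"),
    (segment_key(tp, 100), "COPY_SEGMENT_FINISHED"),
    (segment_key(tp, 0), "DELETE_SEGMENT_STARTED"),
    (segment_key(tp, 0), "DELETE_SEGMENT_FINISHED"),
    (segment_key(tp, 0), None),  # tombstone for the expired segment
]

compacted = compact(events)
print(len(events), "records before compaction,", len(compacted), "after")
print(build_state(events) == build_state(compacted))
```

In this toy model the full history and the compacted log replay to the same cache, but the compacted log holds one record per live segment instead of every lifecycle event, which is the bounded-growth property the proposal is after.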
