Lijun,

Thanks for the proposal. I like your idea of using a compacted topic for the
tiered storage metadata topic.

In our setup, we set a shorter retention (3 days) on the tiered storage
metadata topic to control its growth. We can do that because we control the
retention policy of every topic in our clusters and apply a uniform retention
policy to all of our tiered storage topics. I can see that other users and
companies will not be able to enforce such a policy on all of their tiered
storage topics.
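
One way to read that constraint, as a rough Python sketch rather than Kafka
code: it assumes a time-based retention on the metadata topic is only safe when
no tiered topic keeps data longer than the metadata topic keeps records, since
otherwise metadata for still-live remote segments could expire first. The
function name and the sample topics are hypothetical.

```python
DAY_MS = 24 * 60 * 60 * 1000


def unsafe_topics(metadata_retention_ms, topic_retention_ms):
    """Return topics whose data can outlive the metadata topic's retention.

    Assumption: a segment's metadata record is written around upload time, so
    it may expire before the segment does whenever the data topic's retention
    is longer than the metadata topic's retention.
    """
    return sorted(
        name
        for name, retention in topic_retention_ms.items()
        # In Kafka, retention.ms = -1 means no time limit.
        if retention < 0 or retention > metadata_retention_ms
    )


# Hypothetical cluster with a uniform 3-day policy: nothing is at risk.
uniform = {"orders": 3 * DAY_MS, "clicks": 3 * DAY_MS}
# Hypothetical cluster where some topics need longer retention.
mixed = {"orders": 3 * DAY_MS, "audit": 30 * DAY_MS, "archive": -1}

print(unsafe_topics(3 * DAY_MS, uniform))  # []
print(unsafe_topics(3 * DAY_MS, mixed))    # ['archive', 'audit']
```

In the mixed case, a fixed short retention on the metadata topic no longer
works, which is where a compacted topic would help.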

One suggestion: in your example scenarios, it would be good to add a case where
a remote log segment is deleted by the retention policy. That deletion writes a
tombstone event to the metadata topic, and log compaction/deletion of the
corresponding records follows 24 hours later. I think this is the key event
that caps the size of the metadata topic.
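
The scenario I have in mind could be sketched roughly as follows. This is a toy
Python model of a compacted log, not the broker's log cleaner: the key format
"topicA-0:<offset>" is a hypothetical stand-in for the KIP's offset-based keys,
and tombstone expiry is simplified to "older than delete.retention.ms at
compaction time" (the 24 hours above corresponds to Kafka's default for that
setting).

```python
from dataclasses import dataclass
from typing import Optional

HOUR_MS = 60 * 60 * 1000
DELETE_RETENTION_MS = 24 * HOUR_MS  # Kafka's default delete.retention.ms


@dataclass
class Record:
    key: str
    value: Optional[str]  # None models a tombstone
    timestamp_ms: int


def compact(log, now_ms):
    """Keep the latest record per key; drop tombstones past delete retention."""
    latest = {}
    for record in log:  # later records overwrite earlier ones with the same key
        latest[record.key] = record
    return [
        r for r in latest.values()
        if r.value is not None or now_ms - r.timestamp_ms < DELETE_RETENTION_MS
    ]


log = [
    Record("topicA-0:0", "COPY_SEGMENT_STARTED", 0),
    Record("topicA-0:0", "COPY_SEGMENT_FINISHED", 1 * HOUR_MS),
    Record("topicA-0:100", "COPY_SEGMENT_FINISHED", 2 * HOUR_MS),
    # Retention expires the first segment after 3 days.
    Record("topicA-0:0", "DELETE_SEGMENT_FINISHED", 72 * HOUR_MS),
    Record("topicA-0:0", None, 72 * HOUR_MS),  # tombstone for the deleted segment
]

after_first = compact(log, now_ms=73 * HOUR_MS)           # tombstone still kept
after_second = compact(after_first, now_ms=97 * HOUR_MS)  # tombstone expired

print(len(log), len(after_first), len(after_second))  # 5 2 1
```

After the second pass only the live segment's record remains, so the number of
records tracks the number of live remote segments rather than the full event
history.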

As for the original unbounded __remote_log_metadata topic, I am not sure we
still need it. If it is kept only as an audit trail, people can set up a data
ingestion pipeline that copies the content of the metadata topic into a
separate storage location. I think we could have a flag to run with only one
metadata topic (the compacted one).


On Monday, January 5, 2026 at 01:22:42 PM PST, Lijun Tong 
<[email protected]> wrote: 

Hello Kafka Community,

I would like to start a discussion on KIP-1266, which proposes adding a new
compacted remote log metadata topic for tiered storage, to limit the number
of messages that must be iterated to build the remote metadata state.

KIP link: KIP-1266 Bounding The Number Of RemoteLogMetadata Messages via
Compacted RemoteLogMetadata Topic
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1266%3A+Bounding+The+Number+Of+RemoteLogMetadata+Messages+via+Compacted+Topic>

Background:
The current Tiered Storage implementation uses a __remote_log_metadata
topic with infinite retention and delete-based cleanup policy, causing
unbounded growth, slow broker bootstrap, no mechanism to clean up expired
segment metadata, and inefficient re-reading from offset 0 during
leadership changes.

Proposal:
A dual-topic approach that introduces a new __remote_log_metadata_compacted
topic using log compaction with deterministic offset-based keys, while
preserving the existing topic for audit history. This allows brokers to
build their metadata cache exclusively from the compacted topic and enables
cleanup of expired segment metadata through tombstones. It also includes a
migration strategy to populate the new topic during upgrade. The result is
bounded metadata growth and faster broker startup while maintaining
backward compatibility.

More details are in the attached KIP link.
Looking forward to your thoughts.

Thank you for your time!

Best,
Lijun Tong
