Luke,

Thanks for taking the time to read the design and for the insightful critiques.
Here are the replies to your questions. If we align on the direction, we can
update the KIP with the new suggestions.

LC1.  We are not planning to upload the bloom filter to remote tiered storage. 
It is designed to be built lazily and locally for the entries in the dirty 
section B1 (the dirty section that exists only in remote storage).  B1 is 
expected to be smaller than the cleaned section A: on a long-running broker, 
section A grows with each compaction cycle while B1 stays relatively constant.  
We decided not to upload the bloom filter index because it is just one of 
several possible optimization data structures.  In the future, people might 
come up with better probabilistic indexing data structures, and we don't want 
to persist something that we would then have to worry about migrating.
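
To make the "lazily built locally" idea concrete, here is a minimal sketch. 
It is not code from the KIP or from Kafka: the class name LazyDirtyKeyFilter, 
the key supplier standing in for "read the keys of section B1", and the 
hashing scheme are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: an in-memory bloom filter over the keys of dirty
// section B1 that is built on first lookup and never persisted or uploaded.
final class LazyDirtyKeyFilter {
    private final int numBits;
    private final int numHashes;
    // Assumption: stands in for whatever reads the B1 keys from remote segments.
    private final Supplier<List<byte[]>> dirtyKeys;
    private BitSet bits; // null until the first lookup triggers the build

    LazyDirtyKeyFilter(int numBits, int numHashes, Supplier<List<byte[]>> dirtyKeys) {
        this.numBits = numBits;
        this.numHashes = numHashes;
        this.dirtyKeys = dirtyKeys;
    }

    // false means "definitely not in B1"; true means "possibly in B1".
    synchronized boolean mightContain(byte[] key) {
        if (bits == null) {
            build();
        }
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    synchronized boolean isBuilt() {
        return bits != null;
    }

    private void build() {
        BitSet built = new BitSet(numBits);
        for (byte[] key : dirtyKeys.get()) {
            for (int i = 0; i < numHashes; i++) {
                built.set(index(key, i));
            }
        }
        bits = built;
    }

    // Double hashing: the i-th probe position is h1 + i * h2 (mod numBits).
    private int index(byte[] key, int i) {
        int h1 = Arrays.hashCode(key);
        int h2 = fnv1a(key);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // 32-bit FNV-1a, used only as a second, independent-ish hash.
    private static int fnv1a(byte[] data) {
        int h = 0x811c9dc5;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }
}
```

Because the filter lives only in broker memory, swapping it for a different 
probabilistic structure later would need no migration of remote data.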

LC2.  For bandwidth control, we plan to use/extend the built-in throttler in 
the LogCleaner.  Compaction of remote log segments is going to happen in 
chunks, which provides a good place to throttle.
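
As a rough illustration of per-chunk throttling, the sketch below computes 
how long the cleaner would need to pause after each chunk to stay under a 
byte-rate cap.  It does not use the actual LogCleaner throttler; ChunkThrottle 
and its method are hypothetical names, and the caller is assumed to pass in 
the clock reading and do the sleeping.

```java
// Hypothetical sketch of rate limiting at chunk boundaries.
final class ChunkThrottle {
    private final double maxBytesPerSec;
    private final long startNanos;
    private long bytesSoFar = 0L;

    ChunkThrottle(double maxBytesPerSec, long startNanos) {
        this.maxBytesPerSec = maxBytesPerSec;
        this.startNanos = startNanos;
    }

    // Records one processed chunk and returns the number of milliseconds the
    // caller should sleep so that the average rate stays at or below the cap.
    long recordChunk(long chunkBytes, long nowNanos) {
        bytesSoFar += chunkBytes;
        double elapsedSec = (nowNanos - startNanos) / 1e9;
        // Minimum time the bytes seen so far should have taken at the cap.
        double minSec = bytesSoFar / maxBytesPerSec;
        double deficitSec = minSec - elapsedSec;
        return deficitSec > 0 ? (long) Math.ceil(deficitSec * 1000.0) : 0L;
    }
}
```

A cleaner loop would call recordChunk(...) after downloading or uploading 
each chunk and sleep for the returned duration before starting the next one.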

LC3.  Yes, we can have a separate thread pool for log cleaners for tiered 
storage topics, since the log cleaner for tiered storage runs longer.

LC4.  We are not going to add new methods to RemoteStorageManager for 
uploading the cleaner-offset-checkpoint.  Instead, the 
cleaner-offset-checkpoints are uploaded as part of the 
RemoteLogSegmentMetadata (similar to how it includes the segmentLeaderEpochs 
map), so we are enhancing RemoteLogSegmentMetadata to include another map for 
the cleaner-offset-checkpoint for uploading/downloading.  To access the 
cleaner-offset-checkpoint map, we are adding a method to 
RemoteLogMetadataManager: cleanerOffsetCheckpointForEpoch(int epoch), similar 
to the existing highestOffsetForEpoch() method.
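
A sketch of the shape this could take is below.  The two classes are 
simplified stand-ins, not the real RemoteLogSegmentMetadata or 
RemoteLogMetadataManager.  The Optional<Long> return type, the 
epoch-to-offset map layout, and the choice to return the highest checkpoint 
across segments are all assumptions that the KIP would need to pin down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Stand-in for segment metadata carrying a second per-epoch map.
final class SegmentMetadataSketch {
    private final Map<Integer, Long> segmentLeaderEpochs;
    // Assumption: leader epoch -> cleaner offset checkpoint for this segment.
    private final Map<Integer, Long> cleanerOffsetCheckpoints;

    SegmentMetadataSketch(Map<Integer, Long> segmentLeaderEpochs,
                          Map<Integer, Long> cleanerOffsetCheckpoints) {
        this.segmentLeaderEpochs = Map.copyOf(segmentLeaderEpochs);
        this.cleanerOffsetCheckpoints = Map.copyOf(cleanerOffsetCheckpoints);
    }

    Map<Integer, Long> segmentLeaderEpochs() {
        return segmentLeaderEpochs;
    }

    Map<Integer, Long> cleanerOffsetCheckpoints() {
        return cleanerOffsetCheckpoints;
    }
}

// Stand-in for the metadata manager side of the lookup.
final class MetadataManagerSketch {
    private final List<SegmentMetadataSketch> segments = new ArrayList<>();

    void addSegment(SegmentMetadataSketch segment) {
        segments.add(segment);
    }

    // Assumed semantics: the highest checkpoint recorded for the epoch across
    // all known segments, or empty if no segment has one.
    Optional<Long> cleanerOffsetCheckpointForEpoch(int epoch) {
        return segments.stream()
                .map(s -> s.cleanerOffsetCheckpoints().get(epoch))
                .filter(Objects::nonNull)
                .max(Long::compare);
    }
}
```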

LC5.  Yes, we can add metrics to monitor log compaction for tiered storage 
topics.  The metrics to add would be compaction lag (dirty ratio for tiered 
segments), compaction throughput (bytes/sec), time per compaction cycle, and 
number of segments cleaned/uploaded.
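
For reference, two of these values could be derived roughly as follows.  This 
is only a sketch: the class and method names are made up, and defining the 
dirty ratio as dirty bytes over total (clean plus dirty) bytes is an 
assumption about how it would be computed for tiered segments.

```java
// Hypothetical helpers for the proposed tiered-compaction metrics.
final class TieredCompactionStats {
    private TieredCompactionStats() { }

    // Dirty ratio: share of bytes not yet cleaned, in the range [0, 1].
    static double dirtyRatio(long cleanBytes, long dirtyBytes) {
        long total = cleanBytes + dirtyBytes;
        return total == 0 ? 0.0 : (double) dirtyBytes / total;
    }

    // Compaction throughput in bytes/sec over one compaction cycle.
    static double throughputBytesPerSec(long bytesProcessed, long cycleMillis) {
        return cycleMillis <= 0 ? 0.0 : bytesProcessed / (cycleMillis / 1000.0);
    }
}
```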
