Hi Henry,

Thanks for the KIP!

LC1: Is the "Bloom Filter Index" uploaded to remote storage? It looks like
it is not. Could you explain why?

LC2: Is there any bandwidth-control mechanism for remote log compaction?
Ideally, it should not impact higher-priority work such as produce/consume.

LC3: Will tiered topics have their own compaction threads? Could compaction
of tiered topics take long enough to delay compaction of normal topics?

LC4: Will there be a new method in RemoteStorageManager to upload/fetch the
cleaner-offset-checkpoint?

LC5: I think we should add some metrics to monitor tiered storage
compaction progress.

Thank you,
Luke

On Thu, Jan 15, 2026 at 4:44 AM Henry Haiying Cai via dev
<[email protected]> wrote:

> Hi Ivan,
>
> Thanks for your interest in and comments on the KIP. As you pointed out,
> there are potential savings from reusing the cache in the
> RemoteStorageManager plugin and avoiding extra copies of data in files.
>
> For IY1: The remote log segment files downloaded for cleaning are of two
> types: segment files that were already cleaned before (segment A in the
> diagram) and segment files that are dirty and will be cleaned in this
> cycle (segment B1 in the diagram). Data in segment B1 needs to be read
> twice (the first time to build the index lookup table, the second time to
> check each record against that table). Data in segment A only needs to be
> read once, since it is not used to build the index table. So we do not
> need to store the data from segment A in a temp file; reading it in
> streaming fashion from RemoteStorageManager is fine. For segment B1, there
> are pros and cons to storing it first as a temp file, as you pointed out.
> I think over time the cleaned section (segments like A) will grow larger,
> so the time spent reading segment B1 will become comparatively smaller and
> the payoff from the optimization might not be significant overall.
> And also, as you pointed out, the output segment file would have to be
> stored as a file first for the upload to work.
>
> For IY2: If we want to optimize reading segment B1 by reading from
> RemoteStorageManager twice without generating a temporary file, we would
> need some kind of guarantee that RemoteStorageManager can reuse its
> internal file/chunk cache for the second read. Otherwise the savings in
> local file resources (and reduced page cache use) are not worth the cost
> of reading remotely twice. So I like your idea of adding more methods to
> the RemoteStorageManager interface, but I would want to add 2 methods to
> the interface:
>
> public interface RemoteStorageManager {
>     boolean supportsInputLogSegmentCaching();
>     InputStream fetchLogSegment(RemoteLogSegmentMetadata remoteLogSegmentMetadata,
>                                 int startPosition,
>                                 boolean keepForReuse);
> }
>
> Through the first method, supportsInputLogSegmentCaching, the
> RemoteStorageManager plugin tells Kafka that it supports caching/reuse of
> log segments for download. This way Kafka can decide to read from
> RemoteStorageManager twice for dirty log segment B1. Kafka will call
> RemoteStorageManager's new fetchLogSegment method with keepForReuse set to
> true when it needs to build the index lookup table for segment B1, and the
> RemoteStorageManager code will keep the data longer in the cache. The
> cache can be a time/size-bound LRU cache to delay entry eviction.
>
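
To make the "time/size-bound LRU cache" idea concrete, here is a minimal sketch of what a plugin-side cache might look like. Everything in it is an assumption for illustration: the class name ReusableSegmentCache, the string segment keys, the two TTL parameters, and the policy of giving keepForReuse entries a longer lifetime are all hypothetical and are not part of the KIP or of any existing RemoteStorageManager implementation. Time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical plugin-side cache; a sketch only, not Kafka or KIP code. */
final class ReusableSegmentCache {
    private static final class Entry {
        final byte[] data;
        final long expiresAtMs;
        Entry(byte[] data, long expiresAtMs) {
            this.data = data;
            this.expiresAtMs = expiresAtMs;
        }
    }

    private final long maxBytes;      // size bound for all cached data
    private final long defaultTtlMs;  // lifetime for ordinary fetches
    private final long reuseTtlMs;    // longer lifetime when keepForReuse is true
    private long currentBytes = 0;

    // accessOrder=true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, Entry> entries =
            new LinkedHashMap<>(16, 0.75f, true);

    ReusableSegmentCache(long maxBytes, long defaultTtlMs, long reuseTtlMs) {
        this.maxBytes = maxBytes;
        this.defaultTtlMs = defaultTtlMs;
        this.reuseTtlMs = reuseTtlMs;
    }

    synchronized void put(String segmentId, byte[] data, boolean keepForReuse, long nowMs) {
        remove(segmentId);
        long ttl = keepForReuse ? reuseTtlMs : defaultTtlMs;
        entries.put(segmentId, new Entry(data, nowMs + ttl));
        currentBytes += data.length;
        evict(nowMs);
    }

    /** Returns the cached bytes, or null if absent or expired. */
    synchronized byte[] get(String segmentId, long nowMs) {
        Entry e = entries.get(segmentId); // also marks the entry as most recently used
        if (e == null) {
            return null;
        }
        if (e.expiresAtMs <= nowMs) {
            remove(segmentId);
            return null;
        }
        return e.data;
    }

    private void remove(String segmentId) {
        Entry old = entries.remove(segmentId);
        if (old != null) {
            currentBytes -= old.data.length;
        }
    }

    // Drops expired entries, and drops least recently used entries while over the size bound.
    // An entry larger than maxBytes is therefore never retained.
    private void evict(long nowMs) {
        Iterator<Map.Entry<String, Entry>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            Entry e = it.next().getValue();
            if (e.expiresAtMs <= nowMs || currentBytes > maxBytes) {
                currentBytes -= e.data.length;
                it.remove();
            }
        }
    }
}
```

Under these assumptions, a fetchLogSegment call with keepForReuse set to true would put the downloaded data with the longer TTL during the first pass over segment B1, and the second pass would be served from get() unless the entry had been pushed out by the size bound in the meantime. That last case is why Kafka would still need a fallback to a second remote read.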
