Hi Henry and Tom,
Thanks for the KIP.

AS1: I'm not really very keen on having consumers
fetch directly from remote tiered storage, but that's just my personal
opinion so don't let it dissuade you. If you just have one consumer
interested in ancient data, it sounds like the benefits outlined in the
KIP are compelling, but I worry that, in some situations, you might
end up with a lot of consumers independently fetching small chunks
of segments from tiered storage, kind of spamming the remote
storage with fetch requests. I think that downloading the data to
a central place which can then support the Kafka fetching protocol
directly would be preferable. Maybe KIP-1255 is the way to go,
and I know there has already been discussion on this point.

AS2: Thanks for adding some information about share group support.
I can confirm that share groups do support tiered storage today, with
the fetching handled on the broker side.

I think it would be better to say that direct remote fetch is explicitly
not supported for share groups. In a share group, the share-partition
leader (SPL) is responsible for distributing the records to the consumers.
If the share group is reading from tiered storage, the entire population
of consumers assigned to a partition would end up fetching those records
independently. Also, the SPL is responsible
for quite complex state management, including timeouts. Pushing
that logic into a nominated consumer seems like the wrong decision to
me. My view is that it is better to maintain the existing design for
share groups and not retrofit remote tiered storage fetch into the
share consumer.

AS3: The KIP introduces a dependency: consumers must have a
remote storage fetcher plugin which is compatible with the tiered
storage provider for that cluster. Only the right piece of code
will be able to make sense of the RemoteLogSegmentCustomMetadata
in the FetchResponse. Given that we are gradually removing
complexity from the clients and pushing it into brokers, this does seem
like a move in the opposite direction. I wonder whether the community
as a whole feels that this is a good move. Of course, that's what the
voting process is for :)
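
To make the dependency concrete, every consumer wanting direct fetch
would have to carry something along these lines (a sketch only; I am
mirroring the broker-side RemoteStorageManager here, and the KIP's
actual interface may well differ):

    import java.io.Closeable;
    import java.io.InputStream;
    import org.apache.kafka.common.Configurable;
    import org.apache.kafka.server.log.remote.storage.RemoteLogSegmentMetadata;
    import org.apache.kafka.server.log.remote.storage.RemoteStorageException;

    // Illustrative consumer-side fetcher, modelled on RemoteStorageManager.
    // Only a plugin written for the cluster's tiered storage provider can
    // interpret the segment metadata the broker hands back.
    public interface RemoteStorageFetcher extends Configurable, Closeable {
        InputStream fetchLogSegment(RemoteLogSegmentMetadata metadata,
                                    int startPosition,
                                    int endPosition) throws RemoteStorageException;
    }

Every non-Java client would need an equivalent, which is part of what
gives me pause.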

AS4: The details of the aborted transaction information in the FetchResponse
are quite nasty, I feel. This is one of the reasons why transactional record
filtering for share groups is done on the broker. It will be very important to
ensure that the records contained in the data fetched from the remote
tiered storage are exactly those which are expected so that the correct
aborted records are filtered out. I'm sure it can be made to work as you
describe, but just be aware that this area is a bit delicate to get right.
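
Just to illustrate why I call this delicate: with the aborted transaction
list delivered alongside the fetched bytes, the consumer has to do roughly
the same bookkeeping the classic read_committed consumer already does. A
very rough sketch (not code from the KIP; deliver() is just a placeholder
for handing records to the application):

    // RecordBatch is org.apache.kafka.common.record.RecordBatch;
    // AbortedTransaction comes from FetchResponseData in the response.
    // abortedTxns is ordered by firstOffset.
    void deliverCommitted(Iterable<RecordBatch> batches,
                          PriorityQueue<FetchResponseData.AbortedTransaction> abortedTxns) {
        Set<Long> abortedProducers = new HashSet<>();
        for (RecordBatch batch : batches) {
            // Producers whose aborted transaction starts at or before this
            // batch are dropped until their control marker is reached.
            while (!abortedTxns.isEmpty()
                    && abortedTxns.peek().firstOffset() <= batch.baseOffset()) {
                abortedProducers.add(abortedTxns.poll().producerId());
            }
            if (batch.isControlBatch()) {
                // The ABORT (or COMMIT) marker closes the open transaction.
                abortedProducers.remove(batch.producerId());
                continue;   // control records are never returned to the app
            }
            if (batch.isTransactional()
                    && abortedProducers.contains(batch.producerId())) {
                continue;   // skip records from an aborted transaction
            }
            deliver(batch);
        }
    }

If the bytes pulled from remote storage don't line up exactly with the
offsets the broker used when it built that aborted-transaction list, this
filtering silently goes wrong.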

Thanks,
Andrew

On 2026/01/07 00:22:11 Thomas Thornton via dev wrote:
> Hi Kamal,
> 
> Thanks for the feedback and for sharing the draft of KIP-1255 on the
> Remote Read Replica (RRR) approach.
> 
> While we understand the motivation to keep clients thin, we believe
> KIP-1255 presents a different set of trade-offs that shifts complexity
> from the client to the infrastructure layer, potentially introducing
> higher latency and operational toil. Here are some of our thoughts
> comparing the two proposals:
> 
> 1. Although KIP-1255 proposes a lighter implementation, the reading
> latency is inevitably going to be higher since the data must first
> land on the RRR broker before reaching the consumer (S3 -> RRR ->
> Client). The RRR must fetch segments from remote storage and then
> stream them. This "double hop" increases time-to-first-byte and
> doubles the ingress/egress bandwidth load within the cloud environment
> compared to KIP-1254's direct fetch.
> 
> 2. We are concerned about the claim that KIP-1255 eliminates client
> complexity. The server-side routing option specifically details a
> redirect response that the main broker sends to the client, requiring
> the client to issue a subsequent FetchRequest. This implies that
> clients still need to implement logic to: (a) handle this specific
> redirect protocol, (b) manage the transition from a normal broker (for
> new data) to an RRR (for older data), and (c) handle connection pools
> for RRR nodes. If client logic is required anyway, we believe the
> direct-fetch approach (KIP-1254) offers better performance.
> 
> 3. Regarding your concern about compatibility with existing/proposed
> APIs, KIP-1254 explicitly defines support for read_committed consumers
> by having the broker provide aborted transaction lists for local
> filtering, maintaining semantics without client-side index parsing. We
> also outline a feasible path for Share Groups (Queues) via
> broker-mediated acquisition. Conversely, do the Read Replicas in
> KIP-1255 fully support these newer features today (transactions,
> queues), or are they limited to standard log consumption?
> 
> 4. While the architecture diagram shows RRRs connecting to the
> metadata store, there are open questions on the actual assignment
> logic: (a) How are topic partitions distributed across the read
> replicas? (b) The diagram implies AZ affinity, but what is the
> specific mechanism to ensure a consumer is routed to the correct
> AZ-local RRR? (c) When the number of partitions changes, or if an RRR
> node fails, how is the assignment updated dynamically?
> 
> 5. The proposal mentions these nodes are "autoscalable," but will the
> operations for starting and shutting down RRRs be fully automatic
> based on consumer request load? KIP-1254 is operationally "serverless"
> (relying on S3/GCS scaling), whereas KIP-1255 requires managing a
> distinct fleet of brokers.
> 
> 6. Regarding the concern about dependencies, KIP-1254 keeps the core
> client lightweight by using a pluggable interface
> (RemoteStorageFetcher). The specific storage dependencies (e.g., S3 or
> GCS SDKs) are not bundled with the default client; they are loaded
> dynamically via configuration (remote.storage.fetcher.class) only when
> the feature is enabled. Thus, the additional weight is strictly opt-in
> for users who require this specific high-throughput capability.
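> 
> For example, a consumer that wants the feature would opt in along these
> lines (property names as defined in the KIP; the fetcher class is just a
> placeholder for whichever plugin implementation is used):
> 
>     Properties props = new Properties();
>     props.put("bootstrap.servers", "broker:9092");
>     props.put("key.deserializer",
>         "org.apache.kafka.common.serialization.ByteArrayDeserializer");
>     props.put("value.deserializer",
>         "org.apache.kafka.common.serialization.ByteArrayDeserializer");
>     // Off by default; only consumers that set this take on the plugin path.
>     props.put("fetch.remote.enabled", "true");
>     // Placeholder class name, loaded dynamically rather than bundled.
>     props.put("remote.storage.fetcher.class", "com.example.S3RemoteStorageFetcher");
>     KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
> 
> Everyone who leaves fetch.remote.enabled at its default of false is
> unaffected.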
> 
> Thanks,
> Tom & Henry
> 
> 
> On Thu, Dec 11, 2025 at 6:14 AM Kamal Chandraprakash
> <[email protected]> wrote:
> >
> > Hi Thomas,
> >
> > Went over the KIP-1254
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1254%3A+Kafka+Consumer+Support+for+Remote+Tiered+Storage+Fetch>
> > that
> > describes the changes required on the consumer side.
> >
> > 1. To enable this feature, the clients have to reimplement the logic if
> > they are not using a Java client.
> > 2. The client becomes heavy and requires all the remote storage
> > dependencies.
> > 3. May not be fully compatible with all the existing / proposed client
> > APIs.
> > 4. Did you explore having a lightweight broker in the cluster that serves
> > only remote traffic, similar to reading from a preferred replica? KIP-1255
> > <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=399281539>
> > proposes
> > <https://docs.google.com/presentation/d/10ZZeJ_8RPc-gPXFxQe1KC7VLSTPg0Rb8I4taZr0RktM/edit?slide=id.g282ec88ea22_1_5#slide=id.g282ec88ea22_1_5>
> > the same; it is in the draft stage. These brokers may not need much disk /
> > memory, can be kept in the same AZ as consumers, solely serve FETCH
> > requests from remote storage, and can be scaled quickly.
> >
> > Thanks,
> > Kamal
> >
> >
> >
> > On Wed, Dec 10, 2025 at 6:04 PM Thomas Thornton via dev <
> > [email protected]> wrote:
> >
> > > Hi Stan,
> > >
> > > Thanks for the detailed feedback! We've now published KIP-1254 [1] which 
> > > is
> > > the consumer-side companion to KIP-1248 and addresses your questions in
> > > detail.
> > >
> > > To highlight a few points:
> > >
> > > On storage format coupling: We've added wording to the Version
> > > Compatibility section [2]. This design intentionally shifts segment 
> > > parsing
> > > from broker to consumer to reduce broker load. While this couples 
> > > consumers
> > > to the on-disk format, SupportedStorageFormatVersions ensures graceful
> > > fallback when formats evolve. Consumers remain decoupled from storage
> > > backends (S3/GCS/Azure) via the RemoteStorageFetcher plugin interface. For
> > > proprietary Kafka implementations with different storage formats, this
> > > mechanism allows them to participate - if the client supports their 
> > > format,
> > > direct fetch works; otherwise it falls back gracefully.
> > >
> > > Optional feature: Yes, it is opt-in via fetch.remote.enabled (default false).
> > > See
> > > Consumer Configs [3].
> > >
> > > Forward-compatibility: Covered in Version Compatibility [2]. The client
> > > sends a list of format versions it supports (e.g., ApacheKafkaV1). If a 
> > > 6.x
> > > broker uses a new format not in the 4.x client's list, the broker falls
> > > back to traditional fetch.
> > >
> > > For the specific consumer-side questions:
> > >
> > > 1. Plugin system: We introduce RemoteStorageFetcher [4], a read-only
> > > interface similar to RemoteStorageManager on the broker side. Plugin
> > > matching is handled implicitly via SupportedStorageFormatVersions - if
> > > format versions don't align, the broker falls back to traditional fetch.
> > >
> > > 2. Cost & performance: The broker provides byte position hints derived 
> > > from
> > > the OffsetIndex, and consumers request only the needed range via
> > > startPosition/endPosition in RemoteStorageFetcher.fetchLogSegment(). This
> > > enables byte-range GETs for S3-compatible systems. For backends that don't
> > > support range requests, the plugin implementation would handle buffering -
> > > this is implementation-specific and outside the KIP scope.
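> > >
> > > As a rough illustration (not from the KIP; the bucket, key mapping, and
> > > client wiring here are placeholders), an S3-backed plugin could turn those
> > > positions into a ranged GET using the AWS SDK v2:
> > >
> > >     // Sketch of an S3-backed fetcher honouring startPosition/endPosition.
> > >     public InputStream fetchLogSegment(RemoteLogSegmentMetadata metadata,
> > >                                        int startPosition, int endPosition) {
> > >         GetObjectRequest request = GetObjectRequest.builder()
> > >             .bucket(bucket)                  // plugin configuration (placeholder)
> > >             .key(objectKeyFor(metadata))     // hypothetical key mapping
> > >             .range("bytes=" + startPosition + "-" + endPosition)
> > >             .build();
> > >         // Returns only the requested byte range as an InputStream.
> > >         return s3Client.getObject(request);
> > >     }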
> > >
> > > 3. Fallbacks: Yes, covered in the Fallback section [5]. The consumer falls
> > > back to broker-mediated fetch on: timeout, connection failure, auth
> > > failure, or if RemoteStorageFetcher is not configured.
> > >
> > > Let us know if you'd like more detail on any of these.
> > >
> > > Thanks,
> > > Tom
> > >
> > >
> > >
> > > [1]
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1254%3A+Kafka+Consumer+Support+for+Remote+Tiered+Storage+Fetch
> > > [2]
> > >
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=399279678#KIP1254:KafkaConsumerSupportforRemoteTieredStorageFetch-VersionCompatibility
> > >
> > > [3]
> > >
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=399279678#KIP1254:KafkaConsumerSupportforRemoteTieredStorageFetch-ConsumerConfigs
> > >
> > > [4]
> > >
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=399279678#KIP1254:KafkaConsumerSupportforRemoteTieredStorageFetch-RemoteStorageFetcher
> > > [5]
> > >
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=399279678#KIP1254:KafkaConsumerSupportforRemoteTieredStorageFetch-Fallback
> > >
> > >
> > > On Tue, Dec 2, 2025 at 12:03 PM Stanislav Kozlovski <
> > > [email protected]> wrote:
> > >
> > > > Hey Henry, thanks for the KIP! I'm excited to see this proposal as I've
> > > > heard it discussed privately before too.
> > > >
> > > > Can we have some wording that talks about the trade-offs of coupling
> > > > clients to the underlying storage format? Today, the underlying segment
> > > > format is decoupled from the clients, since the broker handles 
> > > > conversion
> > > > of log messages to what the protocol expects. I'm sure certain
> > > proprietary
> > > > Kafka implementations use different formats for their underlying storage
> > > -
> > > > it's an interesting question how they would handle this (to be explicit,
> > > > I'm not proposing we should cater our design to those systems though,
> > > > simply calling it out as a potential contention point).
> > > >
> > > > Things I'm thinking about:
> > > > - Would this be an optional feature?
> > > > - What would forward-compatibility look like?
> > > >
> > > > e.g. if we ever want to switch the underlying storage format? To
> > > > bullet-proof ourselves, do we want to introduce some version matching
> > > > which could then help us understand incompatibility and throw errors?
> > > > (e.g. we change the storage format in 6.x, and a 4.x client tries to
> > > > read from a 6.x broker/storage format)
> > > >
> > > > Can we also have some wording on how this feature would look on the
> > > > consumer side? The proposal right now suggests we handle this in a
> > > > follow-up KIP, which makes sense for the details - but what about a
> > > > high-level overview and motivation?
> > > >
> > > > 1. We would likely need a similar plugin system for Consumers like
> > > brokers
> > > > have for KIP-405. Getting that interface right would be important.
> > > Ensuring
> > > > the plugin configured on the consumer matches the plugin configured on
> > > the
> > > > broker would be useful from a UX point of view too.
> > > >
> > > > 2. From a cost and performance perspective, how do we envision this 
> > > > being
> > > > used/configured on the consumer side?
> > > >
> > > > A single segment could be GBs in size. It's unlikely a consumer would
> > > want
> > > > to download the whole thing at once.
> > > >
> > > > For tiered backends that are S3-compatible cloud object storage systems,
> > > > we could likely use byte-range GETs, thus avoiding reading too much data
> > > > that'll get discarded. Are there concerns with other systems? A few 
> > > > words
> > > > on this topic would help imo.
> > > >
> > > > 3. Should we have fall-backs to the current behavior?
> > > >
> > > > Best,
> > > > Stan
> > > >
> > > > On 2025/12/02 11:04:13 Kamal Chandraprakash wrote:
> > > > > Hi Haiying,
> > > > >
> > > > > Thanks for the KIP!
> > > > >
> > > > > 1. Do you plan to add support for transactional consumers? Currently,
> > > the
> > > > > consumer doesn't return the aborted transaction records to the 
> > > > > handler.
> > > > > 2. To access the remote storage directly, the client might need
> > > > additional
> > > > > certificates / keys. How do you plan to expose those configs on the
> > > > client?
> > > > > 3. Will it support the Queues for Kafka feature KIP-932
> > > > > <
> > > >
> > > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!DCbAVzZNrAf4!FnaTZ-RleISfnxHyS-2F1lvhDvglTHhW5Yg-cFch2FgGCd0lw2nUJ3gJtd1AqiwMlghMiwLQ7a6aD9KQlvax_GzYJG2eqtg$
> > > > >?
> > > > > And so on.
> > > > >
> > > > > --
> > > > > Kamal
> > > > >
> > > > > On Tue, Dec 2, 2025 at 10:29 AM Haiying Cai via dev <
> > > > [email protected]>
> > > > > wrote:
> > > > >
> > > > > > For some reason, the KIP link was truncated in the original email.
> > > > Here
> > > > > > is the link again:
> > > > > >
> > > > > > KIP:
> > > > > >
> > > >
> > > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-1248*3A*Allow*consumer*to*fetch*from*remote*tiered*storage__;JSsrKysrKysr!!DCbAVzZNrAf4!FnaTZ-RleISfnxHyS-2F1lvhDvglTHhW5Yg-cFch2FgGCd0lw2nUJ3gJtd1AqiwMlghMiwLQ7a6aD9KQlvax_GzYdpe2QXU$
> > > > > >
> > > > > > Henry Haiying Cai
> > > > > >
> > > > > > On 2025/12/02 04:34:39 Henry Haiying Cai via dev wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I would like to start discussion on KIP-1248: Allow consumer to
> > > > > > > fetch from remote tiered storage
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > KIP link: KIP-1248: Allow consumer to fetch from remote tiered
> > > > > > > storage - Apache Kafka - Apache Software Foundation
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > The KIP proposes to allow consumer clients to fetch from remote
> > > > > > > tiered storage directly, to avoid saturating the broker's network
> > > > > > > capacity and degrading its cache performance. This is very useful
> > > > > > > for serving large backfill requests from a new or fallen-behind
> > > > > > > consumer.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Any feedback is appreciated.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Henry
> > > > >
> > > >
> > >
> 
