Hi Jian,

I went over the PR: https://github.com/apache/kafka/pull/20913 and provided
first pass comments.
Shall we avoid the derived config values?

1. deriving the value of `remote.copy.lag.ms` from `local.retention.ms` and
2. deriving the value of `remote.copy.lag.bytes` from
`local.retention.bytes`.

The remote copy lag config can be set at both the broker and topic level,
which should be sufficient.
Derived config values are not clear to the user and add
operational complexity.

Thanks,
Kamal

On Wed, Mar 18, 2026 at 4:57 PM jian fu <[email protected]> wrote:

> Hi Chia-Ping,
>
> Thanks for your review and comments.
> Q1: Is this broker-level configuration dynamic?
> Yes, these are dynamic broker-level configurations, like the others. I have
> already updated the KIP content. Thanks for the reminder.
> Q2: Should we add a metric to track the 'delayed size'?
> Currently, we do have a way to measure how much delay there is, although
> it's not very convenient. I think in most cases there isn't a need to
> continuously monitor this delay. In addition, we can leverage the API
> introduced in KIP-1187 (Support to retrieve remote log size via
> DescribeLogDirs RPC) to query it directly in the future. So, given that this
> requirement is not particularly urgent, I think we can hold off on adding a
> metric for now.
>
> Thanks for your comments!
>
> Regards
> Jian
>
>
> On Wed, Mar 18, 2026 at 18:43, Chia-Ping Tsai <[email protected]> wrote:
>
> > Hi Jian,
> >
> > Sorry for the late review. I have some questions below.
> >
> > Is this broker-level configuration dynamic? We should clarify this in the
> > KIP. Also, should we add a metric to track the 'delayed size'?
> >
> > Best,
> > Chia-Ping
> >
> > On 2025/11/19 13:29:11 jian fu wrote:
> > > Hi everyone, I'd like to start a discussion on KIP-1241. The goal is to
> > > reduce remote storage usage.
> > >
> > > KIP:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1241%3A+Reduce+tiered+storage+redundancy+with+delayed+upload
> > >
> > > The Draft PR: https://github.com/apache/kafka/pull/20913
> > >
> > > Problem:
> > > Currently, Kafka's tiered storage implementation uploads all non-active
> > > local log segments to remote storage immediately, even when they are still
> > > within the local retention period.
> > > This results in redundant storage of the same data in both local and remote
> > > tiers.
> > >
> > > When there is no requirement for real-time analytics or immediate
> > > consumption based on remote storage, this has the following drawbacks:
> > >
> > > 1. Wastes storage capacity and increases costs: The same data is stored
> > > twice during the local retention window
> > > 2. Provides no immediate benefit: During the local retention period, reads
> > > prioritize local data, making the remote copy unnecessary
> > >
> > >
> > > So this KIP aims to reduce tiered storage redundancy with delayed upload.
> > > You can check the test result example here directly:
> > > https://github.com/apache/kafka/pull/20913#issuecomment-3547156286
> > > Looking forward to your feedback! Best regards, Jian
> > >
> >
>
