Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-07-10 Thread Ivan Yurchenko
>> > > > >> plugin
>> > > > >> store its own metadata separately with a solution chosen by the admin
>> > > > >> or plugin provider. For instance, it could be using a dedicated topic
>> > > > >> if chosen to, or relying on an external key-value store.
>> > > > >>
>> > > > >> I agree with you on the existing risks associated with running
>> > > > >> third-party code inside Apache Kafka. That said, combining custom
>> > > > >> metadata with rlmMetadata increases coupling between Kafka and the
>> > > > >> plugin. For instance, the custom metadata may need to be modified
>> > > > >> outside of Kafka, but the rlmMetadata would still be cached on brokers
>> > > > >> independently of any update of custom metadata. Since both types of
>> > > > >> metadata are authored by different systems, and are cached in
>> > > > >> different layers, this may become a problem, or make plugin migration
>> > > > >> more difficult. What do you think?
>> > > > >>
>> > > > >> I have a vague memory of this being discussed back when the tiered
>> > > > >> storage KIP was started. Maybe Satish has more background on this.
>> > > > >>
>> > > > >> Thanks,
>> > > > >> Alexandre
>> > > > >>
>> > > > >> On Mon, 17 Apr 2023 at 16:50, Ivan Yurchenko wrote:
>> > > > >> >
>> > > > >> > Hi Alexandre,
>> > > > >> >
>> > > > >> > Thank you for your feedback!
>> > > > >> >
>> > > > >> > > One question I would have is, what is the benefit of adding these
>> > > > >> > > custom metadata in the rlmMetadata rather than letting the plugin
>> > > > >> > > manage access and persistence to them?
>> > > > >> >
>> > > > >> > Could you please elaborate? Do I understand correctly that the idea is
>> > > > >> > that the plugin will have its own storage for those custom metadata,
>> > > > >> > for example a special topic?
>> > > > >> >
>> > > > >> > > It would be possible for a user
>> > > > >> > > to use custom metadata large enough to adversely impact access to and
>> > > > >> > > caching of the rlmMetadata by Kafka.
>> > > > >> >
>> > > > >> > Since the custom metadata is 100% under control of the RSM plugin, the
>> > > > >> > risk is as big as the risk of running third-party code (i.e. the RSM
>> > > > >> > plugin). The cluster admin must make the decision if they trust it.
>> > > > >> > To mitigate this risk and put it under control, the RSM plugin
>> > > > >> > implementations could document what custom metadata they use and
>> > > > >> > estimate their size.
>> > > > >> >
>> > > > >> > Best,
>> > > > >> > Ivan
>> > > > >> >
>> > > > >> >
>> > > > >> > On Mon, 17 Apr 2023 at 18:14, Alexandre Dupriez <alexandre.dupr...@gmail.com> wrote:
>> > > > >> >
>> > > > >> > > Hi Ivan,
>> > > > >> > >
>> > > > >> > > Thank you for the KIP.
>> > > > >> > >
>> > > > >> > > I think the KIP clearly explains the need for out-of-band metadata
>> > > > >> > > authored and used by an implementation of the remote storage manager.
>> > > > >> > > One question I would have is, what is the benefit of adding these
>> > > > >> > > custom metadata in the rlmMetadata rather than letting the plugin
>> > > > >> > > manage access and persistence to them?
>> > > > >> > >
>> > > > >> > > Maybe one disadvantage and potential risk with the approach proposed
>> > > > >> > > in the KIP is that the rlmMetadata is not of a predefined, relatively
>> > > > >> > > constant size (although corner cases with thousands of leader epochs
>> > > > >> > > in the leader epoch map are possible). It would be possible for a user
>> > > > >> > > to use custom metadata large enough to adversely impact access to and
>> > > > >> > > caching of the rlmMetadata by Kafka.
>> > > > >> > >
>> > > > >> > > Thanks,
>> > > > >> > > Alexandre
>> > > > >> > >
>> > > > >> > > On Thu, 6 Apr 2023 at 16:03, hzh0425 wrote:
>> > > > >> > > >
>> > > > >> > > > I think it's a good idea as we may want to store remote segments in
>> > > > >> > > > different buckets
>> > > > >> > > >
>> > > > >> > > > hzhka...@163.com
>> > > > >> > > >
>> > > > >> > > > Replied message:
>> > > > >> > > > | From | Ivan Yurchenko |
>> > > > >> > > > | Date | 2023-04-06 22:37 |
>> > > > >> > > > | To | dev@kafka.apache.org |
>> > > > >> > > > | Cc | |
>> > > > >> > > > | Subject | [DISCUSS] KIP-917: Additional custom metadata for remote log segment |
>> > > > >> > > > Hello!
>> > > > >> > > >
>> > > > >> > > > I would like to start the discussion thread on KIP-917: Additional custom
>> > > > >> > > > metadata for remote log segment [1]
>> > > > >> > > > This KIP is fairly small and proposes to add a new field to the remote
>> > > > >> > > > segment metadata.
>> > > > >> > > >
>> > > > >> > > > Thank you!
>> > > > >> > > >
>> > > > >> > > > Best,
>> > > > >> > > > Ivan
>> > > > >> > > >
>> > > > >> > > > [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment
>
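
To make the size concern raised in the thread above concrete, here is a minimal sketch (not taken from the KIP) of how an RSM plugin might encode a bucket name as opaque custom metadata and reject payloads above a documented limit. The class name BoundedCustomMetadata, its methods, and the 128-byte limit are all assumptions made for illustration; none of them are Kafka APIs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Kafka: wraps opaque plugin-defined bytes
// and enforces an upper bound on their size.
public class BoundedCustomMetadata {
    // Arbitrary limit chosen for this sketch only.
    public static final int MAX_BYTES = 128;

    private final byte[] value;

    public BoundedCustomMetadata(byte[] value) {
        if (value.length > MAX_BYTES) {
            throw new IllegalArgumentException(
                "custom metadata is " + value.length + " bytes, limit is " + MAX_BYTES);
        }
        // Defensive copy so later changes to the caller's array are not visible here.
        this.value = value.clone();
    }

    // Encodes the bucket that holds a segment as UTF-8 bytes.
    public static BoundedCustomMetadata forBucket(String bucket) {
        return new BoundedCustomMetadata(bucket.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes the bucket name back from the stored bytes.
    public String bucket() {
        return new String(value, StandardCharsets.UTF_8);
    }

    public int sizeInBytes() {
        return value.length;
    }

    public static void main(String[] args) {
        BoundedCustomMetadata meta = BoundedCustomMetadata.forBucket("tiered-bucket-eu-1");
        System.out.println(meta.bucket() + " (" + meta.sizeInBytes() + " bytes)");
    }
}
```

A plugin that documents such a bound gives the cluster admin a way to estimate the worst-case growth of each metadata record, which is the mitigation Ivan suggests.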


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-06-13 Thread Ivan Yurchenko


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-06-12 Thread Satish Duggana


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-06-12 Thread Ivan Yurchenko


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-06-12 Thread Ivan Yurchenko


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-06-12 Thread Luke Chen


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-06-07 Thread Ivan Yurchenko
sing a dedicated topic
> > > > if chosen to, or relying on an external key-value store.
> > >
> > > I see. Yes, this option always exists and doesn't even require a KIP. The
> > > biggest drawback I see is that a plugin will need to reimplement the
> > > consumer/producer + caching mechanics that will exist on the broker side
> > > for the standard remote metadata. I'd like to avoid this and this KIP is
> > > the best solution I see.
> > >
> > > Best,
> > > Ivan
> > >
> > >
> > >
> > > On Tue, 18 Apr 2023 at 13:02, Alexandre Dupriez <
> > > alexandre.dupr...@gmail.com> wrote:
> > >
> > >> Hi Ivan,
> > >>
> > >> Thanks for the follow-up.
> > >>
> > >> Yes, you are right that the suggested alternative is to let the plugin
> > >> store its own metadata separately with a solution chosen by the admin
> > >> or plugin provider. For instance, it could be using a dedicated topic
> > >> if chosen to, or relying on an external key-value store.
> > >>
> > >> I agree with you on the existing risks associated with running
> > >> third-party code inside Apache Kafka. That said, combining custom
> > >> metadata with rlmMetadata increases coupling between Kafka and the
> > >> plugin. For instance, the custom metadata may need to be modified
> > >> outside of Kafka, but the rlmMetadata would still be cached on brokers
> > >> independently of any update of custom metadata. Since both types of
> > >> metadata are authored by different systems, and are cached in
> > >> different layers, this may become a problem, or make plugin migration
> > >> more difficult. What do you think?
> > >>
> > >> I have a vague memory of this being discussed back when the tiered
> > >> storage KIP was started. Maybe Satish has more background on this.
> > >>
> > >> Thanks,
> > >> Alexandre
> > >>
> > >> On Mon, 17 Apr 2023 at 16:50, Ivan Yurchenko wrote:
> > >> >
> > >> > Hi Alexandre,
> > >> >
> > >> > Thank you for your feedback!
> > >> >
> > >> > > One question I would have is, what is the benefit of adding these
> > >> > > custom metadata in the rlmMetadata rather than letting the plugin
> > >> > > manage access and persistence to them?
> > >> >
> > >> > Could you please elaborate? Do I understand correctly that the idea is
> > >> > that the plugin will have its own storage for those custom metadata,
> > >> > for example a special topic?
> > >> >
> > >> > > It would be possible for a user
> > >> > > to use custom metadata large enough to adversely impact access to and
> > >> > > caching of the rlmMetadata by Kafka.
> > >> >
> > >> > Since the custom metadata is 100% under control of the RSM plugin, the
> > >> > risk is as big as the risk of running third-party code (i.e. the RSM
> > >> > plugin). The cluster admin must make the decision if they trust it.
> > >> > To mitigate this risk and put it under control, the RSM plugin
> > >> > implementations could document what custom metadata they use and
> > >> > estimate their size.
> > >> >
> > >> > Best,
> > >> > Ivan
> > >> >
> > >> >
> > >> > On Mon, 17 Apr 2023 at 18:14, Alexandre Dupriez <alexandre.dupr...@gmail.com> wrote:
> > >> >
> > >> > > Hi Ivan,
> > >> > >
> > >> > > Thank you for the KIP.
> > >> > >
> > >> > > I think the KIP clearly explains the need for out-of-band metadata
> > >> > > authored and used by an implementation of the remote storage manager.
> > >> > > One question I would have is, what is the benefit of adding these
> > >> > > custom metadata in the rlmMetadata rather than letting the plugin
> > >> > > manage access and persistence to them?
> > >> > >
> > >> > > Maybe one disadvantage and potential risk with the approach proposed
> > >> > > in the KIP is that the rlmMetadata is not of a predefined, relatively
> > >> > > constant size (although corner cases with thousands of leader epochs
> > >> > > in the leader epoch map are possible). It would be possible for a user
> > >> > > to use custom metadata large enough to adversely impact access to and
> > >> > > caching of the rlmMetadata by Kafka.
> > >> > >
> > >> > > Thanks,
> > >> > > Alexandre
> > >> > >
> > >> > > On Thu, 6 Apr 2023 at 16:03, hzh0425 wrote:
> > >> > > >
> > >> > > > I think it's a good idea as we may want to store remote segments in
> > >> > > > different buckets
> > >> > > >
> > >> > > > hzhka...@163.com
> > >> > > >
> > >> > > > ---- Original message ----
> > >> > > > | From | Ivan Yurchenko |
> > >> > > > | Date | 2023-04-06 22:37 |
> > >> > > > | To | dev@kafka.apache.org |
> > >> > > > | Cc | |
> > >> > > > | Subject | [DISCUSS] KIP-917: Additional custom metadata for remote log segment |
> > >> > > > Hello!
> > >> > > >
> > >> > > > I would like to start the discussion thread on KIP-917: Additional custom
> > >> > > > metadata for remote log segment [1]
> > >> > > > This KIP is fairly small and proposes to add a new field to the remote
> > >> > > > segment metadata.
> > >> > > >
> > >> > > > Thank you!
> > >> > > >
> > >> > > > Best,
> > >> > > > Ivan
> > >> > > >
> > >> > > > [1]
> > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-06-03 Thread Satish Duggana
ata with rlmMetadata increases coupling between Kafka and the
> >> plugin. For instance, the custom metadata may need to be modified
> >> outside of Kafka, but the rlmMetadata would still be cached on brokers
> >> independently of any update of custom metadata. Since both types of
> >> metadata are authored by different systems, and are cached in
> >> different layers, this may become a problem, or make plugin migration
> >> more difficult. What do you think?
> >>
> >> I have a vague memory of this being discussed back when the tiered
> >> storage KIP was started. Maybe Satish has more background on this.
> >>
> >> Thanks,
> >> Alexandre
> >>
> >> On Mon, 17 Apr 2023 at 16:50, Ivan Yurchenko wrote:
> >> >
> >> > Hi Alexandre,
> >> >
> >> > Thank you for your feedback!
> >> >
> >> > > One question I would have is, what is the benefit of adding these
> >> > > custom metadata in the rlmMetadata rather than letting the plugin
> >> > > manage access and persistence to them?
> >> >
> >> > Could you please elaborate? Do I understand correctly that the idea is
> >> > that the plugin will have its own storage for those custom metadata, for
> >> > example a special topic?
> >> >
> >> > > It would be possible for a user
> >> > > to use custom metadata large enough to adversely impact access to and
> >> > > caching of the rlmMetadata by Kafka.
> >> >
> >> > Since the custom metadata is 100% under control of the RSM plugin, the
> >> > risk is as big as the risk of running a third-party code (i.e. the RSM
> >> > plugin).
> >> > The cluster admin must make the decision if they trust it.
> >> > To mitigate this risk and put it under control, the RSM plugin
> >> > implementations could document what custom metadata they use and
> >> > estimate their size.
> >> >
> >> > Best,
> >> > Ivan
> >> >
> >> >
> >> > On Mon, 17 Apr 2023 at 18:14, Alexandre Dupriez
> >> > <alexandre.dupr...@gmail.com> wrote:
> >> >
> >> > > Hi Ivan,
> >> > >
> >> > > Thank you for the KIP.
> >> > >
> >> > > I think the KIP clearly explains the need for out-of-band metadata
> >> > > authored and used by an implementation of the remote storage manager.
> >> > > One question I would have is, what is the benefit of adding these
> >> > > custom metadata in the rlmMetadata rather than letting the plugin
> >> > > manage access and persistence to them?
> >> > >
> >> > > Maybe one disadvantage and potential risk with the approach proposed
> >> > > in the KIP is that the rlmMetadata is not of a predefined, relatively
> >> > > constant size (although corner cases with thousands of leader epochs
> >> > > in the leader epoch map are possible). It would be possible for a user
> >> > > to use custom metadata large enough to adversely impact access to and
> >> > > caching of the rlmMetadata by Kafka.
> >> > >
> >> > > Thanks,
> >> > > Alexandre
> >> > >
> >> > > On Thu, 6 Apr 2023 at 16:03, hzh0425 wrote:
> >> > > >
> >> > > > I think it's a good idea as we may want to store remote segments in
> >> > > > different buckets
> >> > > >
> >> > > > hzhka...@163.com
> >> > > >
> >> > > > ---- Original message ----
> >> > > > | From | Ivan Yurchenko |
> >> > > > | Date | 2023-04-06 22:37 |
> >> > > > | To | dev@kafka.apache.org |
> >> > > > | Cc | |
> >> > > > | Subject | [DISCUSS] KIP-917: Additional custom metadata for remote log segment |
> >> > > > Hello!
> >> > > >
> >> > > > I would like to start the discussion thread on KIP-917: Additional custom
> >> > > > metadata for remote log segment [1]
> >> > > > This KIP is fairly small and proposes to add a new field to the remote
> >> > > > segment metadata.
> >> > > >
> >> > > > Thank you!
> >> > > >
> >> > > > Best,
> >> > > > Ivan
> >> > > >
> >> > > > [1]
> >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-05-29 Thread Ivan Yurchenko
ta by Kafka.
>> >
>> > Since the custom metadata is 100% under control of the RSM plugin, the
>> > risk is as big as the risk of running a third-party code (i.e. the RSM
>> > plugin).
>> > The cluster admin must make the decision if they trust it.
>> > To mitigate this risk and put it under control, the RSM plugin
>> > implementations could document what custom metadata they use and
>> > estimate their size.
>> >
>> > Best,
>> > Ivan
>> >
>> >
>> > On Mon, 17 Apr 2023 at 18:14, Alexandre Dupriez
>> > <alexandre.dupr...@gmail.com> wrote:
>> >
>> > > Hi Ivan,
>> > >
>> > > Thank you for the KIP.
>> > >
>> > > I think the KIP clearly explains the need for out-of-band metadata
>> > > authored and used by an implementation of the remote storage manager.
>> > > One question I would have is, what is the benefit of adding these
>> > > custom metadata in the rlmMetadata rather than letting the plugin
>> > > manage access and persistence to them?
>> > >
>> > > Maybe one disadvantage and potential risk with the approach proposed
>> > > in the KIP is that the rlmMetadata is not of a predefined, relatively
>> > > constant size (although corner cases with thousands of leader epochs
>> > > in the leader epoch map are possible). It would be possible for a user
>> > > to use custom metadata large enough to adversely impact access to and
>> > > caching of the rlmMetadata by Kafka.
>> > >
>> > > Thanks,
>> > > Alexandre
>> > >
>> > > On Thu, 6 Apr 2023 at 16:03, hzh0425 wrote:
>> > > >
>> > > > I think it's a good idea as we may want to store remote segments in
>> > > > different buckets
>> > > >
>> > > > hzhka...@163.com
>> > > >
>> > > > ---- Original message ----
>> > > > | From | Ivan Yurchenko |
>> > > > | Date | 2023-04-06 22:37 |
>> > > > | To | dev@kafka.apache.org |
>> > > > | Cc | |
>> > > > | Subject | [DISCUSS] KIP-917: Additional custom metadata for remote log segment |
>> > > > Hello!
>> > > >
>> > > > I would like to start the discussion thread on KIP-917: Additional custom
>> > > > metadata for remote log segment [1]
>> > > > This KIP is fairly small and proposes to add a new field to the remote
>> > > > segment metadata.
>> > > >
>> > > > Thank you!
>> > > >
>> > > > Best,
>> > > > Ivan
>> > > >
>> > > > [1]
>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-05-05 Thread Ivan Yurchenko
> > > > Thank you for the KIP.
> > >
> > > I think the KIP clearly explains the need for out-of-band metadata
> > > authored and used by an implementation of the remote storage manager.
> > > One question I would have is, what is the benefit of adding these
> > > custom metadata in the rlmMetadata rather than letting the plugin
> > > manage access and persistence to them?
> > >
> > > Maybe one disadvantage and potential risk with the approach proposed
> > > in the KIP is that the rlmMetadata is not of a predefined, relatively
> > > constant size (although corner cases with thousands of leader epochs
> > > in the leader epoch map are possible). It would be possible for a user
> > > to use custom metadata large enough to adversely impact access to and
> > > caching of the rlmMetadata by Kafka.
> > >
> > > Thanks,
> > > Alexandre
> > >
> > > > On Thu, 6 Apr 2023 at 16:03, hzh0425 wrote:
> > > > >
> > > > > I think it's a good idea as we may want to store remote segments in
> > > > > different buckets
> > > > >
> > > > > hzhka...@163.com
> > > > >
> > > > > ---- Original message ----
> > > > > | From | Ivan Yurchenko |
> > > > > | Date | 2023-04-06 22:37 |
> > > > > | To | dev@kafka.apache.org |
> > > > > | Cc | |
> > > > > | Subject | [DISCUSS] KIP-917: Additional custom metadata for remote log segment |
> > > > > Hello!
> > > > >
> > > > > I would like to start the discussion thread on KIP-917: Additional custom
> > > > > metadata for remote log segment [1]
> > > > > This KIP is fairly small and proposes to add a new field to the remote
> > > > > segment metadata.
> > > > >
> > > > > Thank you!
> > > > >
> > > > > Best,
> > > > > Ivan
> > > > >
> > > > > [1]
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-04-18 Thread Alexandre Dupriez
Hi Ivan,

Thanks for the follow-up.

Yes, you are right: the suggested alternative is to let the plugin
store its own metadata separately, using a solution chosen by the admin
or plugin provider. For instance, it could use a dedicated topic or
rely on an external key-value store.
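
To make the alternative concrete, here is a minimal sketch of a
plugin-managed store. Everything in it is an assumption made for
illustration: the class name, the method names and the use of a string
segment ID are hypothetical, and an in-memory map stands in for the
dedicated topic or external key-value store.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical plugin-side store, not an existing Kafka API. The in-memory
// map stands in for a dedicated topic or an external key-value store.
final class PluginMetadataStore {
    private final Map<String, byte[]> bySegmentId = new ConcurrentHashMap<>();

    // The plugin would call this after it has uploaded a segment.
    void put(String segmentId, byte[] metadata) {
        bySegmentId.put(segmentId, metadata.clone());
    }

    // The plugin would call this before it fetches or deletes a segment.
    Optional<byte[]> get(String segmentId) {
        return Optional.ofNullable(bySegmentId.get(segmentId)).map(b -> b.clone());
    }
}
```

With this shape Kafka never sees the custom metadata, so the plugin alone
is responsible for its durability and for keeping it consistent with the
segments it describes.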

I agree with you on the existing risks associated with running
third-party code inside Apache Kafka. That said, combining custom
metadata with rlmMetadata increases coupling between Kafka and the
plugin. For instance, the custom metadata may need to be modified
outside of Kafka, but the rlmMetadata would still be cached on brokers
independently of any update of custom metadata. Since both types of
metadata are authored by different systems, and are cached in
different layers, this may become a problem, or make plugin migration
more difficult. What do you think?

I have a vague memory of this being discussed back when the tiered
storage KIP was started. Maybe Satish has more background on this.

Thanks,
Alexandre

On Mon, 17 Apr 2023 at 16:50, Ivan Yurchenko wrote:
>
> Hi Alexandre,
>
> Thank you for your feedback!
>
> > One question I would have is, what is the benefit of adding these
> > custom metadata in the rlmMetadata rather than letting the plugin
> > manage access and persistence to them?
>
> Could you please elaborate? Do I understand correctly that the idea is that
> the plugin will have its own storage for those custom metadata, for example
> a special topic?
>
> > It would be possible for a user
> > to use custom metadata large enough to adversely impact access to and
> > caching of the rlmMetadata by Kafka.
>
> Since the custom metadata is 100% under control of the RSM plugin, the risk
> is as big as the risk of running a third-party code (i.e. the RSM plugin).
> The cluster admin must make the decision if they trust it.
> To mitigate this risk and put it under control, the RSM plugin
> implementations could document what custom metadata they use and estimate
> their size.
>
> Best,
> Ivan
>
>
> On Mon, 17 Apr 2023 at 18:14, Alexandre Dupriez wrote:
>
> > Hi Ivan,
> >
> > Thank you for the KIP.
> >
> > I think the KIP clearly explains the need for out-of-band metadata
> > authored and used by an implementation of the remote storage manager.
> > One question I would have is, what is the benefit of adding these
> > custom metadata in the rlmMetadata rather than letting the plugin
> > manage access and persistence to them?
> >
> > Maybe one disadvantage and potential risk with the approach proposed
> > in the KIP is that the rlmMetadata is not of a predefined, relatively
> > constant size (although corner cases with thousands of leader epochs
> > in the leader epoch map are possible). It would be possible for a user
> > to use custom metadata large enough to adversely impact access to and
> > caching of the rlmMetadata by Kafka.
> >
> > Thanks,
> > Alexandre
> >
> > On Thu, 6 Apr 2023 at 16:03, hzh0425 wrote:
> > >
> > > I think it's a good idea as we may want to store remote segments in
> > > different buckets
> > >
> > > hzhka...@163.com
> > >
> > > ---- Original message ----
> > > | From | Ivan Yurchenko |
> > > | Date | 2023-04-06 22:37 |
> > > | To | dev@kafka.apache.org |
> > > | Cc | |
> > > | Subject | [DISCUSS] KIP-917: Additional custom metadata for remote log segment |
> > > Hello!
> > >
> > > I would like to start the discussion thread on KIP-917: Additional custom
> > > metadata for remote log segment [1]
> > > This KIP is fairly small and proposes to add a new field to the remote
> > > segment metadata.
> > >
> > > Thank you!
> > >
> > > Best,
> > > Ivan
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-04-17 Thread Ivan Yurchenko
Hi Alexandre,

Thank you for your feedback!

> One question I would have is, what is the benefit of adding these
> custom metadata in the rlmMetadata rather than letting the plugin
> manage access and persistence to them?

Could you please elaborate? Do I understand correctly that the idea is that
the plugin will have its own storage for those custom metadata, for example
a special topic?

> It would be possible for a user
> to use custom metadata large enough to adversely impact access to and
> caching of the rlmMetadata by Kafka.

Since the custom metadata is 100% under the control of the RSM plugin, the risk
is no greater than the risk of running third-party code (i.e. the RSM plugin).
The cluster admin must decide whether they trust it.
To mitigate this risk and keep it under control, RSM plugin
implementations could document what custom metadata they use and estimate
its size.
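
One way a documented size estimate could be turned into a check is
sketched below. This is only an illustration under stated assumptions:
the class, its `maxBytes` limit and the idea of an admin-configured bound
are hypothetical and are not something this message or the KIP text
defines.

```java
// Hypothetical guard, not an existing Kafka API: reject custom metadata
// whose serialized size exceeds a limit chosen by the cluster admin.
final class CustomMetadataSizeGuard {
    private final int maxBytes; // assumed admin-configured limit

    CustomMetadataSizeGuard(int maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be non-negative");
        }
        this.maxBytes = maxBytes;
    }

    // Returns true when the plugin's metadata fits within the limit.
    boolean accepts(byte[] customMetadata) {
        return customMetadata.length <= maxBytes;
    }
}
```

An admin who trusts a plugin's documented estimate could set the limit
slightly above it, so an unexpectedly large value is caught before it is
stored.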

Best,
Ivan


On Mon, 17 Apr 2023 at 18:14, Alexandre Dupriez wrote:

> Hi Ivan,
>
> Thank you for the KIP.
>
> I think the KIP clearly explains the need for out-of-band metadata
> authored and used by an implementation of the remote storage manager.
> One question I would have is, what is the benefit of adding these
> custom metadata in the rlmMetadata rather than letting the plugin
> manage access and persistence to them?
>
> Maybe one disadvantage and potential risk with the approach proposed
> in the KIP is that the rlmMetadata is not of a predefined, relatively
> constant size (although corner cases with thousands of leader epochs
> in the leader epoch map are possible). It would be possible for a user
> to use custom metadata large enough to adversely impact access to and
> caching of the rlmMetadata by Kafka.
>
> Thanks,
> Alexandre
>
> On Thu, 6 Apr 2023 at 16:03, hzh0425 wrote:
> >
> > I think it's a good idea as we may want to store remote segments in
> > different buckets
> >
> > hzhka...@163.com
> >
> > ---- Original message ----
> > | From | Ivan Yurchenko |
> > | Date | 2023-04-06 22:37 |
> > | To | dev@kafka.apache.org |
> > | Cc | |
> > | Subject | [DISCUSS] KIP-917: Additional custom metadata for remote log segment |
> > Hello!
> >
> > I would like to start the discussion thread on KIP-917: Additional custom
> > metadata for remote log segment [1]
> > This KIP is fairly small and proposes to add a new field to the remote
> > segment metadata.
> >
> > Thank you!
> >
> > Best,
> > Ivan
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-04-17 Thread Alexandre Dupriez
Hi Ivan,

Thank you for the KIP.

I think the KIP clearly explains the need for out-of-band metadata
authored and used by an implementation of the remote storage manager.
One question I would have is, what is the benefit of adding this
custom metadata to the rlmMetadata rather than letting the plugin
manage access to it and its persistence?

Maybe one disadvantage and potential risk with the approach proposed
in the KIP is that the rlmMetadata would no longer be of a predefined,
relatively constant size (although corner cases with thousands of leader
epochs in the leader epoch map are already possible). It would be possible
for a user to use custom metadata large enough to adversely impact access
to and caching of the rlmMetadata by Kafka.

Thanks,
Alexandre

On Thu, 6 Apr 2023 at 16:03, hzh0425 wrote:
>
> I think it's a good idea as we may want to store remote segments in different
> buckets
>
> hzhka...@163.com
>
> ---- Original message ----
> | From | Ivan Yurchenko |
> | Date | 2023-04-06 22:37 |
> | To | dev@kafka.apache.org |
> | Cc | |
> | Subject | [DISCUSS] KIP-917: Additional custom metadata for remote log segment |
> Hello!
>
> I would like to start the discussion thread on KIP-917: Additional custom
> metadata for remote log segment [1]
> This KIP is fairly small and proposes to add a new field to the remote
> segment metadata.
>
> Thank you!
>
> Best,
> Ivan
>
> [1]
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment


Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-04-06 Thread hzh0425
I think it's a good idea as we may want to store remote segments in different 
buckets
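
As a rough illustration of the bucket use case, the sketch below shows
how a plugin might pack the bucket it chose for a segment into an opaque
byte array and read it back later. The class and method names are
hypothetical, and UTF-8 is just one possible encoding; nothing here is
taken from the KIP or from Kafka's API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Kafka or the KIP: the plugin encodes the
// bucket it chose for a segment as opaque bytes and decodes it later.
final class BucketCustomMetadata {
    private BucketCustomMetadata() {}

    // Serialize the bucket name; Kafka would treat the result as opaque bytes.
    static byte[] encode(String bucket) {
        return bucket.getBytes(StandardCharsets.UTF_8);
    }

    // Recover the bucket name when the segment is fetched or deleted.
    static String decode(byte[] customMetadata) {
        return new String(customMetadata, StandardCharsets.UTF_8);
    }
}
```

A bucket name of a few dozen characters costs only that many bytes per
segment, which gives a sense of how small such custom metadata can be.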



hzhka...@163.com

---- Original message ----
| From | Ivan Yurchenko |
| Date | 2023-04-06 22:37 |
| To | dev@kafka.apache.org |
| Cc | |
| Subject | [DISCUSS] KIP-917: Additional custom metadata for remote log segment |
Hello!

I would like to start the discussion thread on KIP-917: Additional custom
metadata for remote log segment [1]
This KIP is fairly small and proposes to add a new field to the remote
segment metadata.

Thank you!

Best,
Ivan

[1]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment


[DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-04-06 Thread Ivan Yurchenko
Hello!

I would like to start the discussion thread on KIP-917: Additional custom
metadata for remote log segment [1]
This KIP is fairly small and proposes to add a new field to the remote
segment metadata.

Thank you!

Best,
Ivan

[1]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment