Christo, IMO, we can converge the implementation for both policies and remove the remote-log segments asynchronously.
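A rough sketch of what that converged, asynchronous flow could look like (illustrative names only, not the actual Kafka internals — `Partition`, `disable_tiered_storage` and `expiration_pass` are hypothetical stand-ins):

```python
# Illustrative sketch of the converged, asynchronous disablement flow
# (hypothetical names; the real Kafka internals differ). The
# request-handler thread only does cheap, deterministic work; the
# remote-log-expiration thread performs the slow remote deletes.

class Partition:
    def __init__(self, remote_segments, log_start_offset, local_log_start_offset):
        self.remote_segments = remote_segments   # list of (base_offset, end_offset)
        self.log_start_offset = log_start_offset
        self.local_log_start_offset = local_log_start_offset
        self.copier_running = True

def disable_tiered_storage(partition, policy):
    """Request-handler thread: flip flags and offsets, never delete here."""
    partition.copier_running = False
    if policy == "delete":
        # Advance the log-start-offset so every remote segment falls
        # below it and becomes eligible for expiration-based cleanup.
        partition.log_start_offset = partition.local_log_start_offset

def expiration_pass(partition):
    """Remote-log-expiration thread: remove segments below log-start-offset."""
    expired = [s for s in partition.remote_segments
               if s[1] < partition.log_start_offset]
    partition.remote_segments = [s for s in partition.remote_segments
                                 if s[1] >= partition.log_start_offset]
    return expired  # these would be handed to the remote storage plugin for deletion
```

With policy "delete" the next expiration pass removes every remote segment; with policy "retain" the offsets are untouched and segments expire only through normal retention breaches.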
The only difference between the "delete" and "retain" policies is that, for "delete", we move the log-start-offset to match the local-log-start-offset, and the expiration threads then clean up those remote-log segments because they breach the log-start-offset. The proposal is that we use the same remote-log-expiration threads to delete the remote-log segments instead of the request-handler thread, as the time taken to delete the remote log segments is non-deterministic. For both policies, we have to stop the copier threads and close the RLM task once all the segments have expired from the remote storage. https://github.com/apache/kafka/pull/13947/files#r1281675818 -- Kamal On Thu, May 9, 2024 at 6:51 PM Christo Lolov <christolo...@gmail.com> wrote: > Heya! > > re: Luke > > 1. I am not certain I follow the question. From DISABLED you can only go to > ENABLED regardless of whether your cluster is backed by Zookeeper or KRaft. > Am I misunderstanding your point? > > 2. Apologies, this was a leftover from previous versions. I have updated > the Zookeeper section. The steps ought to be: controller receives change, > commits necessary data to Zookeeper, enqueues disablement and starts > sending StopReplicas request to brokers; brokers receive StopReplicas and > propagate them all the way to RemoteLogManager#stopPartitions which takes > care of the rest. > > 3. Correct, it should say DISABLED - this should now be corrected. > > 4. I was thinking that if there is a mismatch we will just fail accepting > the request for disablement. This should be the same in both Zookeeper and > KRaft. Or am I misunderstanding your question? > > 5. Yeah. I am now doing a second pass on all diagrams and will update them > by the end of the day! > > 6. I think my current train of thought is that there will be unlimited > retries until all brokers respond in a similar way to how deletion of a > topic works today in ZK. In the meantime the state will continue to be > DISABLING. 
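Point 6's unlimited-retry behaviour can be sketched roughly as follows (illustrative only; `send_stop_replicas` is a hypothetical stand-in for the controller's StopReplicas round-trip, not a real API):

```python
# Rough sketch of the unlimited-retry loop from point 6 above: the
# topic stays DISABLING until every broker has acknowledged
# StopReplicas, then flips to DISABLED (mirroring how ZK-based topic
# deletion retries today). Names here are hypothetical.

def drive_disablement(brokers, send_stop_replicas):
    """`send_stop_replicas(broker)` returns True once that broker acks."""
    pending = set(brokers)
    while pending:                       # unlimited retries
        pending = {b for b in pending if not send_stop_replicas(b)}
    return "DISABLED"
```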
Do you have a better suggestion? > > re: Kamal > > Yep, I will update all diagrams > > I am not certain I follow the reasoning for making retain and delete the > same. Deletion when the policy is retain happens asynchronously due to > expiration. I think that deletion when the policy is delete ought to (at > least for the initial implementation) happen synchronously. Should people > run into timeout problems we can always then have a follow-up KIP where we > make it asynchronous. > > Best, > Christo > > On Tue, 7 May 2024 at 10:04, Kamal Chandraprakash < > kamal.chandraprak...@gmail.com> wrote: > > > Hi Christo, > > > > Thanks for the update! > > > > For both the policies "retain" and "delete", can we maintain the same > > approach to delete the segments async? > > > > > If the disablement policy is set to delete, the Log start offset (LSO) > is > > updated to match the Local Log Start Offset and the remote log is deleted > > by calling the RemoteStorageManager#deleteLogSegmentData(). > > > > In the KIP, it's mentioned that when the disable policy is set to > "delete", > > the remote-log-segments will be > > deleted in-sync. The stopPartition call might get timed out when the > number > > of remote log segments to > > delete is huge. We can further extend the same approach for the topic > > deletion requests. > > > > Also, Could you please update the state diagram about the transitions? It > > is not clear when to transit from > > DISABLING to DISABLED state? > > > > -- > > Kamal > > > > On Mon, May 6, 2024 at 6:55 PM Kamal Chandraprakash < > > kamal.chandraprak...@gmail.com> wrote: > > > > > Ignore the above message. Got the answers after reading the state > > > transition section. > > > > > > > If the disablement policy is delete, tasks scheduled for the > > > topic-partitions in the RemoteDataExpirationThreadPool will also be > > > canceled. > > > > > > We are deleting the segments synchronously. Should we delete them > > > asynchronously? 
The same approach can be extended to topic deletion > > > requests. > > > > > > > 6. In ZK mode, what will the controller do if the "stopReplicas" > > > responses not received from all brokers? Reverting the changes? > > > > > > Since we are deleting the segments synchronously. This case can be > bound > > > to happen when the number of remote log segments to > > > delete is huge. > > > > > > > > > On Mon, May 6, 2024, 18:12 Kamal Chandraprakash < > > > kamal.chandraprak...@gmail.com> wrote: > > > > > >> Hi Christo, > > >> > > >> Thanks for the update! > > >> > > >> 1. In the ZK mode, how will the transition from DISABLING to DISABLED > > >> state happen? > > >> For the "retain" policy, until we delete all the remote-log segments, > > the > > >> state will be > > >> DISABLING and the deletion can happen only when they breach either the > > >> retention > > >> time (or) size. > > >> > > >> How does the controller monitor that all the remote log segments are > > >> deleted for all > > >> the partitions of the topic before transitioning the state to > DISABLED? > > >> > > >> 2. In Kraft, we have only ENABLED -> DISABLED state. How are we > > >> supporting the case > > >> "retain" -> "enable"? > > >> > > >> If the remote storage is degraded, we want to avoid uploading the > > >> segments temporarily > > >> and resume back once the remote storage is healthy. Is the case > > supported? > > >> > > >> > > >> > > >> On Fri, May 3, 2024 at 12:12 PM Luke Chen <show...@gmail.com> wrote: > > >> > > >>> Also, I think using `stopReplicas` request is a good idea because it > > >>> won't cause any problems while migrating to KRaft mode. > > >>> The stopReplicas request is one of the request that KRaft controller > > >>> will send to ZK brokers during migration. > > >>> > > >>> Thanks. > > >>> Luke > > >>> > > >>> On Fri, May 3, 2024 at 11:48 AM Luke Chen <show...@gmail.com> wrote: > > >>> > > >>>> Hi Christo, > > >>>> > > >>>> Thanks for the update. 
> > >>>> > > >>>> Questions: > > >>>> 1. For this > > >>>> "The possible state transition from DISABLED state is to the > ENABLED." > > >>>> I think it only applies for KRaft mode. In ZK mode, the possible > state > > >>>> is "DISABLING", right? > > >>>> > > >>>> 2. For this: > > >>>> "If the cluster is using Zookeeper as the control plane, enabling > > >>>> remote storage for a topic triggers the controller to send this > > information > > >>>> to Zookeeper. Each broker listens for changes in Zookeeper, and > when a > > >>>> change is detected, the broker triggers > > >>>> RemoteLogManager#onLeadershipChange()." > > >>>> > > >>>> I think the way ZK brokers knows the leadership change is by getting > > >>>> the LeaderAndISRRequeset from the controller, not listening for > > changes in > > >>>> ZK. > > >>>> > > >>>> 3. In the KRaft handler steps, you said: > > >>>> "The controller also updates the Topic metadata to increment the > > >>>> tiered_epoch and update the tiered_stateto DISABLING state." > > >>>> > > >>>> Should it be "DISABLED" state since it's KRaft mode? > > >>>> > > >>>> 4. I was thinking how we handle the tiered_epoch not match error. > > >>>> For ZK, I think the controller won't write any data into ZK Znode, > > >>>> For KRaft, either configRecord or updateTopicMetadata records won't > be > > >>>> written. > > >>>> Is that right? Because the current workflow makes me think there > will > > >>>> be partial data updated in ZK/KRaft when tiered_epoch error. > > >>>> > > >>>> 5. Since we changed to use stopReplicas (V5) request now, the > diagram > > >>>> for ZK workflow might also need to update. > > >>>> > > >>>> 6. In ZK mode, what will the controller do if the "stopReplicas" > > >>>> responses not received from all brokers? Reverting the changes? > > >>>> This won't happen in KRaft mode because it's broker's responsibility > > to > > >>>> fetch metadata update from controller. > > >>>> > > >>>> > > >>>> Thank you. 
> > >>>> Luke > > >>>> > > >>>> > > >>>> On Fri, Apr 19, 2024 at 10:23 PM Christo Lolov < > > christolo...@gmail.com> > > >>>> wrote: > > >>>> > > >>>>> Heya all! > > >>>>> > > >>>>> I have updated KIP-950. A list of what I have updated is: > > >>>>> > > >>>>> * Explicitly state that Zookeeper-backed clusters will have ENABLED > > -> > > >>>>> DISABLING -> DISABLED while KRaft-backed clusters will only have > > ENABLED -> > > >>>>> DISABLED > > >>>>> * Added two configurations for the new thread pools and explained > > >>>>> where values will be picked-up mid Kafka version upgrade > > >>>>> * Explained how leftover remote partitions will be scheduled for > > >>>>> deletion > > >>>>> * Updated the API to use StopReplica V5 rather than a whole new > > >>>>> controller-to-broker API > > >>>>> * Explained that the disablement procedure will be triggered by the > > >>>>> controller listening for an (Incremental)AlterConfig change > > >>>>> * Explained that we will first move log start offset and then > issue a > > >>>>> deletion > > >>>>> * Went into more details that changing remote.log.disable.policy > > after > > >>>>> disablement won't do anything and that if a customer would like > > additional > > >>>>> data deleted they would have to use already existing methods > > >>>>> > > >>>>> Let me know if there are any new comments or I have missed > something! > > >>>>> > > >>>>> Best, > > >>>>> Christo > > >>>>> > > >>>>> On Mon, 15 Apr 2024 at 12:40, Christo Lolov < > christolo...@gmail.com> > > >>>>> wrote: > > >>>>> > > >>>>>> Heya Doguscan, > > >>>>>> > > >>>>>> I believe that the state of the world after this KIP will be the > > >>>>>> following: > > >>>>>> > > >>>>>> For Zookeeper-backed clusters there will be 3 states: ENABLED, > > >>>>>> DISABLING and DISABLED. We want this because Zookeeper-backed > > clusters will > > >>>>>> await a confirmation from the brokers that they have indeed > stopped > > >>>>>> tiered-related operations on the topic. 
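The two state machines described in this thread (three states for Zookeeper-backed clusters, two for KRaft-backed ones) boil down to small transition tables; a sketch, not actual Kafka code:

```python
# Illustrative transition tables for the tiered storage disablement
# states discussed in this thread. ZK-backed clusters pass through
# DISABLING while awaiting broker confirmations; KRaft-backed clusters
# go straight to DISABLED in fire-and-forget fashion.

ZK_TRANSITIONS = {
    "ENABLED": {"DISABLING"},
    "DISABLING": {"DISABLED"},   # all brokers confirmed StopReplicas
    "DISABLED": {"ENABLED"},     # re-enablement
}

KRAFT_TRANSITIONS = {
    "ENABLED": {"DISABLED"},
    "DISABLED": {"ENABLED"},
}

def can_transition(table, src, dst):
    return dst in table.get(src, set())
```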
> > >>>>>> > > >>>>>> For KRaft-backed clusters there will be only 2 states: ENABLED and > > >>>>>> DISABLED. KRaft takes a fire-and-forget approach for topic > > deletion. I > > >>>>>> believe the same approach ought to be taken for tiered topics. The > > >>>>>> mechanism which will ensure that leftover state in remote due to > > failures > > >>>>>> is cleaned up to me is the retention mechanism. In today's code, a > > leader > > >>>>>> deletes all segments it finds in remote with offsets below the log > > start > > >>>>>> offset. I believe this will be good enough for cleaning up > leftover > > state > > >>>>>> in remote due to failures. > > >>>>>> > > >>>>>> I know that quite a few changes have been discussed so I will aim > to > > >>>>>> put them on paper in the upcoming days and let everyone know! > > >>>>>> > > >>>>>> Best, > > >>>>>> Christo > > >>>>>> > > >>>>>> On Tue, 9 Apr 2024 at 14:49, Doğuşcan Namal < > > namal.dogus...@gmail.com> > > >>>>>> wrote: > > >>>>>> > > >>>>>>> +1 let's not introduce a new api and mark it immediately as > > >>>>>>> deprecated :) > > >>>>>>> > > >>>>>>> On your second comment Luke, one thing we need to clarify is when > > do > > >>>>>>> we consider remote storage to be DISABLED for a topic? > > >>>>>>> Particularly, what is the state when the remote storage is being > > >>>>>>> deleted in case of disablement.policy=delete? Is it DISABLING or > > DISABLED? > > >>>>>>> > > >>>>>>> If we move directly to the DISABLED state, > > >>>>>>> > > >>>>>>> a) in case of failures, the leaders should continue remote > storage > > >>>>>>> deletion even if the topic is moved to the DISABLED state, > > otherwise we > > >>>>>>> risk having stray data on remote storage. > > >>>>>>> b) on each restart, we should initiate the remote storage > deletion > > >>>>>>> because although we replayed a record with a DISABLED state, we > > can not be > > >>>>>>> sure if the remote data is deleted or not. 
> > >>>>>>> > > >>>>>>> We could either consider keeping the remote topic in DISABLING > > state > > >>>>>>> until all of the remote storage data is deleted, or we need an > > additional > > >>>>>>> mechanism to handle the remote stray data. > > >>>>>>> > > >>>>>>> The existing topic deletion, for instance, handles stray logs on > > >>>>>>> disk by detecting them on KafkaBroker startup and deleting before > > the > > >>>>>>> ReplicaManager is started. > > >>>>>>> Maybe we need a similar mechanism here as well if we don't want a > > >>>>>>> DISABLING state. Otherwise, we need a callback from Brokers to > > validate > > >>>>>>> that remote storage data is deleted and now we could move to the > > DISABLED > > >>>>>>> state. > > >>>>>>> > > >>>>>>> Thanks. > > >>>>>>> > > >>>>>>> On Tue, 9 Apr 2024 at 12:45, Luke Chen <show...@gmail.com> > wrote: > > >>>>>>> > > >>>>>>>> Hi Christo, > > >>>>>>>> > > >>>>>>>> > I would then opt for moving information from > DisableRemoteTopic > > >>>>>>>> within the StopReplicas API which will then disappear in KRaft > > >>>>>>>> world as it > > >>>>>>>> is already scheduled for deprecation. What do you think? > > >>>>>>>> > > >>>>>>>> Sounds good to me. > > >>>>>>>> > > >>>>>>>> Thanks. > > >>>>>>>> Luke > > >>>>>>>> > > >>>>>>>> On Tue, Apr 9, 2024 at 6:46 PM Christo Lolov < > > >>>>>>>> christolo...@gmail.com> wrote: > > >>>>>>>> > > >>>>>>>> > Heya Luke! > > >>>>>>>> > > > >>>>>>>> > I thought a bit more about it and I reached the same > conclusion > > >>>>>>>> as you for > > >>>>>>>> > 2 as a follow-up from 1. In other words, in KRaft world I > don't > > >>>>>>>> think the > > >>>>>>>> > controller needs to wait for acknowledgements for the brokers. > > >>>>>>>> All we care > > >>>>>>>> > about is that the leader (who is responsible for > > >>>>>>>> archiving/deleting data in > > >>>>>>>> > tiered storage) knows about the change and applies it > properly. 
> > >>>>>>>> If there is > > >>>>>>>> > a leadership change halfway through the operation then the new > > >>>>>>>> leader still > > >>>>>>>> > needs to apply the message from the state topic and we know > > that a > > >>>>>>>> > disable-message will be applied before a > reenablement-message. I > > >>>>>>>> will > > >>>>>>>> > change the KIP later today/tomorrow morning to reflect this > > >>>>>>>> reasoning. > > >>>>>>>> > > > >>>>>>>> > However, with this I believe that introducing a new API just > for > > >>>>>>>> > Zookeeper-based clusters (i.e. DisableRemoteTopic) becomes a > bit > > >>>>>>>> of an > > >>>>>>>> > overkill. I would then opt for moving information from > > >>>>>>>> DisableRemoteTopic > > >>>>>>>> > within the StopReplicas API which will then disappear in KRaft > > >>>>>>>> world as it > > >>>>>>>> > is already scheduled for deprecation. What do you think? > > >>>>>>>> > > > >>>>>>>> > Best, > > >>>>>>>> > Christo > > >>>>>>>> > > > >>>>>>>> > On Wed, 3 Apr 2024 at 07:59, Luke Chen <show...@gmail.com> > > wrote: > > >>>>>>>> > > > >>>>>>>> > > Hi Christo, > > >>>>>>>> > > > > >>>>>>>> > > 1. I agree with Doguscan that in KRaft mode, the controller > > >>>>>>>> won't send > > >>>>>>>> > RPCs > > >>>>>>>> > > to the brokers (except in the migration path). > > >>>>>>>> > > So, I think we could adopt the similar way we did to > > >>>>>>>> > `AlterReplicaLogDirs` > > >>>>>>>> > > ( > > >>>>>>>> > > KIP-858 > > >>>>>>>> > > < > > >>>>>>>> > > > > >>>>>>>> > > > >>>>>>>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-858%3A+Handle+JBOD+broker+disk+failure+in+KRaft#KIP858:HandleJBODbrokerdiskfailureinKRaft-Intra-brokerreplicamovement > > >>>>>>>> > > >) > > >>>>>>>> > > that let the broker notify controller any update, instead of > > >>>>>>>> controller > > >>>>>>>> > to > > >>>>>>>> > > broker. And once the controller receives all the complete > > >>>>>>>> requests from > > >>>>>>>> > > brokers, it'll enter "Disabled" state. WDYT? 
> > >>>>>>>> > > > > >>>>>>>> > > 2. Why should we wait until all brokers to respond before > > >>>>>>>> moving to > > >>>>>>>> > > "Disabled" state in "KRaft mode"? > > >>>>>>>> > > Currently, only the leader node does the remote log > > >>>>>>>> upload/fetch tasks, > > >>>>>>>> > so > > >>>>>>>> > > does that mean the controller only need to make sure the > > leader > > >>>>>>>> completes > > >>>>>>>> > > the stopPartition? > > >>>>>>>> > > If during the leader node stopPartition process triggered > > >>>>>>>> leadership > > >>>>>>>> > > change, then the new leader should receive and apply the > > >>>>>>>> configRecord > > >>>>>>>> > > update before the leadership change record based on the > KRaft > > >>>>>>>> design, > > >>>>>>>> > which > > >>>>>>>> > > means there will be no gap that the follower node becomes > the > > >>>>>>>> leader and > > >>>>>>>> > > starting doing unexpected upload/fetch tasks, right? > > >>>>>>>> > > I agree we should make sure in ZK mode, all brokers are > > >>>>>>>> completed the > > >>>>>>>> > > stopPartitions before moving to "Disabled" state because ZK > > >>>>>>>> node watcher > > >>>>>>>> > is > > >>>>>>>> > > working in a separate thread. But not sure about KRaft mode. > > >>>>>>>> > > > > >>>>>>>> > > Thanks. > > >>>>>>>> > > Luke > > >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> > > On Fri, Mar 29, 2024 at 4:14 PM Christo Lolov < > > >>>>>>>> christolo...@gmail.com> > > >>>>>>>> > > wrote: > > >>>>>>>> > > > > >>>>>>>> > > > Heya everyone! > > >>>>>>>> > > > > > >>>>>>>> > > > re: Doguscan > > >>>>>>>> > > > > > >>>>>>>> > > > I believe the answer to 101 needs a bit more discussion. > As > > >>>>>>>> far as I > > >>>>>>>> > > know, > > >>>>>>>> > > > tiered storage today has methods to update a metadata of a > > >>>>>>>> segment to > > >>>>>>>> > say > > >>>>>>>> > > > "hey, I would like this deleted", but actual deletion is > > left > > >>>>>>>> to plugin > > >>>>>>>> > > > implementations (or any background cleaners). 
In other > > words, > > >>>>>>>> there is > > >>>>>>>> > no > > >>>>>>>> > > > "immediate" deletion. In this KIP, we would like to > continue > > >>>>>>>> doing the > > >>>>>>>> > > same > > >>>>>>>> > > > if the retention policy is set to delete. So I believe the > > >>>>>>>> answer is > > >>>>>>>> > > > actually that a) we will update the metadata of the > segments > > >>>>>>>> to mark > > >>>>>>>> > them > > >>>>>>>> > > > as deleted and b) we will advance the log start offset. > Any > > >>>>>>>> deletion of > > >>>>>>>> > > > actual files will still be delegated to plugin > > >>>>>>>> implementations. I > > >>>>>>>> > believe > > >>>>>>>> > > > this is further supported by > > >>>>>>>> "*remote.log.disable.policy=delete:* Logs > > >>>>>>>> > > that > > >>>>>>>> > > > are archived in the remote storage will not be part of the > > >>>>>>>> contiguous > > >>>>>>>> > > > "active" log and will be deleted asynchronously as part of > > the > > >>>>>>>> > > disablement > > >>>>>>>> > > > process" > > >>>>>>>> > > > > > >>>>>>>> > > > Following from the above, I believe for 102 it is fine to > > >>>>>>>> allow setting > > >>>>>>>> > > of > > >>>>>>>> > > > remote.log.disable.policy on a disabled topic in much the > > >>>>>>>> same way we > > >>>>>>>> > > allow > > >>>>>>>> > > > other remote-related configurations to be set on a topic > > (i.e. > > >>>>>>>> > > > local.retention.*) - it just won't have an effect. > Granted, > > I > > >>>>>>>> do > > >>>>>>>> > believe > > >>>>>>>> > > we > > >>>>>>>> > > > should restrict the policy being changed while a > disablement > > >>>>>>>> is > > >>>>>>>> > ongoing. > > >>>>>>>> > > > > > >>>>>>>> > > > re: Satish and Kamal > > >>>>>>>> > > > > > >>>>>>>> > > > 104, 1 and 2 are fair asks, I will work with Doguscan to > > >>>>>>>> update the KIP > > >>>>>>>> > > > with the information! 
> > >>>>>>>> > > > > > >>>>>>>> > > > Best, > > >>>>>>>> > > > Christo > > >>>>>>>> > > > > > >>>>>>>> > > > On Thu, 28 Mar 2024 at 10:31, Doğuşcan Namal < > > >>>>>>>> namal.dogus...@gmail.com > > >>>>>>>> > > > > >>>>>>>> > > > wrote: > > >>>>>>>> > > > > > >>>>>>>> > > > > Hi Satish, I will try to answer as much as I can and the > > >>>>>>>> others could > > >>>>>>>> > > > chime > > >>>>>>>> > > > > in with further details. > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > *101. For remote.log.disable.policy=delete: Does it > delete > > >>>>>>>> the remote > > >>>>>>>> > > log > > >>>>>>>> > > > > data immediately and the data in remote storage will not > > be > > >>>>>>>> taken > > >>>>>>>> > into > > >>>>>>>> > > > > account by any replica? That means log-start-offset is > > >>>>>>>> moved to the > > >>>>>>>> > > > earlier > > >>>>>>>> > > > > local-log-start-offset.* > > >>>>>>>> > > > > *Exactly. RemoteLogData will be deleted immediately. * > > >>>>>>>> > > > > *So before the deletion starts we move LogStart offset > to > > >>>>>>>> > > > > LocalLogStartOffset to ensure that no RemoteLog will be > > >>>>>>>> accessed > > >>>>>>>> > after > > >>>>>>>> > > > that > > >>>>>>>> > > > > point.* > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > * 102. Can we update the remote.log.disable.policy after > > >>>>>>>> tiered > > >>>>>>>> > storage > > >>>>>>>> > > > is > > >>>>>>>> > > > > disabled on a topic?* > > >>>>>>>> > > > > > > >>>>>>>> > > > > *This is a good point. 
I think we should not allow > > >>>>>>>> modifying this > > >>>>>>>> > > > > configuration* > > >>>>>>>> > > > > *because changing the policy from Deletion to Retain > when > > >>>>>>>> there is an > > >>>>>>>> > > > > ongoing Deletion will result in an undefined behaviour > and > > >>>>>>>> where we > > >>>>>>>> > > > retain > > >>>>>>>> > > > > half of the remote log and delete the other half.* > > >>>>>>>> > > > > > > >>>>>>>> > > > > * 103. Do we plan to add any metrics related to this > > >>>>>>>> feature?* > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > *Any recommendations?* > > >>>>>>>> > > > > *We may emit a gauge showing the enablement state of a > > >>>>>>>> topic but we > > >>>>>>>> > > could > > >>>>>>>> > > > > gather that info from the logs as well.* > > >>>>>>>> > > > > *The total duration for remote topic deletion could be > > >>>>>>>> added as well > > >>>>>>>> > > but > > >>>>>>>> > > > > this is more of a metric for the RemotePartitionRemover > > >>>>>>>> itself.* > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > *104. Please add configuration details about copier > thread > > >>>>>>>> pool, > > >>>>>>>> > > > expiration > > >>>>>>>> > > > > thread pool and the migration of the existing > > >>>>>>>> > > > > RemoteLogManagerScheduledThreadPool.* > > >>>>>>>> > > > > > > >>>>>>>> > > > > *Will add the details.* > > >>>>>>>> > > > > > > >>>>>>>> > > > > 105. How is the behaviour with topic or partition > deletion > > >>>>>>>> request > > >>>>>>>> > > > > handled when tiered storage disablement request is still > > >>>>>>>> being > > >>>>>>>> > > > > processed on a topic? 
> > >>>>>>>> > > > > > > >>>>>>>> > > > > *If the disablement policy is Delete then a successive > > >>>>>>>> topic deletion > > >>>>>>>> > > > > request is going to be a NOOP because RemoteLogs is > > already > > >>>>>>>> being > > >>>>>>>> > > > deleted.* > > >>>>>>>> > > > > *If the disablement policy is Retain, then we only moved > > the > > >>>>>>>> > > > LogStartOffset > > >>>>>>>> > > > > and didn't touch RemoteLogs anyway, so the delete topic > > >>>>>>>> request will > > >>>>>>>> > > > > result* > > >>>>>>>> > > > > > > >>>>>>>> > > > > *in the initiation of RemoteLog deletion.* > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > On Tue, 26 Mar 2024 at 18:21, Kamal Chandraprakash < > > >>>>>>>> > > > > kamal.chandraprak...@gmail.com> wrote: > > >>>>>>>> > > > > > > >>>>>>>> > > > > > Hi, > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > Thanks for the KIP! Overall the KIP looks good and > > >>>>>>>> covered most of > > >>>>>>>> > > the > > >>>>>>>> > > > > > items. > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > 1. Could you explain how the brokers will handle the > > >>>>>>>> > > DisableRemoteTopic > > >>>>>>>> > > > > API > > >>>>>>>> > > > > > request? > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > 2. Who will initiate the controller interaction > > sequence? > > >>>>>>>> Does the > > >>>>>>>> > > > > > controller listens for > > >>>>>>>> > > > > > topic config updates and initiate the disablement? > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > -- > > >>>>>>>> > > > > > Kamal > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > On Tue, Mar 26, 2024 at 4:40 PM Satish Duggana < > > >>>>>>>> > > > satish.dugg...@gmail.com > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > wrote: > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > > Thanks Mehari, Divij, Christo etal for the KIP. > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > I had an initial review of the KIP and left the > below > > >>>>>>>> comments. > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > 101. 
For remote.log.disable.policy=delete: > > >>>>>>>> > > > > > > Does it delete the remote log data immediately and > the > > >>>>>>>> data in > > >>>>>>>> > > remote > > >>>>>>>> > > > > > > storage will not be taken into account by any > replica? > > >>>>>>>> That means > > >>>>>>>> > > > > > > log-start-offset is moved to the earlier > > >>>>>>>> local-log-start-offset. > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > 102. Can we update the remote.log.disable.policy > after > > >>>>>>>> tiered > > >>>>>>>> > > storage > > >>>>>>>> > > > > > > is disabled on a topic? > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > 103. Do we plan to add any metrics related to this > > >>>>>>>> feature? > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > 104. Please add configuration details about copier > > >>>>>>>> thread pool, > > >>>>>>>> > > > > > > expiration thread pool and the migration of the > > existing > > >>>>>>>> > > > > > > RemoteLogManagerScheduledThreadPool. > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > 105. How is the behaviour with topic or partition > > >>>>>>>> deletion > > >>>>>>>> > request > > >>>>>>>> > > > > > > handled when tiered storage disablement request is > > >>>>>>>> still being > > >>>>>>>> > > > > > > processed on a topic? > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > ~Satish. > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > On Mon, 25 Mar 2024 at 13:34, Doğuşcan Namal < > > >>>>>>>> > > > namal.dogus...@gmail.com > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > > wrote: > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > > Hi Christo and Luke, > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > > I think the KRaft section of the KIP requires > slight > > >>>>>>>> > improvement. > > >>>>>>>> > > > The > > >>>>>>>> > > > > > > metadata propagation in KRaft is handled by the RAFT > > >>>>>>>> layer > > >>>>>>>> > instead > > >>>>>>>> > > of > > >>>>>>>> > > > > > > sending Controller -> Broker RPCs. 
In fact, KIP-631 > > >>>>>>>> deprecated > > >>>>>>>> > > these > > >>>>>>>> > > > > > RPCs. > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > > I will come up with some recommendations on how we > > >>>>>>>> could > > >>>>>>>> > improve > > >>>>>>>> > > > that > > >>>>>>>> > > > > > > one but until then, @Luke please feel free to review > > >>>>>>>> the KIP. > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > > @Satish, if we want this to make it to Kafka 3.8 I > > >>>>>>>> believe we > > >>>>>>>> > > need > > >>>>>>>> > > > to > > >>>>>>>> > > > > > > aim to get the KIP approved in the following weeks > > >>>>>>>> otherwise it > > >>>>>>>> > > will > > >>>>>>>> > > > > slip > > >>>>>>>> > > > > > > and we can not support it in Zookeeper mode. > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > > I also would like to better understand what is the > > >>>>>>>> community's > > >>>>>>>> > > > stand > > >>>>>>>> > > > > > for > > >>>>>>>> > > > > > > adding a new feature for Zookeeper since it is > marked > > as > > >>>>>>>> > deprecated > > >>>>>>>> > > > > > already. > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > > Thanks. > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > > On Mon, 18 Mar 2024 at 13:42, Christo Lolov < > > >>>>>>>> > > > christolo...@gmail.com> > > >>>>>>>> > > > > > > wrote: > > >>>>>>>> > > > > > > >> > > >>>>>>>> > > > > > > >> Heya, > > >>>>>>>> > > > > > > >> > > >>>>>>>> > > > > > > >> I do have some time to put into this, but to be > > >>>>>>>> honest I am > > >>>>>>>> > > still > > >>>>>>>> > > > > > after > > >>>>>>>> > > > > > > >> reviews of the KIP itself :) > > >>>>>>>> > > > > > > >> > > >>>>>>>> > > > > > > >> After the latest changes it ought to be detailing > > >>>>>>>> both a > > >>>>>>>> > > Zookeeper > > >>>>>>>> > > > > > > approach > > >>>>>>>> > > > > > > >> and a KRaft approach. 
> > >>>>>>>> > > > > > > >> > > >>>>>>>> > > > > > > >> Do you have any thoughts on how it could be > > improved > > >>>>>>>> or > > >>>>>>>> > should I > > >>>>>>>> > > > > > start a > > >>>>>>>> > > > > > > >> voting thread? > > >>>>>>>> > > > > > > >> > > >>>>>>>> > > > > > > >> Best, > > >>>>>>>> > > > > > > >> Christo > > >>>>>>>> > > > > > > >> > > >>>>>>>> > > > > > > >> On Thu, 14 Mar 2024 at 06:12, Luke Chen < > > >>>>>>>> show...@gmail.com> > > >>>>>>>> > > > wrote: > > >>>>>>>> > > > > > > >> > > >>>>>>>> > > > > > > >> > Hi Christo, > > >>>>>>>> > > > > > > >> > > > >>>>>>>> > > > > > > >> > Any update with this KIP? > > >>>>>>>> > > > > > > >> > If you don't have time to complete it, I can > > >>>>>>>> collaborate > > >>>>>>>> > with > > >>>>>>>> > > > you > > >>>>>>>> > > > > to > > >>>>>>>> > > > > > > work > > >>>>>>>> > > > > > > >> > on it. > > >>>>>>>> > > > > > > >> > > > >>>>>>>> > > > > > > >> > Thanks. > > >>>>>>>> > > > > > > >> > Luke > > >>>>>>>> > > > > > > >> > > > >>>>>>>> > > > > > > >> > On Wed, Jan 17, 2024 at 11:38 PM Satish > Duggana < > > >>>>>>>> > > > > > > satish.dugg...@gmail.com> > > >>>>>>>> > > > > > > >> > wrote: > > >>>>>>>> > > > > > > >> > > > >>>>>>>> > > > > > > >> > > Hi Christo, > > >>>>>>>> > > > > > > >> > > Thanks for volunteering to contribute to the > > KIP > > >>>>>>>> > > discussion. I > > >>>>>>>> > > > > > > suggest > > >>>>>>>> > > > > > > >> > > considering this KIP for both ZK and KRaft as > > it > > >>>>>>>> will be > > >>>>>>>> > > > helpful > > >>>>>>>> > > > > > for > > >>>>>>>> > > > > > > >> > > this feature to be available in 3.8.0 running > > >>>>>>>> with ZK > > >>>>>>>> > > > clusters. > > >>>>>>>> > > > > > > >> > > > > >>>>>>>> > > > > > > >> > > Thanks, > > >>>>>>>> > > > > > > >> > > Satish. 
> > >>>>>>>> > > > > > > >> > > > > >>>>>>>> > > > > > > >> > > On Wed, 17 Jan 2024 at 19:04, Christo Lolov < > > >>>>>>>> > > > > > christolo...@gmail.com > > >>>>>>>> > > > > > > > > > >>>>>>>> > > > > > > >> > > wrote: > > >>>>>>>> > > > > > > >> > > > > > >>>>>>>> > > > > > > >> > > > Hello! > > >>>>>>>> > > > > > > >> > > > > > >>>>>>>> > > > > > > >> > > > I volunteer to get this KIP moving forward > > and > > >>>>>>>> > implemented > > >>>>>>>> > > > in > > >>>>>>>> > > > > > > Apache > > >>>>>>>> > > > > > > >> > > Kafka > > >>>>>>>> > > > > > > >> > > > 3.8. > > >>>>>>>> > > > > > > >> > > > > > >>>>>>>> > > > > > > >> > > > I have caught up with Mehari offline and we > > >>>>>>>> have agreed > > >>>>>>>> > > that > > >>>>>>>> > > > > > given > > >>>>>>>> > > > > > > >> > Apache > > >>>>>>>> > > > > > > >> > > > Kafka 4.0 being around the corner we would > > >>>>>>>> like to > > >>>>>>>> > propose > > >>>>>>>> > > > > this > > >>>>>>>> > > > > > > feature > > >>>>>>>> > > > > > > >> > > > only for KRaft clusters. > > >>>>>>>> > > > > > > >> > > > > > >>>>>>>> > > > > > > >> > > > Any and all reviews and comments are > welcome! > > >>>>>>>> > > > > > > >> > > > > > >>>>>>>> > > > > > > >> > > > Best, > > >>>>>>>> > > > > > > >> > > > Christo > > >>>>>>>> > > > > > > >> > > > > > >>>>>>>> > > > > > > >> > > > On Tue, 9 Jan 2024 at 09:44, Doğuşcan > Namal < > > >>>>>>>> > > > > > > namal.dogus...@gmail.com> > > >>>>>>>> > > > > > > >> > > > wrote: > > >>>>>>>> > > > > > > >> > > > > > >>>>>>>> > > > > > > >> > > > > Hi everyone, any progress on the status > of > > >>>>>>>> this KIP? 
> > >>>>>>>> > > > Overall > > >>>>>>>> > > > > > > looks > > >>>>>>>> > > > > > > >> > > good to > > >>>>>>>> > > > > > > >> > > > > me but I wonder whether we still need to > > >>>>>>>> support it > > >>>>>>>> > for > > >>>>>>>> > > > > > > Zookeeper > > >>>>>>>> > > > > > > >> > mode > > >>>>>>>> > > > > > > >> > > > > given that it will be deprecated in the > > next > > >>>>>>>> 3 months. > > >>>>>>>> > > > > > > >> > > > > > > >>>>>>>> > > > > > > >> > > > > On 2023/07/21 20:16:46 "Beyene, Mehari" > > >>>>>>>> wrote: > > >>>>>>>> > > > > > > >> > > > > > Hi everyone, > > >>>>>>>> > > > > > > >> > > > > > I would like to start a discussion on > > >>>>>>>> KIP-950: > > >>>>>>>> > Tiered > > >>>>>>>> > > > > > Storage > > >>>>>>>> > > > > > > >> > > Disablement > > >>>>>>>> > > > > > > >> > > > > ( > > >>>>>>>> > > > > > > >> > > > > > > >>>>>>>> > > > > > > >> > > > > > > >>>>>>>> > > > > > > >> > > > > >>>>>>>> > > > > > > >> > > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > >>>>>>>> > > > > >>>>>>>> > > > >>>>>>>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-950%3A++Tiered+Storage+Disablement > > >>>>>>>> > > > > > > >> > > > > ). > > >>>>>>>> > > > > > > >> > > > > > > > >>>>>>>> > > > > > > >> > > > > > This KIP proposes adding the ability to > > >>>>>>>> disable and > > >>>>>>>> > > > > > re-enable > > >>>>>>>> > > > > > > >> > tiered > > >>>>>>>> > > > > > > >> > > > > storage on a topic. > > >>>>>>>> > > > > > > >> > > > > > > > >>>>>>>> > > > > > > >> > > > > > Thanks, > > >>>>>>>> > > > > > > >> > > > > > Mehari > > >>>>>>>> > > > > > > >> > > > > > > > >>>>>>>> > > > > > > >> > > > > > > >>>>>>>> > > > > > > >> > > > > >>>>>>>> > > > > > > >> > > > >>>>>>>> > > > > > > > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > >>>>>>>> > > > > >>>>>>>> > > > >>>>>>>> > > >>>>>>> > > >