Hi, Abhijeet,

Thanks for the reply.

1. I am wondering if we could achieve the same result by just lowering
local.retention.ms and local.retention.bytes. This also allows the newly
started follower to build up the local data before serving the consumer
traffic.

2. Have you updated the KIP?

Thanks,

Jun

On Tue, Apr 9, 2024 at 3:36 AM Satish Duggana <satish.dugg...@gmail.com>
wrote:

> +1 to Jun for adding the consumer fetching from a follower scenario
> also to the existing section that talked about the drawback when a
> node built with last-tiered-offset has become a leader. As Abhijeet
> mentioned, we plan to have a follow-up KIP that will address these by
> having a deprioritzation of these brokers. The deprioritization of
> those brokers can be removed once they catchup until the local log
> retention.
>
> Thanks,
> Satish.
>
> On Tue, 9 Apr 2024 at 14:08, Luke Chen <show...@gmail.com> wrote:
> >
> > Hi Abhijeet,
> >
> > Thanks for the KIP to improve the tiered storage feature!
> >
> > Questions:
> > 1. We could also get the "pending-upload-offset" and epoch via remote log
> > metadata, instead of adding a new API to fetch from the leader. Could you
> > explain why you choose the later approach, instead of the former?
> > 2.
> > > We plan to have a follow-up KIP that will address both the
> > deprioritization
> > of these brokers from leadership, as well as
> > for consumption (when fetching from followers is allowed).
> >
> > I agree with Jun that we might need to make it clear all possible
> drawbacks
> > that could have. So, could we add the drawbacks that Jun mentioned about
> > the performance issue when consumer fetch from follower?
> >
> > 3. Could we add "Rejected Alternatives" section to the end of the KIP to
> > add some of them?
> > Like the "ListOffsetRequest" approach VS "Earliest-Pending-Upload-Offset"
> > approach, or getting the "Earliest-Pending-Upload-Offset" from remote log
> > metadata... etc.
> >
> > Thanks.
> > Luke
> >
> >
> > On Tue, Apr 9, 2024 at 2:25 PM Abhijeet Kumar <
> abhijeet.cse....@gmail.com>
> > wrote:
> >
> > > Hi Christo,
> > >
> > > Please find my comments inline.
> > >
> > > On Fri, Apr 5, 2024 at 12:36 PM Christo Lolov <christolo...@gmail.com>
> > > wrote:
> > >
> > > > Hello Abhijeet and Jun,
> > > >
> > > > I have been mulling this KIP over a bit more in recent days!
> > > >
> > > > re: Jun
> > > >
> > > > I wasn't aware we apply 2.1 and 2.2 for reserving new timestamps - in
> > > > retrospect it should have been fairly obvious. I would need to go an
> > > update
> > > > KIP-1005 myself then, thank you for giving the useful reference!
> > > >
> > > > 4. I think Abhijeet wants to rebuild state from latest-tiered-offset
> and
> > > > fetch from latest-tiered-offset + 1 only for new replicas (or
> replicas
> > > > which experienced a disk failure) to decrease the time a partition
> spends
> > > > in under-replicated state. In other words, a follower which has just
> > > fallen
> > > > out of ISR, but has local data will continue using today's Tiered
> Storage
> > > > replication protocol (i.e. fetching from earliest-local). I further
> > > believe
> > > > he has taken this approach so that local state of replicas which have
> > > just
> > > > fallen out of ISR isn't forcefully wiped thus leading to situation 1.
> > > > Abhijeet, have I understood (and summarised) what you are proposing
> > > > correctly?
> > > >
> > > > Yes, your understanding is correct. We want to limit the behavior
> changes
> > > only to new replicas.
> > >
> > >
> > > > 5. I think in today's Tiered Storage we know the leader's
> > > log-start-offset
> > > > from the FetchResponse and we can learn its local-log-start-offset
> from
> > > the
> > > > ListOffsets by asking for earliest-local timestamp (-4). But granted,
> > > this
> > > > ought to be added as an additional API call in the KIP.
> > > >
> > > >
> > > Yes, I clarified this in my reply to Jun. I will add this missing
> detail in
> > > the KIP.
> > >
> > >
> > > > re: Abhijeet
> > > >
> > > > 101. I am still a bit confused as to why you want to include a new
> offset
> > > > (i.e. pending-upload-offset) when you yourself mention that you
> could use
> > > > an already existing offset (i.e. last-tiered-offset + 1). In
> essence, you
> > > > end your Motivation with "In this KIP, we will focus only on the
> follower
> > > > fetch protocol using the *last-tiered-offset*" and then in the
> following
> > > > sections you talk about pending-upload-offset. I understand this
> might be
> > > > classified as an implementation detail, but if you introduce a new
> offset
> > > > (i.e. pending-upload-offset) you have to make a change to the
> ListOffsets
> > > > API (i.e. introduce -6) and thus document it in this KIP as such.
> > > However,
> > > > the last-tiered-offset ought to already be exposed as part of
> KIP-1005
> > > > (under implementation). Am I misunderstanding something here?
> > > >
> > >
> > > I have tried to clarify this in my reply to Jun.
> > >
> > > > The follower needs to build the local data starting from the offset
> > > > Earliest-Pending-Upload-Offset. Hence it needs the offset and the
> > > > corresponding leader-epoch.
> > > > There are two ways to do this:
> > > >    1. We add support in ListOffsetRequest to be able to fetch this
> offset
> > > > (and leader epoch) from the leader.
> > > >    2. Or, fetch the tiered-offset (which is already supported). From
> this
> > > > offset, we can get the Earliest-Pending-Upload-Offset. We can just
> add 1
> > > to
> > > > the tiered-offset.
> > > >       However, we still need the leader epoch for offset, since
> there is
> > > > no guarantee that the leader epoch for Earliest-Pending-Upload-Offset
> > > will
> > > > be the same as the
> > > >       leader epoch for tiered-offset. We may need another API call
> to the
> > > > leader for this.
> > > > I prefer the first approach. The only problem with the first
> approach is
> > > > that it introduces one more offset. The second approach avoids this
> > > problem
> > > > but is a little complicated.
> > >
> > >
> > >
> > > >
> > > > Best,
> > > > Christo
> > > >
> > > > On Thu, 4 Apr 2024 at 19:37, Jun Rao <j...@confluent.io.invalid>
> wrote:
> > > >
> > > > > Hi, Abhijeet,
> > > > >
> > > > > Thanks for the KIP. Left a few comments.
> > > > >
> > > > > 1. "A drawback of using the last-tiered-offset is that this new
> > > follower
> > > > > would possess only a limited number of locally stored segments.
> Should
> > > it
> > > > > ascend to the role of leader, there is a risk of needing to fetch
> these
> > > > > segments from the remote storage, potentially impacting broker
> > > > > performance."
> > > > > Since we support consumers fetching from followers, this is a
> potential
> > > > > issue on the follower side too. In theory, it's possible for a
> segment
> > > to
> > > > > be tiered immediately after rolling. In that case, there could be
> very
> > > > > little data after last-tiered-offset. It would be useful to think
> > > through
> > > > > how to address this issue.
> > > > >
> > > > > 2. ListOffsetsRequest:
> > > > > 2.1 Typically, we need to bump up the version of the request if we
> add
> > > a
> > > > > new value for timestamp. See
> > > > >
> > > > >
> > > >
> > >
> https://github.com/apache/kafka/pull/10760/files#diff-fac7080d67da905a80126d58fc1745c9a1409de7ef7d093c2ac66a888b134633
> > > > > .
> > > > > 2.2 Since this changes the inter broker request protocol, it would
> be
> > > > > useful to have a section on upgrade (e.g. new
> IBP/metadata.version).
> > > > >
> > > > > 3. "Instead of fetching Earliest-Pending-Upload-Offset, it could
> fetch
> > > > the
> > > > > last-tiered-offset from the leader, and make a separate leader
> call to
> > > > > fetch leader epoch for the following offset."
> > > > > Why do we need to make a separate call for the leader epoch?
> > > > > ListOffsetsResponse include both the offset and the corresponding
> > > epoch.
> > > > >
> > > > > 4. "Check if the follower replica is empty and if the feature to
> use
> > > > > last-tiered-offset is enabled."
> > > > > Why do we need to check if the follower replica is empty?
> > > > >
> > > > > 5. It can be confirmed by checking if the leader's
> Log-Start-Offset is
> > > > the
> > > > > same as the Leader's Local-Log-Start-Offset.
> > > > > How does the follower know Local-Log-Start-Offset?
> > > > >
> > > > > Jun
> > > > >
> > > > > On Sat, Mar 30, 2024 at 5:51 AM Abhijeet Kumar <
> > > > abhijeet.cse....@gmail.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Christo,
> > > > > >
> > > > > > Thanks for reviewing the KIP.
> > > > > >
> > > > > > The follower needs the earliest-pending-upload-offset (and the
> > > > > > corresponding leader epoch) from the leader.
> > > > > > This is the first offset the follower will have locally.
> > > > > >
> > > > > > Regards,
> > > > > > Abhijeet.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Mar 29, 2024 at 1:14 PM Christo Lolov <
> > > christolo...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Heya!
> > > > > > >
> > > > > > > First of all, thank you very much for the proposal, you have
> > > > explained
> > > > > > the
> > > > > > > problem you want solved very well - I think a faster bootstrap
> of
> > > an
> > > > > > empty
> > > > > > > replica is definitely an improvement!
> > > > > > >
> > > > > > > For my understanding, which concrete offset do you want the
> leader
> > > to
> > > > > > give
> > > > > > > back to a follower - earliest-pending-upload-offset or the
> > > > > > > latest-tiered-offset? If it is the second, then I believe
> KIP-1005
> > > > > ought
> > > > > > to
> > > > > > > already be exposing that offset as part of the ListOffsets
> API, no?
> > > > > > >
> > > > > > > Best,
> > > > > > > Christo
> > > > > > >
> > > > > > > On Wed, 27 Mar 2024 at 18:23, Abhijeet Kumar <
> > > > > abhijeet.cse....@gmail.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > I have created KIP-1023 to introduce follower fetch from
> tiered
> > > > > offset.
> > > > > > > > This feature will be helpful in significantly reducing Kafka
> > > > > > > > rebalance/rebuild times when the cluster is enabled with
> tiered
> > > > > > storage.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1023%3A+Follower+fetch+from+tiered+offset
> > > > > > > >
> > > > > > > > Feedback and suggestions are welcome.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Abhijeet.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>

Reply via email to