Hi Christo,

Please find my comments inline.

On Fri, Apr 5, 2024 at 12:36 PM Christo Lolov <christolo...@gmail.com>
wrote:

> Hello Abhijeet and Jun,
>
> I have been mulling this KIP over a bit more in recent days!
>
> re: Jun
>
> I wasn't aware we apply 2.1 and 2.2 for reserving new timestamps - in
> retrospect it should have been fairly obvious. I would need to go an update
> KIP-1005 myself then, thank you for giving the useful reference!
>
> 4. I think Abhijeet wants to rebuild state from latest-tiered-offset and
> fetch from latest-tiered-offset + 1 only for new replicas (or replicas
> which experienced a disk failure) to decrease the time a partition spends
> in under-replicated state. In other words, a follower which has just fallen
> out of ISR, but has local data will continue using today's Tiered Storage
> replication protocol (i.e. fetching from earliest-local). I further believe
> he has taken this approach so that local state of replicas which have just
> fallen out of ISR isn't forcefully wiped thus leading to situation 1.
> Abhijeet, have I understood (and summarised) what you are proposing
> correctly?
>
> Yes, your understanding is correct. We want to limit the behavior changes
only to new replicas.


> 5. I think in today's Tiered Storage we know the leader's log-start-offset
> from the FetchResponse and we can learn its local-log-start-offset from the
> ListOffsets by asking for earliest-local timestamp (-4). But granted, this
> ought to be added as an additional API call in the KIP.
>
>
Yes, I clarified this in my reply to Jun. I will add this missing detail in
the KIP.


> re: Abhijeet
>
> 101. I am still a bit confused as to why you want to include a new offset
> (i.e. pending-upload-offset) when you yourself mention that you could use
> an already existing offset (i.e. last-tiered-offset + 1). In essence, you
> end your Motivation with "In this KIP, we will focus only on the follower
> fetch protocol using the *last-tiered-offset*" and then in the following
> sections you talk about pending-upload-offset. I understand this might be
> classified as an implementation detail, but if you introduce a new offset
> (i.e. pending-upload-offset) you have to make a change to the ListOffsets
> API (i.e. introduce -6) and thus document it in this KIP as such. However,
> the last-tiered-offset ought to already be exposed as part of KIP-1005
> (under implementation). Am I misunderstanding something here?
>

I have tried to clarify this in my reply to Jun.

> The follower needs to build the local data starting from the offset
> Earliest-Pending-Upload-Offset. Hence it needs the offset and the
> corresponding leader-epoch.
> There are two ways to do this:
>    1. We add support in ListOffsetRequest to be able to fetch this offset
> (and leader epoch) from the leader.
>    2. Or, fetch the tiered-offset (which is already supported). From this
> offset, we can get the Earliest-Pending-Upload-Offset. We can just add 1 to
> the tiered-offset.
>       However, we still need the leader epoch for offset, since there is
> no guarantee that the leader epoch for Earliest-Pending-Upload-Offset will
> be the same as the
>       leader epoch for tiered-offset. We may need another API call to the
> leader for this.
> I prefer the first approach. The only problem with the first approach is
> that it introduces one more offset. The second approach avoids this problem
> but is a little complicated.



>
> Best,
> Christo
>
> On Thu, 4 Apr 2024 at 19:37, Jun Rao <j...@confluent.io.invalid> wrote:
>
> > Hi, Abhijeet,
> >
> > Thanks for the KIP. Left a few comments.
> >
> > 1. "A drawback of using the last-tiered-offset is that this new follower
> > would possess only a limited number of locally stored segments. Should it
> > ascend to the role of leader, there is a risk of needing to fetch these
> > segments from the remote storage, potentially impacting broker
> > performance."
> > Since we support consumers fetching from followers, this is a potential
> > issue on the follower side too. In theory, it's possible for a segment to
> > be tiered immediately after rolling. In that case, there could be very
> > little data after last-tiered-offset. It would be useful to think through
> > how to address this issue.
> >
> > 2. ListOffsetsRequest:
> > 2.1 Typically, we need to bump up the version of the request if we add a
> > new value for timestamp. See
> >
> >
> https://github.com/apache/kafka/pull/10760/files#diff-fac7080d67da905a80126d58fc1745c9a1409de7ef7d093c2ac66a888b134633
> > .
> > 2.2 Since this changes the inter broker request protocol, it would be
> > useful to have a section on upgrade (e.g. new IBP/metadata.version).
> >
> > 3. "Instead of fetching Earliest-Pending-Upload-Offset, it could fetch
> the
> > last-tiered-offset from the leader, and make a separate leader call to
> > fetch leader epoch for the following offset."
> > Why do we need to make a separate call for the leader epoch?
> > ListOffsetsResponse include both the offset and the corresponding epoch.
> >
> > 4. "Check if the follower replica is empty and if the feature to use
> > last-tiered-offset is enabled."
> > Why do we need to check if the follower replica is empty?
> >
> > 5. It can be confirmed by checking if the leader's Log-Start-Offset is
> the
> > same as the Leader's Local-Log-Start-Offset.
> > How does the follower know Local-Log-Start-Offset?
> >
> > Jun
> >
> > On Sat, Mar 30, 2024 at 5:51 AM Abhijeet Kumar <
> abhijeet.cse....@gmail.com
> > >
> > wrote:
> >
> > > Hi Christo,
> > >
> > > Thanks for reviewing the KIP.
> > >
> > > The follower needs the earliest-pending-upload-offset (and the
> > > corresponding leader epoch) from the leader.
> > > This is the first offset the follower will have locally.
> > >
> > > Regards,
> > > Abhijeet.
> > >
> > >
> > >
> > > On Fri, Mar 29, 2024 at 1:14 PM Christo Lolov <christolo...@gmail.com>
> > > wrote:
> > >
> > > > Heya!
> > > >
> > > > First of all, thank you very much for the proposal, you have
> explained
> > > the
> > > > problem you want solved very well - I think a faster bootstrap of an
> > > empty
> > > > replica is definitely an improvement!
> > > >
> > > > For my understanding, which concrete offset do you want the leader to
> > > give
> > > > back to a follower - earliest-pending-upload-offset or the
> > > > latest-tiered-offset? If it is the second, then I believe KIP-1005
> > ought
> > > to
> > > > already be exposing that offset as part of the ListOffsets API, no?
> > > >
> > > > Best,
> > > > Christo
> > > >
> > > > On Wed, 27 Mar 2024 at 18:23, Abhijeet Kumar <
> > abhijeet.cse....@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I have created KIP-1023 to introduce follower fetch from tiered
> > offset.
> > > > > This feature will be helpful in significantly reducing Kafka
> > > > > rebalance/rebuild times when the cluster is enabled with tiered
> > > storage.
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1023%3A+Follower+fetch+from+tiered+offset
> > > > >
> > > > > Feedback and suggestions are welcome.
> > > > >
> > > > > Regards,
> > > > > Abhijeet.
> > > > >
> > > >
> > >
> >
>

Reply via email to