Luke,

Thanks for the review!

DelayedFetch and DelayedRemoteFetch are orthogonal calls
<https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ReplicaManager.scala#L1559>
.
Only one of them will be active in a given FETCH request.

The fetch request might take more than 500 ms when the time
taken to read data from remote storage exceeds 500 ms and
`remote.fetch.max.wait.ms` is configured higher than 500 ms.

--
Kamal


On Fri, May 3, 2024 at 1:55 PM Luke Chen <show...@gmail.com> wrote:

> Hi Kamal,
>
> Thanks for the KIP!
> Sorry for the late review.
>
> Overall LGTM! Just 1 question:
>
> If one fetch request contains 2 partitions: [p1, p2]
> fetch.max.wait.ms: 500, remote.fetch.max.wait.ms: 1000
>
> And now, p1 fetch offset is the log end offset and has no new data coming,
> and p2 fetch offset is to fetch from remote storage.
> And suppose the fetch from remote storage takes 1000ms.
> So, question:
> Will this fetch request return in 500ms or 1000ms?
> And what will be returned?
>
> I think before this change, it'll return within 500ms, right?
> But it's not clear what behavior it will be after this KIP.
>
> Thank you.
> Luke
>
>
> On Fri, May 3, 2024 at 1:56 PM Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Christo,
> >
> > We have localTimeMs, remoteTimeMs, and totalTimeMs as part of the
> > FetchConsumer request metric.
> >
> >
> >
> kafka.network:type=RequestMetrics,name={LocalTimeMs|RemoteTimeMs|TotalTimeMs},request={Produce|FetchConsumer|FetchFollower}
> >
> > RemoteTimeMs refers to the amount of time spent in the purgatory for
> normal
> > fetch requests
> > and amount of time spent in reading the remote data for remote-fetch
> > requests. Do we want
> > to have a separate `TieredStorageTimeMs` to capture the time spent in
> > remote-read requests?
> >
> > With per-broker level timer metrics combined with the request level
> > metrics, the user will have
> > sufficient information.
> >
> > Metric name =
> >
> >
> kafka.log.remote:type=RemoteLogManager,name=RemoteLogReaderFetchRateAndTimeMs
> >
> > --
> > Kamal
> >
> > On Mon, Apr 29, 2024 at 1:38 PM Christo Lolov <christolo...@gmail.com>
> > wrote:
> >
> > > Heya!
> > >
> > > Is it difficult to instead add the metric at
> > > kafka.network:type=RequestMetrics,name=TieredStorageMs (or some other
> > > name=*)? Alternatively, if it is difficult to add it there, is it
> > possible
> > > to add 2 metrics, one at the RequestMetrics level (even if it is
> > > total-time-ms - (all other times)) and one at what you are proposing?
> As
> > an
> > > operator I would find it strange to not see the metric in the
> > > RequestMetrics.
> > >
> > > Your thoughts?
> > >
> > > Best,
> > > Christo
> > >
> > > On Sun, 28 Apr 2024 at 10:52, Kamal Chandraprakash <
> > > kamal.chandraprak...@gmail.com> wrote:
> > >
> > > > Christo,
> > > >
> > > > Updated the KIP with the remote fetch latency metric. Please take
> > another
> > > > look!
> > > >
> > > > --
> > > > Kamal
> > > >
> > > > On Sun, Apr 28, 2024 at 12:23 PM Kamal Chandraprakash <
> > > > kamal.chandraprak...@gmail.com> wrote:
> > > >
> > > > > Hi Federico,
> > > > >
> > > > > Thanks for the suggestion! Updated the config name to "
> > > > > remote.fetch.max.wait.ms".
> > > > >
> > > > > Christo,
> > > > >
> > > > > Good point. We don't have the remote-read latency metrics to
> measure
> > > the
> > > > > performance of the remote read requests. I'll update the KIP to
> emit
> > > this
> > > > > metric.
> > > > >
> > > > > --
> > > > > Kamal
> > > > >
> > > > >
> > > > > On Sat, Apr 27, 2024 at 4:03 PM Federico Valeri <
> > fedeval...@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> Hi Kamal, it looks like all TS configurations starts with
> "remote."
> > > > >> prefix, so I was wondering if we should name it
> > > > >> "remote.fetch.max.wait.ms".
> > > > >>
> > > > >> On Fri, Apr 26, 2024 at 7:07 PM Kamal Chandraprakash
> > > > >> <kamal.chandraprak...@gmail.com> wrote:
> > > > >> >
> > > > >> > Hi all,
> > > > >> >
> > > > >> > If there are no more comments, I'll start a vote thread by
> > tomorrow.
> > > > >> > Please review the KIP.
> > > > >> >
> > > > >> > Thanks,
> > > > >> > Kamal
> > > > >> >
> > > > >> > On Sat, Mar 30, 2024 at 11:08 PM Kamal Chandraprakash <
> > > > >> > kamal.chandraprak...@gmail.com> wrote:
> > > > >> >
> > > > >> > > Hi all,
> > > > >> > >
> > > > >> > > Bumping the thread. Please review this KIP. Thanks!
> > > > >> > >
> > > > >> > > On Thu, Feb 1, 2024 at 9:11 PM Kamal Chandraprakash <
> > > > >> > > kamal.chandraprak...@gmail.com> wrote:
> > > > >> > >
> > > > >> > >> Hi Jorge,
> > > > >> > >>
> > > > >> > >> Thanks for the review! Added your suggestions to the KIP.
> PTAL.
> > > > >> > >>
> > > > >> > >> The `fetch.max.wait.ms` config will be also applicable for
> > > topics
> > > > >> > >> enabled with remote storage.
> > > > >> > >> Updated the description to:
> > > > >> > >>
> > > > >> > >> ```
> > > > >> > >> The maximum amount of time the server will block before
> > answering
> > > > the
> > > > >> > >> fetch request
> > > > >> > >> when it is reading near to the tail of the partition
> > > > >> (high-watermark) and
> > > > >> > >> there isn't
> > > > >> > >> sufficient data to immediately satisfy the requirement given
> by
> > > > >> > >> fetch.min.bytes.
> > > > >> > >> ```
> > > > >> > >>
> > > > >> > >> --
> > > > >> > >> Kamal
> > > > >> > >>
> > > > >> > >> On Thu, Feb 1, 2024 at 12:12 AM Jorge Esteban Quilcate Otoya
> <
> > > > >> > >> quilcate.jo...@gmail.com> wrote:
> > > > >> > >>
> > > > >> > >>> Hi Kamal,
> > > > >> > >>>
> > > > >> > >>> Thanks for this KIP! It should help to solve one of the main
> > > > issues
> > > > >> with
> > > > >> > >>> tiered storage at the moment that is dealing with individual
> > > > >> consumer
> > > > >> > >>> configurations to avoid flooding logs with interrupted
> > > exceptions.
> > > > >> > >>>
> > > > >> > >>> One of the topics discussed in [1][2] was on the semantics
> of
> > `
> > > > >> > >>> fetch.max.wait.ms` and how it's affected by remote storage.
> > > > Should
> > > > >> we
> > > > >> > >>> consider within this KIP the update of `fetch.max.wail.ms`
> > docs
> > > > to
> > > > >> > >>> clarify
> > > > >> > >>> it only applies to local storage?
> > > > >> > >>>
> > > > >> > >>> Otherwise, LGTM -- looking forward to see this KIP adopted.
> > > > >> > >>>
> > > > >> > >>> [1] https://issues.apache.org/jira/browse/KAFKA-15776
> > > > >> > >>> [2]
> > > > >>
> https://github.com/apache/kafka/pull/14778#issuecomment-1820588080
> > > > >> > >>>
> > > > >> > >>> On Tue, 30 Jan 2024 at 01:01, Kamal Chandraprakash <
> > > > >> > >>> kamal.chandraprak...@gmail.com> wrote:
> > > > >> > >>>
> > > > >> > >>> > Hi all,
> > > > >> > >>> >
> > > > >> > >>> > I have opened a KIP-1018
> > > > >> > >>> > <
> > > > >> > >>> >
> > > > >> > >>>
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1018%3A+Introduce+max+remote+fetch+timeout+config+for+DelayedRemoteFetch+requests
> > > > >> > >>> > >
> > > > >> > >>> > to introduce dynamic max-remote-fetch-timeout broker
> config
> > to
> > > > >> give
> > > > >> > >>> more
> > > > >> > >>> > control to the operator.
> > > > >> > >>> >
> > > > >> > >>> >
> > > > >> > >>> >
> > > > >> > >>>
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1018%3A+Introduce+max+remote+fetch+timeout+config+for+DelayedRemoteFetch+requests
> > > > >> > >>> >
> > > > >> > >>> > Let me know if you have any feedback or suggestions.
> > > > >> > >>> >
> > > > >> > >>> > --
> > > > >> > >>> > Kamal
> > > > >> > >>> >
> > > > >> > >>>
> > > > >> > >>
> > > > >>
> > > > >
> > > >
> > >
> >
>

Reply via email to