Christo,

We have localTimeMs, remoteTimeMs, and totalTimeMs as part of the
FetchConsumer request metric.

kafka.network:type=RequestMetrics,name={LocalTimeMs|RemoteTimeMs|TotalTimeMs},request={Produce|FetchConsumer|FetchFollower}

RemoteTimeMs refers to the amount of time spent in the purgatory for normal
fetch requests
and amount of time spent in reading the remote data for remote-fetch
requests. Do we want
to have a separate `TieredStorageTimeMs` to capture the time spent in
remote-read requests?

With per-broker level timer metrics combined with the request level
metrics, the user will have
sufficient information.

Metric name =
kafka.log.remote:type=RemoteLogManager,name=RemoteLogReaderFetchRateAndTimeMs

--
Kamal

On Mon, Apr 29, 2024 at 1:38 PM Christo Lolov <christolo...@gmail.com>
wrote:

> Heya!
>
> Is it difficult to instead add the metric at
> kafka.network:type=RequestMetrics,name=TieredStorageMs (or some other
> name=*)? Alternatively, if it is difficult to add it there, is it possible
> to add 2 metrics, one at the RequestMetrics level (even if it is
> total-time-ms - (all other times)) and one at what you are proposing? As an
> operator I would find it strange to not see the metric in the
> RequestMetrics.
>
> Your thoughts?
>
> Best,
> Christo
>
> On Sun, 28 Apr 2024 at 10:52, Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Christo,
> >
> > Updated the KIP with the remote fetch latency metric. Please take another
> > look!
> >
> > --
> > Kamal
> >
> > On Sun, Apr 28, 2024 at 12:23 PM Kamal Chandraprakash <
> > kamal.chandraprak...@gmail.com> wrote:
> >
> > > Hi Federico,
> > >
> > > Thanks for the suggestion! Updated the config name to "
> > > remote.fetch.max.wait.ms".
> > >
> > > Christo,
> > >
> > > Good point. We don't have the remote-read latency metrics to measure
> the
> > > performance of the remote read requests. I'll update the KIP to emit
> this
> > > metric.
> > >
> > > --
> > > Kamal
> > >
> > >
> > > On Sat, Apr 27, 2024 at 4:03 PM Federico Valeri <fedeval...@gmail.com>
> > > wrote:
> > >
> > >> Hi Kamal, it looks like all TS configurations starts with "remote."
> > >> prefix, so I was wondering if we should name it
> > >> "remote.fetch.max.wait.ms".
> > >>
> > >> On Fri, Apr 26, 2024 at 7:07 PM Kamal Chandraprakash
> > >> <kamal.chandraprak...@gmail.com> wrote:
> > >> >
> > >> > Hi all,
> > >> >
> > >> > If there are no more comments, I'll start a vote thread by tomorrow.
> > >> > Please review the KIP.
> > >> >
> > >> > Thanks,
> > >> > Kamal
> > >> >
> > >> > On Sat, Mar 30, 2024 at 11:08 PM Kamal Chandraprakash <
> > >> > kamal.chandraprak...@gmail.com> wrote:
> > >> >
> > >> > > Hi all,
> > >> > >
> > >> > > Bumping the thread. Please review this KIP. Thanks!
> > >> > >
> > >> > > On Thu, Feb 1, 2024 at 9:11 PM Kamal Chandraprakash <
> > >> > > kamal.chandraprak...@gmail.com> wrote:
> > >> > >
> > >> > >> Hi Jorge,
> > >> > >>
> > >> > >> Thanks for the review! Added your suggestions to the KIP. PTAL.
> > >> > >>
> > >> > >> The `fetch.max.wait.ms` config will be also applicable for
> topics
> > >> > >> enabled with remote storage.
> > >> > >> Updated the description to:
> > >> > >>
> > >> > >> ```
> > >> > >> The maximum amount of time the server will block before answering
> > the
> > >> > >> fetch request
> > >> > >> when it is reading near to the tail of the partition
> > >> (high-watermark) and
> > >> > >> there isn't
> > >> > >> sufficient data to immediately satisfy the requirement given by
> > >> > >> fetch.min.bytes.
> > >> > >> ```
> > >> > >>
> > >> > >> --
> > >> > >> Kamal
> > >> > >>
> > >> > >> On Thu, Feb 1, 2024 at 12:12 AM Jorge Esteban Quilcate Otoya <
> > >> > >> quilcate.jo...@gmail.com> wrote:
> > >> > >>
> > >> > >>> Hi Kamal,
> > >> > >>>
> > >> > >>> Thanks for this KIP! It should help to solve one of the main
> > issues
> > >> with
> > >> > >>> tiered storage at the moment that is dealing with individual
> > >> consumer
> > >> > >>> configurations to avoid flooding logs with interrupted
> exceptions.
> > >> > >>>
> > >> > >>> One of the topics discussed in [1][2] was on the semantics of `
> > >> > >>> fetch.max.wait.ms` and how it's affected by remote storage.
> > Should
> > >> we
> > >> > >>> consider within this KIP the update of `fetch.max.wail.ms` docs
> > to
> > >> > >>> clarify
> > >> > >>> it only applies to local storage?
> > >> > >>>
> > >> > >>> Otherwise, LGTM -- looking forward to see this KIP adopted.
> > >> > >>>
> > >> > >>> [1] https://issues.apache.org/jira/browse/KAFKA-15776
> > >> > >>> [2]
> > >> https://github.com/apache/kafka/pull/14778#issuecomment-1820588080
> > >> > >>>
> > >> > >>> On Tue, 30 Jan 2024 at 01:01, Kamal Chandraprakash <
> > >> > >>> kamal.chandraprak...@gmail.com> wrote:
> > >> > >>>
> > >> > >>> > Hi all,
> > >> > >>> >
> > >> > >>> > I have opened a KIP-1018
> > >> > >>> > <
> > >> > >>> >
> > >> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1018%3A+Introduce+max+remote+fetch+timeout+config+for+DelayedRemoteFetch+requests
> > >> > >>> > >
> > >> > >>> > to introduce dynamic max-remote-fetch-timeout broker config to
> > >> give
> > >> > >>> more
> > >> > >>> > control to the operator.
> > >> > >>> >
> > >> > >>> >
> > >> > >>> >
> > >> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1018%3A+Introduce+max+remote+fetch+timeout+config+for+DelayedRemoteFetch+requests
> > >> > >>> >
> > >> > >>> > Let me know if you have any feedback or suggestions.
> > >> > >>> >
> > >> > >>> > --
> > >> > >>> > Kamal
> > >> > >>> >
> > >> > >>>
> > >> > >>
> > >>
> > >
> >
>

Reply via email to