Christo,

We have localTimeMs, remoteTimeMs, and totalTimeMs as part of the
FetchConsumer request metric.

kafka.network:type=RequestMetrics,name={LocalTimeMs|RemoteTimeMs|TotalTimeMs},request={Produce|FetchConsumer|FetchFollower}

RemoteTimeMs refers to the amount of time spent in the purgatory for normal
fetch requests and the amount of time spent reading the remote data for
remote-fetch requests. Do we want to have a separate `TieredStorageTimeMs`
to capture the time spent in remote-read requests?

With per-broker level timer metrics combined with the request level
metrics, the user will have sufficient information.

Metric name =
kafka.log.remote:type=RemoteLogManager,name=RemoteLogReaderFetchRateAndTimeMs

--
Kamal

On Mon, Apr 29, 2024 at 1:38 PM Christo Lolov <christolo...@gmail.com>
wrote:

> Heya!
>
> Is it difficult to instead add the metric at
> kafka.network:type=RequestMetrics,name=TieredStorageMs (or some other
> name=*)? Alternatively, if it is difficult to add it there, is it possible
> to add 2 metrics, one at the RequestMetrics level (even if it is
> total-time-ms - (all other times)) and one at what you are proposing? As an
> operator I would find it strange to not see the metric in the
> RequestMetrics.
>
> Your thoughts?
>
> Best,
> Christo
>
> On Sun, 28 Apr 2024 at 10:52, Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Christo,
> >
> > Updated the KIP with the remote fetch latency metric. Please take another
> > look!
> >
> > --
> > Kamal
> >
> > On Sun, Apr 28, 2024 at 12:23 PM Kamal Chandraprakash <
> > kamal.chandraprak...@gmail.com> wrote:
> >
> > > Hi Federico,
> > >
> > > Thanks for the suggestion! Updated the config name to
> > > "remote.fetch.max.wait.ms".
> > >
> > > Christo,
> > >
> > > Good point. We don't have the remote-read latency metrics to measure the
> > > performance of the remote read requests. I'll update the KIP to emit this
> > > metric.
> > >
> > > --
> > > Kamal
> > >
> > >
> > > On Sat, Apr 27, 2024 at 4:03 PM Federico Valeri <fedeval...@gmail.com>
> > > wrote:
> > >
> > >> Hi Kamal, it looks like all TS configurations start with the "remote."
> > >> prefix, so I was wondering if we should name it
> > >> "remote.fetch.max.wait.ms".
> > >>
> > >> On Fri, Apr 26, 2024 at 7:07 PM Kamal Chandraprakash
> > >> <kamal.chandraprak...@gmail.com> wrote:
> > >> >
> > >> > Hi all,
> > >> >
> > >> > If there are no more comments, I'll start a vote thread by tomorrow.
> > >> > Please review the KIP.
> > >> >
> > >> > Thanks,
> > >> > Kamal
> > >> >
> > >> > On Sat, Mar 30, 2024 at 11:08 PM Kamal Chandraprakash <
> > >> > kamal.chandraprak...@gmail.com> wrote:
> > >> >
> > >> > > Hi all,
> > >> > >
> > >> > > Bumping the thread. Please review this KIP. Thanks!
> > >> > >
> > >> > > On Thu, Feb 1, 2024 at 9:11 PM Kamal Chandraprakash <
> > >> > > kamal.chandraprak...@gmail.com> wrote:
> > >> > >
> > >> > >> Hi Jorge,
> > >> > >>
> > >> > >> Thanks for the review! Added your suggestions to the KIP. PTAL.
> > >> > >>
> > >> > >> The `fetch.max.wait.ms` config will also be applicable for topics
> > >> > >> enabled with remote storage.
> > >> > >> Updated the description to:
> > >> > >>
> > >> > >> ```
> > >> > >> The maximum amount of time the server will block before answering the
> > >> > >> fetch request
> > >> > >> when it is reading near to the tail of the partition (high-watermark) and
> > >> > >> there isn't
> > >> > >> sufficient data to immediately satisfy the requirement given by
> > >> > >> fetch.min.bytes.
> > >> > >> ```
> > >> > >>
> > >> > >> --
> > >> > >> Kamal
> > >> > >>
> > >> > >> On Thu, Feb 1, 2024 at 12:12 AM Jorge Esteban Quilcate Otoya <
> > >> > >> quilcate.jo...@gmail.com> wrote:
> > >> > >>
> > >> > >>> Hi Kamal,
> > >> > >>>
> > >> > >>> Thanks for this KIP!
> > >> > >>> It should help to solve one of the main issues with
> > >> > >>> tiered storage at the moment, which is dealing with individual
> > >> > >>> consumer configurations to avoid flooding logs with interrupted
> > >> > >>> exceptions.
> > >> > >>>
> > >> > >>> One of the topics discussed in [1][2] was the semantics of
> > >> > >>> `fetch.max.wait.ms` and how it's affected by remote storage. Should we
> > >> > >>> consider within this KIP the update of the `fetch.max.wait.ms` docs to
> > >> > >>> clarify that it only applies to local storage?
> > >> > >>>
> > >> > >>> Otherwise, LGTM -- looking forward to seeing this KIP adopted.
> > >> > >>>
> > >> > >>> [1] https://issues.apache.org/jira/browse/KAFKA-15776
> > >> > >>> [2] https://github.com/apache/kafka/pull/14778#issuecomment-1820588080
> > >> > >>>
> > >> > >>> On Tue, 30 Jan 2024 at 01:01, Kamal Chandraprakash <
> > >> > >>> kamal.chandraprak...@gmail.com> wrote:
> > >> > >>>
> > >> > >>> > Hi all,
> > >> > >>> >
> > >> > >>> > I have opened a KIP-1018
> > >> > >>> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1018%3A+Introduce+max+remote+fetch+timeout+config+for+DelayedRemoteFetch+requests>
> > >> > >>> > to introduce a dynamic max-remote-fetch-timeout broker config to give
> > >> > >>> > more control to the operator.
> > >> > >>> >
> > >> > >>> > Let me know if you have any feedback or suggestions.
> > >> > >>> >
> > >> > >>> > --
> > >> > >>> > Kamal
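
For readers following the metric names discussed at the top of this thread, here is a minimal Java sketch of two things: building the request-level MBean name from the quoted pattern, and the "total-time-ms - (all other times)" fallback Christo mentions. The class name `RequestMetricSketch`, the helpers `requestMetric` and `residualMs`, and the sample values are all hypothetical and are not part of the KIP or of Kafka. Only `javax.management.ObjectName` from the JDK is used, and nothing here connects to a broker.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class RequestMetricSketch {

    // Builds the MBean name for one request-level timer, following the
    // kafka.network:type=RequestMetrics,name=...,request=... pattern
    // quoted in the thread. (Hypothetical helper, for illustration only.)
    static ObjectName requestMetric(String name, String request) {
        try {
            return new ObjectName(
                "kafka.network:type=RequestMetrics,name=" + name + ",request=" + request);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper: the residual time is the total minus the sum of
    // every other measured component, as in "total-time-ms - (all other times)".
    static double residualMs(double totalMs, double... otherMs) {
        double sum = 0.0;
        for (double v : otherMs) {
            sum += v;
        }
        return totalMs - sum;
    }

    public static void main(String[] args) {
        ObjectName remote = requestMetric("RemoteTimeMs", "FetchConsumer");
        System.out.println(remote.getDomain() + " / " + remote.getKeyProperty("name"));

        // Made-up sample values in milliseconds, not measurements.
        System.out.println(residualMs(120.0, 5.0, 10.0, 3.0));
    }
}
```

A derived value like this only attributes the leftover time to remote reads if every other component is actually measured, which is presumably why the thread also discusses a dedicated timer.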