Hello all,

Now that 3.6 has been released, I would like to bring back attention to the following KIP for adding metrics to tiered storage targeting 3.7:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-963%3A+Add+more+metrics+to+Tiered+Storage

Let me know your thoughts about the list of metrics and their granularity!

Best,
Christo

On Fri, 13 Oct 2023 at 10:14, Christo Lolov <christolo...@gmail.com> wrote:

> Heya Gantigmaa,
>
> Apologies for the (very) late reply!
>
> Now that 3.6 has been released and reviewers have a bit more time I will
> be picking up this KIP again. I am more than happy to add useful new
> metrics to the KIP, I would just ask for a couple of days to review your
> pull request and I will come back to you.
>
> Best,
> Christo
>
> On Mon, 25 Sept 2023 at 10:49, Gantigmaa Selenge <gsele...@redhat.com>
> wrote:
>
>> Hi Christo,
>>
>> Thank you for writing the KIP.
>>
>> I recently raised a PR to add metrics for tracking remote segment
>> deletions (https://github.com/apache/kafka/pull/14375) but realised
>> those metrics were not mentioned in the original KIP-405 or KIP-930.
>> Do you think these would make sense to be added to this KIP and get
>> included in the discussion?
>>
>> Regards,
>> Gantigmaa
>>
>> On Wed, Aug 9, 2023 at 1:53 PM Christo Lolov <christolo...@gmail.com>
>> wrote:
>>
>> > Heya Kamal,
>> >
>> > Thank you for going through the KIP and for the question!
>> >
>> > I have been thinking about this and as an operator I might find it
>> > the most useful to know all three of them actually.
>> >
>> > I would find knowing the size in bytes useful to determine how much
>> > disk I might need to add temporarily to compensate for the slowdown.
>> > I would find knowing the number of records useful, because using the
>> > MessagesInPerSec metric I would be able to determine how old the
>> > records which are facing problems are.
>> > I would find knowing the number of segments useful because I would
>> > be able to correlate this with whether I need to change
>> > remote.log.manager.task.interval.ms to a lower or higher value.
>> >
>> > What are your thoughts on the above? Would you find some of them
>> > more useful than others?
>> >
>> > Best,
>> > Christo
>> >
>> > On Tue, 8 Aug 2023 at 16:43, Kamal Chandraprakash
>> > <kamal.chandraprak...@gmail.com> wrote:
>> >
>> > > Hi Christo,
>> > >
>> > > Thanks for the KIP!
>> > >
>> > > The proposed tiered storage metrics are useful. The unit mentioned
>> > > in the KIP is the number of records.
>> > > Each topic can have varying amounts of records in a segment
>> > > depending on the record size.
>> > >
>> > > Do you think having the tier-lag by number of segments (or) size
>> > > of segments in bytes will be useful to the operator?
>> > >
>> > > Thanks,
>> > > Kamal
>> > >
>> > > On Tue, Aug 8, 2023 at 8:56 PM Christo Lolov
>> > > <christolo...@gmail.com> wrote:
>> > >
>> > > > Hello all!
>> > > >
>> > > > I would like to start a discussion for KIP-963: Upload and
>> > > > delete lag metrics in Tiered Storage
>> > > > (https://cwiki.apache.org/confluence/x/sZGzDw).
>> > > >
>> > > > The purpose of this KIP is to introduce a couple of metrics to
>> > > > track lag with respect to remote storage from the point of view
>> > > > of Kafka.
>> > > >
>> > > > Thanks in advance for leaving a review!
>> > > >
>> > > > Best,
>> > > > Christo