Hi Brenden, Thanks for the updates. AS4. I see that you’ve added `.ms` to a bunch of the metrics reflecting the fact that they’re measured in milliseconds. However, I observe that most metrics in Kafka that are measured in milliseconds, with some exceptions in Kafka Connect and MirrorMaker do not follow this convention. I would tend to err on the side of consistency with the existing metrics and not use `.ms`. However, that’s just my opinion, so I’d be interested to know what other reviewers of the KIP think.
Thanks, Andrew > On 12 Jul 2024, at 20:11, Brenden Deluna <bdel...@confluent.io.INVALID> wrote: > > Hey Lianet, > > Thank you for your suggestions and feedback! > > > LM1. This has now been addressed. > > > LM2. I think that would be a valuable addition to the current set of > metrics, I will get that added. > > > LM3. Again great idea, that would certainly be helpful. Will add that as > well. > > > Let me know if you have any more suggestions! > > > Thanks, > > Brenden > > On Fri, Jul 12, 2024 at 2:11 PM Brenden Deluna <bdel...@confluent.io> wrote: > >> Hi Lucas, >> >> Thank you for the feedback! I have addressed your comments: >> >> >> LB1. Good catch there, I will update the names as needed. >> >> >> LB2. Good catch again! I will update the name to be more consistent. >> >> >> LB3. Thank you for pointing this out, I realized that all metric values >> will actually be set to 0. I will specifiy this and explain why they will >> be 0. >> >> >> Nit: This metric is referring to the queue of unsent requests in the >> NetworkClientDelegate. For the metric descriptions I am trying to not >> include too much of the implementation details, hence the reason that >> description is quite short. I cannot think of other ways to describe the >> metric without going deeper into the implementation, but please do let me >> know if you have any ideas. >> >> >> Thank you, >> >> Brenden >> >> On Fri, Jul 12, 2024 at 1:27 PM Lianet M. <liane...@gmail.com> wrote: >> >>> Hey Brenden, thanks for the KIP! Great to get more visibility into the new >>> consumer. >>> >>> LM1. +1 on Lucas's suggestion for including the unit in the name, seems >>> clearer and consistent (I do see several time metrics including ms) >>> >>> LM2. What about a new metric for application-event-queue-time-ms. It would >>> be a complement to the application-event-queue-size you're proposing, and >>> it will tell us how long the events sit in the queue, waiting to be >>> processed (from the time the API call adds the event to the queue, to the >>> time it's processed in the background thread). I find it would be very >>> interesting. >>> >>> LM3. Thinking about the actual usage of >>> "time-between-network-thread-poll-xxx" metric, I imagine it would be >>> helpful to know more about what could be impacting it. As I see it, the >>> network thread cadence could be mainly impacted by: 1- app event >>> processing >>> (generate requests), 2- network client poll (actual send/receive). For 2, >>> the new consumer reuses the same component as the legacy one, but 1 is >>> specific to the new consumer, so what about a metric >>> for application-event-processing-time-ms (we could consider avg I would >>> say). It would be the time that the network thread takes to process all >>> available events on each run. >>> >>> Cheers! >>> Lianet >>> >>> On Fri, Jul 12, 2024 at 1:57 PM Lucas Brutschy >>> <lbruts...@confluent.io.invalid> wrote: >>> >>>> Hey Brenden, >>>> >>>> thanks for the KIP! These will be great to observe and debug the >>>> background thread of the new consumer. >>>> >>>> LB1. `time-between-network-thread-poll-max` → I see several similar >>>> metrics including the unit in the metric name (ms or us). We could >>>> consider this, although it's probably not strictly required. However, >>>> at least the description should state the unit. Same for the `avg` >>>> version. >>>> LB2. `unsent-requests-size` → Naming sounds a bit like it's referring >>>> to the size of the request. How about `unsent-request-queue-size` or >>>> `unsent-request-count` or simply `unsent-requests`? >>>> LB3. "the proposed metrics below will be set to null or 0." → which >>>> one will be set to null and which ones will be set to 0, and why? >>>> >>>> nit: "The current number of unsent requests in the consumer network" → >>>> Seems to be missing something? >>>> >>>> Cheers, >>>> Lucas >>>> >>>> On Fri, Jul 12, 2024 at 7:28 PM Brenden Deluna >>>> <bdel...@confluent.io.invalid> wrote: >>>>> >>>>> Hi Andrew, >>>>> Thank you for the feedback and your question. >>>>> >>>>> AS1. Great idea, I will get that added. >>>>> >>>>> AS2. For unsent-events-age-max, age will be calculated once the event >>> is >>>>> sent, so you are correct. >>>>> >>>>> AS3. I agree, I think that would be a helpful metric to add, thank >>> you! I >>>>> will get that added. >>>>> >>>>> Please let me know if you have any additional feedback, suggestions, >>> or >>>>> questions. >>>>> >>>>> Thanks, >>>>> Brenden >>>>> >>>>> On Fri, Jul 12, 2024 at 11:45 AM Andrew Schofield < >>>> andrew_schofi...@live.com> >>>>> wrote: >>>>> >>>>>> Hi Brenden, >>>>>> Thanks for the KIP. It fills a gap in the metrics for the new >>> consumer >>>>>> nicely. >>>>>> >>>>>> AS1. If using the CLASSIC protocol, and thus the >>> LegacyKafkaConsumer, >>>>>> I would expect that the metrics do not exist at all. Maybe say >>>> something >>>>>> like >>>>>> “These metrics are for the new consumer implementation using the >>>>>> CONSUMER protocol”. >>>>>> >>>>>> AS2. For unsent-events-age-max, when is the age calculated? For >>>> example, >>>>>> is it calculated at the time that the unsent event is removed from >>> the >>>>>> list and sent, or does the metric reflect unsent events which are >>> still >>>>>> enqueued? I suspect the former, but thought I’d check. >>>>>> >>>>>> AS3. I think that unsent-events-age-avg would also be interesting to >>>>>> get an idea of how long unsent events tend to sit around before >>>> sending. >>>>>> Of course, the question from AS2 would also apply to the average. >>>>>> >>>>>> Thanks, >>>>>> Andrew >>>>>> >>>>>>> On 10 Jul 2024, at 17:44, Philip Nee <p...@confluent.io.INVALID> >>>> wrote: >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> This is the link to the KIP document. >>>>>>> >>>>>> >>>> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1068%3A+New+JMX+metrics+for+the+new+KafkaConsumer >>>>>>> >>>>>>> Any comment is appreciated, >>>>>>> >>>>>>> >>>>>>> On Tue, Jul 9, 2024 at 10:14 AM Brenden Deluna >>>>>> <bdel...@confluent.io.invalid> >>>>>>> wrote: >>>>>>> >>>>>>>> Hello everyone, >>>>>>>> >>>>>>>> I would like to start the discussion thread for KIP-1068. This >>> is a >>>>>>>> relatively small KIP, only proposing to add a couple of new >>> metrics. >>>>>>>> >>>>>>>> If you have any suggestions or feedback, let me know, it will be >>>> much >>>>>>>> appreciated.