Hi Lucas, Thank you for the feedback! I have addressed your comments:
LB1. Good catch there, I will update the names as needed. LB2. Good catch again! I will update the name to be more consistent. LB3. Thank you for pointing this out, I realized that all metric values will actually be set to 0. I will specifiy this and explain why they will be 0. Nit: This metric is referring to the queue of unsent requests in the NetworkClientDelegate. For the metric descriptions I am trying to not include too much of the implementation details, hence the reason that description is quite short. I cannot think of other ways to describe the metric without going deeper into the implementation, but please do let me know if you have any ideas. Thank you, Brenden On Fri, Jul 12, 2024 at 1:27 PM Lianet M. <liane...@gmail.com> wrote: > Hey Brenden, thanks for the KIP! Great to get more visibility into the new > consumer. > > LM1. +1 on Lucas's suggestion for including the unit in the name, seems > clearer and consistent (I do see several time metrics including ms) > > LM2. What about a new metric for application-event-queue-time-ms. It would > be a complement to the application-event-queue-size you're proposing, and > it will tell us how long the events sit in the queue, waiting to be > processed (from the time the API call adds the event to the queue, to the > time it's processed in the background thread). I find it would be very > interesting. > > LM3. Thinking about the actual usage of > "time-between-network-thread-poll-xxx" metric, I imagine it would be > helpful to know more about what could be impacting it. As I see it, the > network thread cadence could be mainly impacted by: 1- app event processing > (generate requests), 2- network client poll (actual send/receive). For 2, > the new consumer reuses the same component as the legacy one, but 1 is > specific to the new consumer, so what about a metric > for application-event-processing-time-ms (we could consider avg I would > say). It would be the time that the network thread takes to process all > available events on each run. > > Cheers! > Lianet > > On Fri, Jul 12, 2024 at 1:57 PM Lucas Brutschy > <lbruts...@confluent.io.invalid> wrote: > > > Hey Brenden, > > > > thanks for the KIP! These will be great to observe and debug the > > background thread of the new consumer. > > > > LB1. `time-between-network-thread-poll-max` → I see several similar > > metrics including the unit in the metric name (ms or us). We could > > consider this, although it's probably not strictly required. However, > > at least the description should state the unit. Same for the `avg` > > version. > > LB2. `unsent-requests-size` → Naming sounds a bit like it's referring > > to the size of the request. How about `unsent-request-queue-size` or > > `unsent-request-count` or simply `unsent-requests`? > > LB3. "the proposed metrics below will be set to null or 0." → which > > one will be set to null and which ones will be set to 0, and why? > > > > nit: "The current number of unsent requests in the consumer network" → > > Seems to be missing something? > > > > Cheers, > > Lucas > > > > On Fri, Jul 12, 2024 at 7:28 PM Brenden Deluna > > <bdel...@confluent.io.invalid> wrote: > > > > > > Hi Andrew, > > > Thank you for the feedback and your question. > > > > > > AS1. Great idea, I will get that added. > > > > > > AS2. For unsent-events-age-max, age will be calculated once the event > is > > > sent, so you are correct. > > > > > > AS3. I agree, I think that would be a helpful metric to add, thank > you! I > > > will get that added. > > > > > > Please let me know if you have any additional feedback, suggestions, or > > > questions. > > > > > > Thanks, > > > Brenden > > > > > > On Fri, Jul 12, 2024 at 11:45 AM Andrew Schofield < > > andrew_schofi...@live.com> > > > wrote: > > > > > > > Hi Brenden, > > > > Thanks for the KIP. It fills a gap in the metrics for the new > consumer > > > > nicely. > > > > > > > > AS1. If using the CLASSIC protocol, and thus the LegacyKafkaConsumer, > > > > I would expect that the metrics do not exist at all. Maybe say > > something > > > > like > > > > “These metrics are for the new consumer implementation using the > > > > CONSUMER protocol”. > > > > > > > > AS2. For unsent-events-age-max, when is the age calculated? For > > example, > > > > is it calculated at the time that the unsent event is removed from > the > > > > list and sent, or does the metric reflect unsent events which are > still > > > > enqueued? I suspect the former, but thought I’d check. > > > > > > > > AS3. I think that unsent-events-age-avg would also be interesting to > > > > get an idea of how long unsent events tend to sit around before > > sending. > > > > Of course, the question from AS2 would also apply to the average. > > > > > > > > Thanks, > > > > Andrew > > > > > > > > > On 10 Jul 2024, at 17:44, Philip Nee <p...@confluent.io.INVALID> > > wrote: > > > > > > > > > > Hi all, > > > > > > > > > > This is the link to the KIP document. > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1068%3A+New+JMX+metrics+for+the+new+KafkaConsumer > > > > > > > > > > Any comment is appreciated, > > > > > > > > > > > > > > > On Tue, Jul 9, 2024 at 10:14 AM Brenden Deluna > > > > <bdel...@confluent.io.invalid> > > > > > wrote: > > > > > > > > > >> Hello everyone, > > > > >> > > > > >> I would like to start the discussion thread for KIP-1068. This is > a > > > > >> relatively small KIP, only proposing to add a couple of new > metrics. > > > > >> > > > > >> If you have any suggestions or feedback, let me know, it will be > > much > > > > >> appreciated. > > > > >> > > > > > > > > > > >