Hi Radai & Jason,

Thanks for the support and suggestion.

1. I think a ratio is a good additional metric, since the currently proposed
metrics are only absolute times, which may not be useful in all scenarios.

I have added this to the KIP:
    - *poll-idle-ratio*: The fraction of time the consumer spent waiting
      for the user to process records from poll.

Thoughts on the metric name/description?
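
To make the description above concrete, here is a rough sketch of how such a ratio could be computed. It is only an illustration under my reading of the description (time spent outside poll() divided by total observed time), not the KIP's implementation. The class and method names (PollIdleRatioSketch, onPollStart, onPollEnd, userProcessingRatio) are hypothetical, and timestamps are passed in by the caller.

```java
// Hypothetical helper, NOT part of the Kafka client. It assumes the ratio is
// (time spent outside poll()) / (time inside poll() + time outside poll()).
class PollIdleRatioSketch {
    private long timeInPollMs = 0;      // total time spent inside poll()
    private long timeInUserCodeMs = 0;  // total time spent between poll() calls
    private long lastPollStartMs = -1;  // -1 means poll() has not started yet
    private long lastPollEndMs = -1;    // -1 means poll() has not returned yet

    // Call right before poll() is invoked.
    void onPollStart(long nowMs) {
        if (lastPollEndMs >= 0) {
            // Everything since the previous poll() returned was user processing.
            timeInUserCodeMs += nowMs - lastPollEndMs;
        }
        lastPollStartMs = nowMs;
    }

    // Call right after poll() returns.
    void onPollEnd(long nowMs) {
        if (lastPollStartMs < 0) {
            throw new IllegalStateException("onPollEnd called before onPollStart");
        }
        timeInPollMs += nowMs - lastPollStartMs;
        lastPollEndMs = nowMs;
    }

    // Fraction of observed time spent in user code; 0.0 if nothing was observed.
    double userProcessingRatio() {
        long total = timeInPollMs + timeInUserCodeMs;
        return total == 0 ? 0.0 : (double) timeInUserCodeMs / total;
    }
}
```

With this definition, a value close to 1 would suggest that record processing, rather than poll(), dominates the consumer's time.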

2. Would it be useful to include a metric measuring the time since poll was
last called? Similar to *heartbeat-last-seconds-ago*, it would be
*poll-last-ms-ago*.
This could be useful if (1) the user has a very high *max.poll.interval.ms*
configured and typically spends a long time processing, or (2) the user
wants to compare this metric with others such as
*heartbeat-last-seconds-ago* when gathering data for root cause analyses
(or identifying potential consumer bugs related to poll).
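
As a rough sketch of what I have in mind for this second metric (again only an illustration, not the KIP's implementation): the class LastPollTrackerSketch and its methods are hypothetical, and the 0.8 warning threshold in the usage note is an arbitrary value chosen for the example.

```java
// Hypothetical helper, NOT part of the Kafka client. Timestamps are supplied
// by the caller so the logic does not depend on any particular clock.
class LastPollTrackerSketch {
    private long lastPollMs = -1; // -1 means poll() has never been called

    // Call every time poll() is invoked.
    void recordPoll(long nowMs) {
        lastPollMs = nowMs;
    }

    // Milliseconds since the last poll() call, or -1 if poll() was never called.
    long pollLastMsAgo(long nowMs) {
        return lastPollMs < 0 ? -1 : nowMs - lastPollMs;
    }

    // True once the time since the last poll() reaches the given fraction of
    // max.poll.interval.ms (fraction is an assumed, caller-chosen threshold).
    boolean nearingMaxPollInterval(long nowMs, long maxPollIntervalMs, double fraction) {
        long elapsed = pollLastMsAgo(nowMs);
        return elapsed >= 0 && elapsed >= (long) (maxPollIntervalMs * fraction);
    }
}
```

For example, with max.poll.interval.ms at 300000 and a fraction of 0.8, nearingMaxPollInterval would start returning true once 240000 ms have passed since the last poll.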

Regards,
Kevin


On Fri, Sep 13, 2019 at 10:39 AM Jason Gustafson <ja...@confluent.io> wrote:

> Hi Kevin,
>
> This looks reasonable to me. I'd also +1 Radai's suggestion if you're
> willing. Something like an idle ratio for the consumer would be helpful.
>
> Thanks,
> Jason
>
> On Fri, Sep 13, 2019 at 10:08 AM radai <radai.rosenbl...@gmail.com> wrote:
>
> > while you're at it another metric that we have found to be useful is %
> > time spent in user code vs time spent in poll() (so time between poll
> > calls / time inside poll calls) - the higher the % value the more
> > indicative of user code being the cause of performance bottlenecks.
> >
> > On Fri, Sep 13, 2019 at 9:14 AM Kevin Lu <lu.ke...@berkeley.edu> wrote:
> > >
> > > Hi All,
> > >
> > > Happy Friday! Bumping this. Any thoughts?
> > >
> > > Thanks.
> > >
> > > Regards,
> > > Kevin
> > >
> > > On Thu, Sep 5, 2019 at 9:35 AM Kevin Lu <lu.ke...@berkeley.edu> wrote:
> > >
> > > > Hi All,
> > > >
> > > > I'd like to propose a new consumer metric that measures the time between
> > > > calls to poll() for use in issues related to hitting max.poll.interval.ms
> > > > due to long processing time.
> > > >
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-517%3A+Add+consumer+metric+indicating+time+between+poll+calls
> > > >
> > > > Please give it a read and let me know what you think.
> > > >
> > > > Thanks!
> > > >
> > > > Regards,
> > > > Kevin
> > > >
> >
>
