Hi Kevin,

For the computation of the idle ratio, can we make it consistent with the
idle ratios on the broker? Basically we use the following:

idle ratio = idle time / total time

So when the consumer is idle (i.e., waiting for records), the idle ratio
approaches 1. When the application is busy processing, it approaches 0.
Does that make sense?
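
To make the definition concrete, here is a minimal sketch of the computation.
It is only an illustration: the class and method names are hypothetical, and it
assumes that time spent inside poll() counts as idle time while time between
poll() calls counts as processing time. It is not the consumer's actual
implementation.

```python
from dataclasses import dataclass


@dataclass
class PollIdleTracker:
    """Hypothetical helper that accumulates idle and busy time for a consumer."""

    idle_ms: float = 0.0  # assumed: time spent inside poll(), waiting for records
    busy_ms: float = 0.0  # assumed: time between poll() calls, processing records

    def record_poll(self, start_ms: float, end_ms: float) -> None:
        """Add the duration of one poll() call to the idle time."""
        self.idle_ms += end_ms - start_ms

    def record_processing(self, start_ms: float, end_ms: float) -> None:
        """Add the time the application spent processing to the busy time."""
        self.busy_ms += end_ms - start_ms

    def idle_ratio(self) -> float:
        """idle ratio = idle time / total time."""
        total_ms = self.idle_ms + self.busy_ms
        # Returning 0.0 before any time is recorded is an arbitrary choice here.
        return self.idle_ms / total_ms if total_ms > 0 else 0.0


# Example: 900 ms waiting in poll(), then 100 ms processing the records.
tracker = PollIdleTracker()
tracker.record_poll(0, 900)
tracker.record_processing(900, 1000)
ratio = tracker.idle_ratio()
```

With mostly idle time the ratio is close to 1; if the processing window
dominates instead, it moves toward 0.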

Thanks,
Jason


On Tue, Sep 17, 2019 at 7:26 PM Satish Duggana <satish.dugg...@gmail.com>
wrote:

> Hi Kevin,
> Thanks for adding useful metrics with the KIP.
>
> On Wed, 18 Sep, 2019, 1:49 AM Kevin Lu, <lu.ke...@berkeley.edu> wrote:
>
> > Hi Manikumar,
> >
> > Thanks for the support.
> >
> > Since we have added a couple additional metrics, I have renamed the KIP
> > title to reflect the content better: KIP-517: Add consumer metrics to
> > observe user poll behavior
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-517%3A+Add+consumer+metrics+to+observe+user+poll+behavior>
> >
> > Regards,
> > Kevin
> >
> > On Tue, Sep 17, 2019 at 11:07 AM Manikumar <manikumar.re...@gmail.com>
> > wrote:
> >
> > > Hi Kevin,
> > >
> > > Thanks for the KIP.  LGTM. This will be useful.
> > >
> > > Thanks,
> > >
> > > On Mon, Sep 16, 2019 at 10:17 PM Harsha Chintalapani <ka...@harsha.io>
> > > wrote:
> > >
> > > > Thanks. +1 LGTM.
> > > >
> > > >
> > > > > On Mon, Sep 16, 2019 at 9:19 AM, Kevin Lu <lu.ke...@berkeley.edu> wrote:
> > > >
> > > > > Hi Harsha,
> > > > >
> > > > > > Thanks for the feedback. I have added *last-poll-seconds-ago* to the KIP
> > > > > > (being consistent with *last-heartbeat-seconds-ago*).
> > > > >
> > > > > Regards,
> > > > > Kevin
> > > > >
> > > > > > On Sat, Sep 14, 2019 at 9:44 AM Harsha Chintalapani <ka...@harsha.io>
> > > > > > wrote:
> > > > >
> > > > > Thanks Kevin for the KIP. Overall LGTM.
> > > > > > On your second point, I think the metric will be really useful to indicate
> > > > > > the perf bottlenecks in user code vs. the Kafka consumer/broker.
> > > > >
> > > > > Thanks,
> > > > > Harsha
> > > > >
> > > > > > On Fri, Sep 13, 2019 at 2:41 PM, Kevin Lu <lu.ke...@berkeley.edu> wrote:
> > > > >
> > > > > Hi Radai & Jason,
> > > > >
> > > > > Thanks for the support and suggestion.
> > > > >
> > > > > > 1. I think a ratio is a good additional metric since the currently proposed
> > > > > > metrics are only absolute times, which may not be useful in all scenarios.
> > > > >
> > > > > I have added this to the KIP:
> > > > > > - *poll-idle-ratio*: The fraction of time the consumer spent waiting for
> > > > > > the user to process records from poll.
> > > > >
> > > > > Thoughts on the metric name/description?
> > > > >
> > > > > > 2. Would it be useful to include a metric measuring the time since poll
> > > > > > was last called? Similar to *heartbeat-last-seconds-ago*, it would be
> > > > > > *poll-last-ms-ago*.
> > > > > > This could be useful (1) if the user has a very high
> > > > > > *max.poll.interval.ms* configured and typically spends a long
> > > > > > time processing, or (2) when comparing this metric with others such as
> > > > > > *heartbeat-last-seconds-ago* or something else for gathering data in root
> > > > > > cause analyses (or identifying potential consumer bugs related to poll).
> > > > >
> > > > > Regards,
> > > > > Kevin
> > > > >
> > > > > > On Fri, Sep 13, 2019 at 10:39 AM Jason Gustafson <ja...@confluent.io>
> > > > > > wrote:
> > > > >
> > > > > Hi Kevin,
> > > > >
> > > > > > This looks reasonable to me. I'd also +1 Radai's suggestion if you're
> > > > > > willing. Something like an idle ratio for the consumer would be helpful.
> > > > >
> > > > > Thanks,
> > > > > Jason
> > > > >
> > > > > > On Fri, Sep 13, 2019 at 10:08 AM radai <radai.rosenbl...@gmail.com>
> > > > > > wrote:
> > > > >
> > > > > > While you're at it, another metric that we have found to be useful is the
> > > > > > % of time spent in user code vs. time spent in poll() (so time between poll
> > > > > > calls / time inside poll calls) - the higher the % value, the more indicative
> > > > > > of user code being the cause of performance bottlenecks.
> > > > >
> > > > > > On Fri, Sep 13, 2019 at 9:14 AM Kevin Lu <lu.ke...@berkeley.edu> wrote:
> > > > >
> > > > > Hi All,
> > > > >
> > > > > Happy Friday! Bumping this. Any thoughts?
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Regards,
> > > > > Kevin
> > > > >
> > > > > > On Thu, Sep 5, 2019 at 9:35 AM Kevin Lu <lu.ke...@berkeley.edu> wrote:
> > > > >
> > > > > Hi All,
> > > > >
> > > > > > I'd like to propose a new consumer metric that measures the time between
> > > > > > calls to poll() for use in issues related to hitting max.poll.interval.ms
> > > > > > due to long processing time.
> > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-517%3A+Add+consumer+metric+indicating+time+between+poll+calls
> > > > >
> > > > > Please give it a read and let me know what you think.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Regards,
> > > > > Kevin
> > > > >
> > > > >
> > > >
> > >
> >
>
