Thanks. +1 LGTM.
On Mon, Sep 16, 2019 at 9:19 AM, Kevin Lu <lu.ke...@berkeley.edu> wrote:

> Hi Harsha,
>
> Thanks for the feedback. I have added *last-poll-seconds-ago* to the KIP
> (being consistent with *last-heartbeat-seconds-ago*).
>
> Regards,
> Kevin
>
> On Sat, Sep 14, 2019 at 9:44 AM Harsha Chintalapani <ka...@harsha.io> wrote:
>
> > Thanks Kevin for the KIP. Overall LGTM.
> > On your second point, I think the metric will be really useful for
> > distinguishing perf bottlenecks in user code from those in the Kafka
> > consumer/broker.
> >
> > Thanks,
> > Harsha
> >
> > On Fri, Sep 13, 2019 at 2:41 PM, Kevin Lu <lu.ke...@berkeley.edu> wrote:
> >
> > > Hi Radai & Jason,
> > >
> > > Thanks for the support and suggestion.
> > >
> > > 1. I think a ratio is a good additional metric, since the currently
> > > proposed metrics are only absolute times, which may not be useful in
> > > all scenarios.
> > >
> > > I have added this to the KIP:
> > > - *poll-idle-ratio*: The fraction of time the consumer spent waiting
> > >   for the user to process records from poll.
> > >
> > > Thoughts on the metric name/description?
> > >
> > > 2. Would it be useful to include a metric measuring the time since poll
> > > was last called? Similar to *heartbeat-last-seconds-ago*, it would be
> > > *poll-last-ms-ago*.
> > > This could be useful if (1) the user has a very high
> > > *max.poll.interval.ms* configured and typically spends a long time
> > > processing, or (2) the user wants to compare this metric with others,
> > > such as *heartbeat-last-seconds-ago*, when gathering data for root
> > > cause analyses (or identifying potential consumer bugs related to poll).
> > >
> > > Regards,
> > > Kevin
> > >
> > > On Fri, Sep 13, 2019 at 10:39 AM Jason Gustafson <ja...@confluent.io> wrote:
> > >
> > > > Hi Kevin,
> > > >
> > > > This looks reasonable to me. I'd also +1 Radai's suggestion if you're
> > > > willing. Something like an idle ratio for the consumer would be helpful.
> > > >
> > > > Thanks,
> > > > Jason
> > > >
> > > > On Fri, Sep 13, 2019 at 10:08 AM radai <radai.rosenbl...@gmail.com> wrote:
> > > >
> > > > > While you're at it, another metric that we have found to be useful
> > > > > is % time spent in user code vs. time spent in poll() (so time
> > > > > between poll calls / time inside poll calls). The higher the %
> > > > > value, the more indicative of user code being the cause of
> > > > > performance bottlenecks.
> > > > >
> > > > > On Fri, Sep 13, 2019 at 9:14 AM Kevin Lu <lu.ke...@berkeley.edu> wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > Happy Friday! Bumping this. Any thoughts?
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > Regards,
> > > > > > Kevin
> > > > > >
> > > > > > On Thu, Sep 5, 2019 at 9:35 AM Kevin Lu <lu.ke...@berkeley.edu> wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > I'd like to propose a new consumer metric that measures the time
> > > > > > > between calls to poll(), for use in diagnosing issues related to
> > > > > > > hitting max.poll.interval.ms due to long processing time.
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-517%3A+Add+consumer+metric+indicating+time+between+poll+calls
> > > > > > >
> > > > > > > Please give it a read and let me know what you think.
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Regards,
> > > > > > > Kevin
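
For readers following the thread, the three quantities discussed (time between poll calls, a user-code vs. poll ratio, and time since poll was last called) could be tracked roughly as sketched below. This is only an illustration in Python, not the Kafka consumer's implementation: the class `PollMetrics`, its method names, and the exact ratio definition (time outside poll divided by total observed time) are assumptions made for this example, and the definitions in the KIP itself may differ.

```python
import time


class PollMetrics:
    """Hypothetical tracker for poll timing; not part of any Kafka client."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock              # injectable so the logic can be tested
        self._poll_start = None          # when the most recent poll() was called
        self._last_poll_end = None       # when the most recent poll() returned
        self.time_between_polls = None   # seconds from last poll() return to next call
        self._in_poll_total = 0.0        # total seconds spent inside poll()
        self._in_user_total = 0.0        # total seconds spent between poll() calls

    def on_poll_start(self):
        """Call immediately before invoking poll()."""
        now = self._clock()
        if self._last_poll_end is not None:
            self.time_between_polls = now - self._last_poll_end
            self._in_user_total += self.time_between_polls
        self._poll_start = now

    def on_poll_end(self):
        """Call immediately after poll() returns."""
        if self._poll_start is None:
            raise RuntimeError("on_poll_end() called before on_poll_start()")
        now = self._clock()
        self._in_poll_total += now - self._poll_start
        self._last_poll_end = now

    def last_poll_seconds_ago(self):
        """Seconds since poll() was last called, or None if never called."""
        if self._poll_start is None:
            return None
        return self._clock() - self._poll_start

    def user_time_fraction(self):
        """Assumed definition: time outside poll() / total observed time."""
        total = self._in_poll_total + self._in_user_total
        return self._in_user_total / total if total > 0 else 0.0
```

A consumer loop would call `on_poll_start()` and `on_poll_end()` around each poll and read the other values from a monitoring thread. A fraction near 1 would suggest that record processing, rather than fetching, dominates the loop, which is the situation in which max.poll.interval.ms is most likely to be exceeded.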