Ismael,

Great, that sounds lovely.

I'd like a `Timer` (in Yammer Metrics parlance) for how long each event
takes to process, so we can get at the p99 and max processing times. Maybe
we could also log at warning level if processing an event takes longer than
some timeout?
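
To make that concrete, here is a rough sketch in plain Java. It is not the
Yammer `Timer` API and not controller code: `EventTimers`, its methods, and
the "TopicChange" event type are names I made up for illustration, and it
keeps every sample in memory, which a real timer would not do.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

// Hypothetical stand-in for a metrics Timer, keyed by event type.
class EventTimers {
    private static final Logger LOG = Logger.getLogger(EventTimers.class.getName());

    // Illustration only: a real Timer would use a bounded reservoir, not all samples.
    private final Map<String, Queue<Long>> samples = new ConcurrentHashMap<>();
    private final AtomicLong slowEvents = new AtomicLong();
    private final long warnThresholdNanos;

    EventTimers(long warnThresholdMs) {
        this.warnThresholdNanos = TimeUnit.MILLISECONDS.toNanos(warnThresholdMs);
    }

    // Runs the event and records how long it took, even if it throws.
    void time(String eventType, Runnable event) {
        long start = System.nanoTime();
        try {
            event.run();
        } finally {
            record(eventType, System.nanoTime() - start);
        }
    }

    // Stores one duration and warns if it exceeded the configured threshold.
    void record(String eventType, long elapsedNanos) {
        samples.computeIfAbsent(eventType, t -> new ConcurrentLinkedQueue<>()).add(elapsedNanos);
        if (elapsedNanos > warnThresholdNanos) {
            slowEvents.incrementAndGet();
            LOG.warning("Processing " + eventType + " took "
                    + TimeUnit.NANOSECONDS.toMillis(elapsedNanos) + " ms");
        }
    }

    // Nearest-rank percentile in nanoseconds; 0 if nothing was recorded.
    long percentile(String eventType, double p) {
        Queue<Long> q = samples.get(eventType);
        if (q == null) {
            return 0L;
        }
        long[] sorted = q.stream().mapToLong(Long::longValue).sorted().toArray();
        if (sorted.length == 0) {
            return 0L;
        }
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        rank = Math.min(Math.max(rank, 1), sorted.length);
        return sorted[rank - 1];
    }

    long max(String eventType) {
        return percentile(eventType, 100.0);
    }

    long slowEventCount() {
        return slowEvents.get();
    }
}
```

Usage would be something like `timers.time("TopicChange", () -> process(event))`
around each event, where `process` is whatever handles the event, with
`timers.percentile("TopicChange", 99.0)` and `timers.max("TopicChange")` exposed
as gauges.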

Thanks

Tom

On Thu, Apr 27, 2017 at 3:59 PM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Tom,
>
> Yes, the plan is to merge KAFKA-5028 first and then use a lock-free
> approach for the new metrics. I considered mentioning that in the KIP
> given KAFKA-5120, but didn't in the end. I'll add it to make it clear.
>
> Regarding locks, they are removed by KAFKA-5028, as you say. So, if I
> understand correctly, you are suggesting an event processing rate metric
> with event type as a tag? Onur and Jun, what do you think?
>
> Ismael
>
> On Thu, Apr 27, 2017 at 3:47 PM, Tom Crayford <tcrayf...@heroku.com> wrote:
>
> > Hi,
> >
> > We (Heroku) are very excited about this KIP, as we've struggled a bit with
> > controller stability recently. Having these additional metrics would be
> > wonderful.
> >
> > I'd like to ensure polling these metrics *doesn't* hold any locks etc,
> > because, as noted in https://issues.apache.org/jira/browse/KAFKA-5120,
> > that lock can be held for quite some time. This may no longer be an issue
> > as of KAFKA-5028, though.
> >
> > Lastly, I'd love to see some metrics around how long the controller spends
> > inside its lock. We've been tracking an issue (
> > https://issues.apache.org/jira/browse/KAFKA-5116) where it can hold the
> > lock for many, many minutes in a zk client listener thread when responding
> > to a single request. I'm not sure how that plays into
> > https://issues.apache.org/jira/browse/KAFKA-5028 (which I assume will land
> > before this metrics patch), but it feels like there will be equivalent
> > problems ("how long does it spend processing any individual message from
> > the queue, broken down by message type").
> >
> > These are minor improvements, though; the addition of more metrics to the
> > controller is already going to be very helpful.
> >
> > Thanks
> >
> > Tom Crayford
> > Heroku Kafka
> >
> > On Thu, Apr 27, 2017 at 3:10 PM, Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > Hi all,
> > >
> > > We've posted "KIP-143: Controller Health Metrics" for discussion:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-143%3A+Controller+Health+Metrics
> > >
> > > Please take a look. Your feedback is appreciated.
> > >
> > > Thanks,
> > > Ismael
> > >
> >
>
