A correction for the previous email:

   - number of alive global threads, INFO
   - number of alive restore threads, INFO


These two metrics are not going to be added in this KIP, since we do not
have restore threads in Kafka Streams yet.


Guozhang


On Thu, Apr 2, 2020 at 2:02 PM Guozhang Wang <wangg...@gmail.com> wrote:

> Hello all,
>
> While implementing the last piece of this KIP for the coming 2.6 release,
> I realized that it is important to cover the following monitoring metrics
> as well so I'd propose adding them as part of KIP-444 too:
>
> Instance-level:
>
>    - number of alive stream threads, INFO
>    - number of alive cleanup threads, INFO
>    - number of alive global threads, INFO
>    - number of alive restore threads, INFO
>
> Monitoring these numbers can help detect whether any threads have died
> unexpectedly while the instance is still running.
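
To illustrate what an "alive threads" count means, here is a minimal sketch in plain Java. It is not the KIP-444 implementation and uses no Kafka Streams classes; the class name AliveThreadCount and the method alive are hypothetical, chosen only for this example. It simply counts the threads in a list that the JVM still reports as alive.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of Kafka Streams.
class AliveThreadCount {
    // Counts the threads that have been started and have not yet terminated.
    static long alive(List<Thread> threads) {
        return threads.stream().filter(Thread::isAlive).count();
    }
}
```

If such a count drops below the configured number of threads while the instance keeps running, that is the situation these metrics are meant to expose.
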
>
> Thread-level:
>
>    - avg / max number of records polled from the consumer per thread
>    iteration, INFO
>    - avg / max number of records processed by the task manager (i.e.
>    across all tasks) per thread iteration, INFO
>
> Ideally, all polled records can be processed within one iteration. If one
> observes that either too few records are polled, so that the thread is
> mostly idling, or too many records are polled, so that the thread cannot
> keep up, one should go ahead and tune the consumer configs.
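
As a rough sketch of how the two thread-level metrics could be read together, the plain-Java class below tracks records polled and records processed per iteration and compares their averages. Everything here is an assumption made for illustration: the class IterationStats, its method names, the hint labels, and the idle threshold are hypothetical and are not part of Kafka Streams or of this KIP.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Kafka Streams.
class IterationStats {
    private final List<Integer> polled = new ArrayList<>();
    private final List<Integer> processed = new ArrayList<>();

    // Record one thread iteration: records returned by poll, records processed.
    void record(int polledRecords, int processedRecords) {
        polled.add(polledRecords);
        processed.add(processedRecords);
    }

    double avgPolled() { return avg(polled); }
    double avgProcessed() { return avg(processed); }
    int maxPolled() { return max(polled); }
    int maxProcessed() { return max(processed); }

    // Coarse reading of the averages; idleThreshold is an illustrative assumption.
    String hint(int idleThreshold) {
        if (avgPolled() < idleThreshold) return "POLLING_TOO_FEW";
        if (avgProcessed() < avgPolled()) return "POLLING_TOO_MANY";
        return "BALANCED";
    }

    private static double avg(List<Integer> xs) {
        return xs.stream().mapToInt(Integer::intValue).average().orElse(0.0);
    }

    private static int max(List<Integer> xs) {
        return xs.stream().mapToInt(Integer::intValue).max().orElse(0);
    }
}
```

A "POLLING_TOO_MANY" reading would be a reason to look at consumer configs such as max.poll.records; which config to change, and by how much, depends on the workload.
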
>
> Task-level:
>
>    - number of currently buffered records (i.e. it is just a dynamic
>    gauge), DEBUG.
>
> This is a finer-grained metric indicating which task's processing cannot
> keep up with the fetching throughput.
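
A "dynamic gauge" here means a value computed at read time rather than accumulated. The plain-Java sketch below shows the idea under stated assumptions: the class BufferedRecordsGauges, its methods, and the task ids used with it are hypothetical and only illustrate how a per-task gauge could point at the task that is falling behind. It does not use the Kafka Streams metrics API.

```java
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not part of Kafka Streams.
class BufferedRecordsGauges {
    private final Map<String, Supplier<Integer>> gauges = new LinkedHashMap<>();

    // Register a gauge that reads the buffer size each time it is queried.
    void register(String taskId, Deque<?> buffer) {
        gauges.put(taskId, buffer::size);
    }

    // Current number of buffered records for the given task.
    int value(String taskId) {
        return gauges.get(taskId).get();
    }

    // Task with the largest buffer right now, or null if none is registered.
    String mostBacklogged() {
        String worst = null;
        int worstSize = -1;
        for (Map.Entry<String, Supplier<Integer>> e : gauges.entrySet()) {
            int size = e.getValue().get();
            if (size > worstSize) {
                worst = e.getKey();
                worstSize = size;
            }
        }
        return worst;
    }
}
```

Because the gauge holds no state of its own, its value changes as soon as the underlying buffer grows or drains.
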
>
>
> Please let me know if anyone has any concerns about the proposed metrics.
>
>
> Guozhang
>
>
>
> On Mon, Sep 9, 2019 at 5:17 PM Matthias J. Sax <matth...@confluent.io>
> wrote:
>
>> +1 (binding)
>>
>>
>> -Matthias
>>
>> On 9/5/19 11:47 AM, Guozhang Wang wrote:
>> > +1 from myself.
>> >
>> > I'm now officially closing this voting thread with the following tally:
>> >
>> > binding +1: 3 (Guozhang, Bill, Matthias voted on the DISCUSS thread).
>> > non-binding +1: 2 (Bruno, John).
>> >
>> >
>> > Guozhang
>> >
>> >
>> > On Thu, Aug 22, 2019 at 8:16 AM Bill Bejeck <bbej...@gmail.com> wrote:
>> >
>> >> +1 (binding)
>> >>
>> >> -Bill
>> >>
>> >> On Thu, Aug 22, 2019 at 10:55 AM John Roesler <j...@confluent.io>
>> wrote:
>> >>
>> >>> Hi Guozhang, thanks for cleaning this up.
>> >>>
>> >>> I'm +1 (non-binding)
>> >>>
>> >>> Thanks,
>> >>> -John
>> >>>
>> >>> On Thu, Aug 22, 2019 at 2:26 AM Bruno Cadonna <br...@confluent.io>
>> >> wrote:
>> >>>
>> >>>> Hi Guozhang,
>> >>>>
>> >>>> +1 (non-binding)
>> >>>>
>> >>>> Thank you for driving this!
>> >>>> Bruno
>> >>>>
>> >>>> On Tue, Aug 20, 2019 at 8:29 PM Guozhang Wang <wangg...@gmail.com>
>> >>> wrote:
>> >>>>>
>> >>>>> Hello folks,
>> >>>>>
>> >>>>> I'd like to start a voting thread on the following KIP to improve
>> >>>>> the Kafka Streams metrics mechanism for users. This includes 1)
>> >>>>> renaming changes in the public StreamsMetrics utils API, and 2) a
>> >>>>> major cleanup of Streams' own built-in metrics hierarchy.
>> >>>>>
>> >>>>> Details can be found here:
>> >>>>>
>> >>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-444%3A+Augment+metrics+for+Kafka+Streams
>> >>>>>
>> >>>>> I'd love to hear your thoughts and feedback. Thanks!
>> >>>>>
>> >>>>> --
>> >>>>> -- Guozhang
>> >>>>
>> >>>
>> >>
>> >
>> >
>>
>>
>
> --
> -- Guozhang
>


-- 
-- Guozhang
