Good evening. I have read through the monitoring section and tried to
map each item to its corresponding JMX attribute. I would appreciate it
if you could answer a few questions below.

Thanks so much in advance,
Vadim

    What is this JMX MBean for?
"kafka.controller":type="KafkaController",name="ActiveControllerCount"

    The rate of data in and out of the cluster and the number of messages
written
   Which JMX attributes should I monitor? Since I should alert on this,
which changes are acceptable and which are not?
    The log flush rate and the time taken to flush the log
    "kafka.log":type="LogFlushStats",name="LogFlushRateAndTimeMs"
Which attribute should I be watching, and how much deviation is
acceptable before I should alert?
    The number of partitions that have replicas that are down or have
fallen behind and are underreplicated.
   Is "kafka.cluster":type="Partition",name="buypets-0-UnderReplicated"
the JMX MBean that will show replicas that are down?

    Unclean leader elections. This shouldn't happen.

 "kafka.controller":type="ControllerStats",name="UncleanLeaderElectionsPerSec".
I assume this should always be 0, and if it is not 0 we have a problem.
    Number of partitions each node is the leader for.
   Which JMX attribute(s) should I monitor for this?
    Leader elections: we track each time this happens and how long it took:

"kafka.controller":type="ControllerStats",name="LeaderElectionRateAndTimeMs"
    Any changes to the ISR
    Which JMX attribute should I monitor for this? Should I alert on this?
Which changes are reasonable and which are not?
    The number of produce requests waiting on replication to report back
   Which JMX attribute should I monitor for this? Should I alert on this?
Which changes are reasonable and which are not?
    The number of fetch requests waiting on data to arrive
   Which JMX attribute should I monitor for this? Should I alert on this?
Which changes are reasonable and which are not?
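
For concreteness, here is a minimal Java sketch of how an MBean like the
ones listed above could be polled over JMX. It is an illustration under
assumptions, not a recommended monitoring setup: the class name
JmxAttributeSketch, the host:port argument, and the attribute name "Value"
are all assumptions (the attribute name in particular should be checked in
jconsole against the actual broker). Only the MBean name is taken from the
list above. Without arguments it reads a built-in JVM MBean, so it can be
run without a broker.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class JmxAttributeSketch {
    // MBean name copied from the list above; the double quotes are part of the name.
    static final String ACTIVE_CONTROLLER =
        "\"kafka.controller\":type=\"KafkaController\",name=\"ActiveControllerCount\"";

    // Reads one attribute of one MBean. Checked JMX exceptions are wrapped
    // so that callers do not have to declare them.
    static Object readAttribute(MBeanServerConnection conn, String mbean, String attribute) {
        try {
            return conn.getAttribute(new ObjectName(mbean), attribute);
        } catch (Exception e) {
            throw new IllegalStateException("cannot read " + attribute + " of " + mbean, e);
        }
    }

    // Returns the name= key of an MBean name with the surrounding quotes removed.
    static String metricName(String mbean) {
        try {
            return ObjectName.unquote(new ObjectName(mbean).getKeyProperty("name"));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a quoted MBean name: " + mbean, e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No broker given: read a standard JVM MBean from this process instead.
            System.out.println(readAttribute(
                ManagementFactory.getPlatformMBeanServer(),
                "java.lang:type=Threading", "ThreadCount"));
            return;
        }
        // args[0] is a hypothetical broker JMX endpoint of the form host:port.
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // "Value" is an assumed attribute name; verify it in jconsole.
            System.out.println(metricName(ACTIVE_CONTROLLER) + " = "
                + readAttribute(conn, ACTIVE_CONTROLLER, "Value"));
        }
    }
}
```

The same readAttribute call could be pointed at any of the other MBean
names in this message by swapping the name string and the attribute.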
