[ 
https://issues.apache.org/jira/browse/KAFKA-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna reassigned KAFKA-10484:
-------------------------------------

    Assignee: Bruno Cadonna

> Reduce Metrics Exposed by Streams
> ---------------------------------
>
>                 Key: KAFKA-10484
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10484
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 2.6.0
>            Reporter: Bruno Cadonna
>            Assignee: Bruno Cadonna
>            Priority: Major
>
> In our test cluster, metrics are monitored through a monitoring service. On
> several occasions, a Kafka Streams client exceeded the monitoring service's
> limit of 350 metrics. When a client exceeds the limit, its metrics are
> truncated, which can result in false alerts. For example, in our cluster we
> monitor the alive stream threads and trigger an alert if a stream thread
> dies. When the client exceeded the 350-metric limit, the alive stream threads
> metric was truncated, which led to a false alarm.
> The main drivers of the high number of metrics are the metrics on task level
> and below, for example the state store metrics. The number of such metrics
> per Kafka Streams client is hard to predict since it depends on which tasks
> are assigned to the client. A stateful task with 5 state stores reports 5
> times as many state store metrics as a stateful task with only one state
> store. Sometimes it is possible to report the metrics of only some state
> stores, but sometimes this is not an option. For example, if we want to
> monitor the memory usage of RocksDB per Kafka Streams client, we need to
> report the memory-related metrics of all RocksDB state stores of all tasks
> assigned to all stream threads of one client.
> One option to reduce the number of reported metrics is to add a client-level
> metric within Kafka Streams that aggregates some state store metrics, e.g.,
> those used to monitor memory usage.
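
The aggregation idea in the last paragraph can be illustrated with a minimal,
self-contained sketch. It does not use the Kafka Streams API; the StoreMetric
class, the "memory-bytes" metric name, and the thread, task, and store IDs are
all hypothetical stand-ins chosen for illustration. The sketch only shows how
summing store-level readings by metric name could collapse many reported
metrics into one per client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ClientLevelAggregation {

    /** Hypothetical stand-in for one store-level metric reading (not a Kafka class). */
    static final class StoreMetric {
        final String threadId;
        final String taskId;
        final String storeId;
        final String name;
        final double value;

        StoreMetric(String threadId, String taskId, String storeId, String name, double value) {
            this.threadId = threadId;
            this.taskId = taskId;
            this.storeId = storeId;
            this.name = name;
            this.value = value;
        }
    }

    /**
     * Sums all store-level readings that share a metric name, dropping the
     * thread, task, and store dimensions. The result has one entry per metric
     * name, regardless of how many tasks or stores the client hosts.
     */
    static Map<String, Double> aggregateToClientLevel(List<StoreMetric> storeMetrics) {
        Map<String, Double> totals = new TreeMap<>();
        for (StoreMetric m : storeMetrics) {
            totals.merge(m.name, m.value, Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Assumed layout: one thread, two tasks, five stores per task,
        // one (hypothetical) memory metric per store.
        List<StoreMetric> reported = new ArrayList<>();
        for (int task = 0; task < 2; task++) {
            for (int store = 0; store < 5; store++) {
                reported.add(new StoreMetric(
                        "thread-1", "0_" + task, "store-" + store, "memory-bytes", 1024.0));
            }
        }

        Map<String, Double> clientLevel = aggregateToClientLevel(reported);

        System.out.println("store-level metrics reported: " + reported.size());
        System.out.println("client-level metrics reported: " + clientLevel.size());
        System.out.println("memory-bytes total: " + clientLevel.get("memory-bytes"));
    }
}
```

With this assumed layout, ten store-level readings collapse into a single
client-level value, and that count stays at one no matter which tasks are
assigned to the client. Summation suits additive quantities such as memory
usage; other metrics (rates, averages) would need a different aggregation.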



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
