[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722752#comment-15722752 ]

Eno Thereska commented on KAFKA-3811:
-------------------------------------

This JIRA will be resolved as part of https://issues.apache.org/jira/browse/KAFKA-3715

> Introduce Kafka Streams metrics recording levels
> ------------------------------------------------
>
> Key: KAFKA-3811
> URL: https://issues.apache.org/jira/browse/KAFKA-3811
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Greg Fodor
> Assignee: Eno Thereska
> Attachments: Muon-Snapshot.zip, Muon-latency.zip, screenshot-1.png, screenshot-latency.png
>
> Follow-up from the discussions here:
> https://github.com/apache/kafka/pull/1447
> https://issues.apache.org/jira/browse/KAFKA-3769
> The proposal is to introduce configuration to control the granularity/volumes of metrics emitted by Kafka Streams jobs, since the per-record level metrics introduce non-trivial overhead and are possibly less useful once a job has been optimized.
>
> Proposal from guozhangwang:
> level0 (stream thread global): per-record process / punctuate latency, commit latency, poll latency, etc.
> level1 (per processor node, and per state store): IO latency, per-record .. latency, forward throughput, etc.
> And by default we only turn on level0.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
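The level0/level1 gating the description proposes could be sketched as follows. This is illustrative only -- the class and enum names here are assumptions, not the eventual Kafka API -- but it shows the intended behavior: a sensor tagged level1 records nothing unless level1 has been enabled globally, and level0 stays on by default.

```java
// Illustrative sketch of level-gated metric recording (names are hypothetical,
// not the final Kafka API). LEVEL0 is always on; LEVEL1 only when configured.
public class RecordingLevels {

    public enum Level {
        LEVEL0, // stream-thread globals: process/punctuate/commit/poll latency
        LEVEL1  // per processor node and per state store: IO latency, throughput
    }

    // The globally configured level; defaults to LEVEL0 per the proposal.
    private static volatile Level configured = Level.LEVEL0;

    public static void setConfigured(Level level) { configured = level; }

    // A sensor tagged LEVEL1 records only when LEVEL1 is enabled globally.
    public static boolean shouldRecord(Level sensorLevel) {
        return sensorLevel.ordinal() <= configured.ordinal();
    }

    public static void main(String[] args) {
        System.out.println(shouldRecord(Level.LEVEL1)); // false by default
        setConfigured(Level.LEVEL1);
        System.out.println(shouldRecord(Level.LEVEL1)); // true once enabled
    }
}
```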
[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326855#comment-15326855 ]

aarti gupta commented on KAFKA-3811:
------------------------------------

Yes indeed, as per http://www.brendangregg.com/blog/2014-06-09/java-cpu-sampling-using-hprof.html

I tried to do something similar using SimpleBenchmark (on a different fix) and the YourKit profiler, as [~gfodor] mentions below, here: https://github.com/apache/kafka/pull/1446#issuecomment-225488213, but I'm not convinced that the results mean anything conclusive. Thoughts/suggestions on a repeatable/consistent toolset?

I like the idea of (3):
>> Add a general purpose feature to the metrics library and use it across the producer, consumer, and streams.
But before we refactor the existing library I want a reproducible test. Any suggestions on a scenario other than SimpleBenchmark?
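For reference, the sampling approach from the linked post boils down to one JVM flag (hprof ships with JDK 8 and earlier; the benchmark class name below is an assumption standing in for whatever workload is being profiled):

```shell
# CPU sampling via the JVM's built-in hprof agent (JDK 8 and earlier).
# interval is the sample period in ms, depth the recorded stack depth;
# results are written to java.hprof.txt on exit.
java -agentlib:hprof=cpu=samples,interval=10,depth=100 \
     org.apache.kafka.streams.perf.SimpleBenchmark
```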
[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324869#comment-15324869 ]

Greg Fodor commented on KAFKA-3811:
-----------------------------------

Also, I've attached a screenshot + snapshot of a second run where I started sending data deeper into the pipeline, which caused the latency metrics to take up a few % of time since we're using state stores. To me, a lot of this looks mostly like lock contention.
[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324847#comment-15324847 ]

Greg Fodor commented on KAFKA-3811:
-----------------------------------

I've also attached a YourKit screenshot of the relevant call stacks.
[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324838#comment-15324838 ]

Greg Fodor commented on KAFKA-3811:
-----------------------------------

Hey [~aartigupta], I attached the YourKit profiler to one of our jobs running dark against production data. The job has 200-300 topic-partition pairs, generally discards most messages early in the pipeline, and was processing a few thousand tps from the top-level topics. Unfortunately, since this issue came up we implemented changes to reduce the amount of data running through the system (discarding it earlier), so we no longer had to worry about this performance problem.

In my tests a majority of the job's CPU time was spent inside the code walking and emitting to the Sensors for the per-message process metrics and the per-k/v read/write latency metrics. I also found 6-7% of the time was spent in the fetcher metrics, which was addressed here: https://github.com/apache/kafka/pull/1464

Good news: I managed to find the snapshot data :) I will attach it here. The majority of the time is *not* the milliseconds() call but the actual (synchronized?) walk of Sensors in Sensor.record.
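The hot path described above can be reproduced in miniature. This toy stand-in (not Kafka's actual classes) shows the shape of the cost: each record() acquires a lock and walks every registered stat, so per-record overhead scales with the number of stats attached to the sensor, independent of any clock call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleConsumer;

// Toy stand-in for a metrics sensor: record() takes a lock and walks every
// registered stat, mirroring the per-record cost observed in the profiles.
public class ToySensor {
    private final List<DoubleConsumer> stats = new ArrayList<>();

    public void add(DoubleConsumer stat) { stats.add(stat); }

    public void record(double value) {
        synchronized (this) {              // one lock acquisition per record
            for (DoubleConsumer s : stats) // plus a walk over every stat
                s.accept(value);
        }
    }

    public static void main(String[] args) {
        ToySensor sensor = new ToySensor();
        double[] sum = new double[1];
        long[] count = new long[1];
        sensor.add(v -> sum[0] += v);   // avg numerator
        sensor.add(v -> count[0]++);    // avg denominator
        sensor.add(v -> { });           // e.g. a max tracker
        for (int i = 0; i < 1_000_000; i++)
            sensor.record(1.0);         // 1M lock acquisitions, 3M stat updates
        System.out.println(sum[0] + " over " + count[0] + " records");
    }
}
```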
[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324760#comment-15324760 ]

aarti gupta commented on KAFKA-3811:
------------------------------------

>> Make the metrics lower overhead (this is an issue in the producer too).
[~jkreps] can you share details on how these measurements were done?
[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324755#comment-15324755 ]

aarti gupta commented on KAFKA-3811:
------------------------------------

[~gfodor] Can you please share and outline the profiling and analysis you did around metrics overhead?

I just ran the 'WordProcessorDemo' with actual data (continuously being published) on the input stream and profiled the streams example using both Java Mission Control Flight Recorder and the YourKit profiler (evaluation version), but see only a 5% CPU overhead for the entire process. How are you isolating the time taken to stamp metrics?
[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324669#comment-15324669 ]

Jay Kreps commented on KAFKA-3811:
----------------------------------

I'm not wild about introducing these levels in an ad hoc way in Kafka Streams. A couple of other options:
1. Make the metrics lower overhead (this is an issue in the producer too).
2. Optimize the usage of metrics in the consumer and streams (i.e. in the producer we increment metrics in batch to avoid locking on each message).
3. Add a general purpose feature to the metrics library and use it across the producer, consumer, and streams.

For (3), here is what I am thinking: I think what you are describing is a bit like log4j, where there is DEBUG-level logging that is cheap or free when you haven't turned it on. Basically what I'm imagining is that there would be a new attribute in org.apache.kafka.common.metrics.Sensor that is something like DEBUG/INFO, and then there is a global level that is set (and perhaps can be changed via JMX); the locking and update of the sensor only happen if the appropriate level or lower is active. Then we would categorize existing metrics with this category through the producer, consumer, and streams. (Arguably this should be at the metric level rather than the sensor level, but I'm not sure if it's possible to make that cheap -- if so, that might be better.)
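Option (3) above could look roughly like the following sketch. The names are assumptions, not the eventual org.apache.kafka.common.metrics API: the point is that the level check is a single volatile read done *before* taking the sensor's lock, so DEBUG sensors cost almost nothing while the global level is INFO.

```java
// Sketch of option (3): a per-sensor recording level checked against a global
// level before any locking or stat update. Names here are illustrative, not
// the final org.apache.kafka.common.metrics API.
public class LeveledSensor {

    public enum RecordingLevel { INFO, DEBUG }

    // Global level, e.g. set from config or via JMX; volatile so the hot-path
    // check is one cheap read with no locking.
    private static volatile RecordingLevel globalLevel = RecordingLevel.INFO;

    public static void setGlobalLevel(RecordingLevel level) {
        globalLevel = level;
    }

    private final RecordingLevel level;
    private double total; // stand-in for the sensor's registered stats
    private long count;

    public LeveledSensor(RecordingLevel level) { this.level = level; }

    public void record(double value) {
        // A DEBUG sensor bails out before locking while only INFO is active.
        if (level.ordinal() > globalLevel.ordinal())
            return;
        synchronized (this) {
            total += value;
            count++;
        }
    }

    public long count() { return count; }
}
```

The same check could live on each metric instead of the sensor, as the comment notes, at the cost of one comparison per metric rather than one per record.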
[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323616#comment-15323616 ]

Guozhang Wang commented on KAFKA-3811:
--------------------------------------

Yeah, I think they are related. Feel free to re-assign it to yourself. One thing that I'm working on is optimizing some metrics overhead due to frequent calls to `time.milliseconds`; some details are discussed in https://github.com/apache/kafka/pull/1447. So we may have some overlap in the code base, and hence some rebasing may be needed along the way. Just FYI.
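One shape the `time.milliseconds` optimization above can take: read the clock once per loop iteration and pass that timestamp into every recording, instead of letting each sensor fetch the time itself. A sketch with a hypothetical sensor class (Kafka's Sensor does expose a record overload taking a caller-supplied time, but the class below is illustrative only):

```java
// Sketch of amortizing clock reads: fetch the time once per loop iteration
// and thread it through all metric recordings. CountingSensor is a
// hypothetical stand-in, not a Kafka class.
public class BatchedTime {

    static class CountingSensor {
        long lastTimeMs;
        long recordings;

        // The caller supplies the timestamp, so the sensor never reads the clock.
        void record(double value, long timeMs) {
            lastTimeMs = timeMs;
            recordings++;
        }
    }

    public static void main(String[] args) {
        CountingSensor processLatency = new CountingSensor();
        CountingSensor pollLatency = new CountingSensor();
        for (int i = 0; i < 3; i++) {
            long nowMs = System.currentTimeMillis(); // one clock read per iteration...
            processLatency.record(1.0, nowMs);       // ...shared by every sensor
            pollLatency.record(2.0, nowMs);
        }
        // 2 recordings per iteration, but only 1 clock read per iteration.
        System.out.println(processLatency.recordings + pollLatency.recordings);
    }
}
```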
[jira] [Commented] (KAFKA-3811) Introduce Kafka Streams metrics recording levels
[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323347#comment-15323347 ]

aarti gupta commented on KAFKA-3811:
------------------------------------

Wondering how this relates to https://issues.apache.org/jira/browse/KAFKA-3715 and https://github.com/apache/kafka/pull/1446. Do user-defined metrics have a level?

[~guozhang] would you like me to take this up? (If you think it overlaps with https://issues.apache.org/jira/browse/KAFKA-3715.)