[jira] [Commented] (KAFKA-7660) Stream Metrics - Memory Analysis

Patrik Kleindl (JIRA) Fri, 23 Nov 2018 04:19:17 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-7660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697057#comment-16697057
 ]


Patrik Kleindl commented on KAFKA-7660:
---------------------------------------

Hi [~vvcephei]

Thanks for taking a look.

I mainly reported it because it didn't match my expectations, and I thought it 
would be good if someone more knowledgeable could investigate whether this 
could be some kind of leak.

The memory impact is not a problem (yet), at least not compared to the buffer 
caches for streams etc.

I could not find the memory leak you mentioned, unless you mean 
https://issues.apache.org/jira/browse/KAFKA-7304

If you could point me to a specific class, I can check whether that matches. 
We are already using 2.0.0-cp1 and are planning to switch to 2.0.1-cp1 at 
least.

 

Regarding String duplication: I did some reading and 
[https://stackoverflow.com/questions/34937046/why-is-that-a-concatenation-using-operator-of-a-java-reference-variable-and]
 confirms that Strings produced by runtime concatenation are not interned 
unless this is done explicitly with String.intern().

Maybe org.apache.kafka.streams.kstream.internals.metrics.Sensors.java needs to 
use some final String constants here to fix that.
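
To illustrate the point about interning, here is a minimal standalone sketch. 
It is not taken from the Kafka code base: the class InternDemo and the helper 
groupFor are hypothetical names, and the string values are only borrowed from 
the findings in the report. It shows that a String built from a non-constant 
operand at runtime is a new object on every call, while a concatenation of 
literals is a compile-time constant and comes from the string pool.

```java
public class InternDemo {
    // Compile-time constant: both operands are literals, so the compiler
    // folds them into one pooled (interned) String.
    static final String GROUP = "stream-" + "processor-node-metrics";

    // Hypothetical helper: 'scope' is not a constant, so the concatenation
    // happens at runtime and yields a newly created String on every call.
    static String groupFor(String scope) {
        return "stream-" + scope + "-metrics";
    }

    public static void main(String[] args) {
        String a = groupFor("processor-node");
        String b = groupFor("processor-node");

        System.out.println(a.equals(b));               // same content
        System.out.println(a == b);                    // distinct objects
        System.out.println(a.intern() == b.intern());  // one pooled instance
        System.out.println(a.intern() == GROUP);       // same as the constant
    }
}
```

If the duplicates really come from runtime concatenation, either referencing a 
shared constant or calling intern() on the result would collapse them to one 
instance, at the cost of a pool lookup per call.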

 

Regarding the sensors in general: Does this mean that Sensors/Metrics are only 
created at streams startup and should remain constant after that? Then at least 
I can check if the number of objects stays constant.

Or can Sensors etc. change dynamically? Otherwise I wouldn't understand the 
massive number of Strings involved.

I did a small test to see how many objects are created for a single KTable. 
Do you have any reference for how many are expected? I saw about 500 Sensor 
instances per KTable iirc.

 

> Stream Metrics - Memory Analysis
> --------------------------------
>
>                 Key: KAFKA-7660
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7660
>             Project: Kafka
>          Issue Type: Bug
>          Components: metrics, streams
>    Affects Versions: 2.0.0
>            Reporter: Patrik Kleindl
>            Priority: Minor
>         Attachments: Mem_Collections.jpeg, Mem_DuplicateStrings.jpeg, 
> Mem_DuplicateStrings2.jpeg, Mem_Hotspots.jpeg, Mem_KeepAliveSet.jpeg, 
> Mem_References.jpeg
>
>
> During the analysis of JVM memory two possible issues were shown which I 
> would like to bring to your attention:
> 1) Duplicate strings
> Top findings: 
> string_content="stream-processor-node-metrics" count="534,277"
> string_content="processor-node-id" count="148,437"
> string_content="stream-rocksdb-state-metrics" count="41,832"
> string_content="punctuate-latency-avg" count="29,681" 
>  
> "stream-processor-node-metrics"  seems to be used in Sensors.java as a 
> literal and not interned.
>  
> 2) The HashMap parentSensors from 
> org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl
>  was reported multiple times as suspicious for potentially keeping alive a 
> lot of objects. In our case the reported size was 40-50MB each.
> I haven't looked too deep in the code but noticed that the class Sensor.java 
> which is used as a key in the HashMap does not implement equals or hashCode 
> method. Not sure this is a problem though.
>  
> The analysis was done with Dynatrace 7.0
> We are running Confluent 5.0/Kafka2.0-cp1 (Brokers as well as Clients)
>  
> Screenshots are attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
