[ https://issues.apache.org/jira/browse/KAFKA-10396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias J. Sax reassigned KAFKA-10396:
---------------------------------------

    Assignee: Rohan Desai

> Overall memory of container keep on growing due to kafka stream / rocksdb and OOM killed once limit reached
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-10396
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10396
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.3.1, 2.5.0
>            Reporter: Vagesh Mathapati
>            Assignee: Rohan Desai
>            Priority: Critical
>         Attachments: CustomRocksDBConfig.java, MyStreamProcessor.java, kafkaStreamConfig.java
>
> We are observing that the overall memory of our container keeps growing and never comes down.
> After analysis we found that rocksdbjni.so keeps allocating 64M chunks of memory off-heap and never releases them. This causes an OOM kill once memory reaches the configured limit.
> We use Kafka Streams and GlobalKTables for many of our Kafka topics.
>
> Below is our environment:
> * Kubernetes cluster
> * openjdk 11.0.7 2020-04-14 LTS
> * OpenJDK Runtime Environment Zulu11.39+16-SA (build 11.0.7+10-LTS)
> * OpenJDK 64-Bit Server VM Zulu11.39+16-SA (build 11.0.7+10-LTS, mixed mode)
> * Springboot 2.3
> * spring-kafka-2.5.0
> * kafka-streams-2.5.0
> * kafka-streams-avro-serde-5.4.0
> * rocksdbjni-5.18.3
>
> We observed the same result with Kafka 2.3.
>
> Below is a snippet of our analysis.
>
> From the pmap output we took the addresses of these 64M allocations (RSS):
>
> Address           Kbytes  RSS    Dirty  Mode   Mapping
> 00007f3ce8000000  65536   65532  65532  rw---  [ anon ]
> 00007f3cf4000000  65536   65536  65536  rw---  [ anon ]
> 00007f3d64000000  65536   65536  65536  rw---  [ anon ]
>
> We tried to match these with memory allocation logs, enabled with the help of the Azul Systems team:
>
> @ /tmp/librocksdbjni6564497922441568920.so: _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0x261)[0x7f3e1c65d741] - 0x7f3ce8ff7ca0
> @ /tmp/librocksdbjni6564497922441568920.so: _ZN7rocksdb15BlockBasedTable3GetERKNS_11ReadOptionsERKNS_5SliceEPNS_10GetContextEPKNS_14SliceTransformEb+0x894)[0x7f3e1c898fd4] - 0x7f3ce8ff9780
> @ /tmp/librocksdbjni6564497922441568920.so: _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0xfa)[0x7f3e1c65d5da] - 0x7f3ce8ff9750
> @ /tmp/librocksdbjni6564497922441568920.so: _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0x261)[0x7f3e1c65d741] - 0x7f3ce8ff97c0
> @ /tmp/librocksdbjni6564497922441568920.so: _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0xfa)[0x7f3e1c65d5da] - 0x7f3ce8ffccf0
> @ /tmp/librocksdbjni6564497922441568920.so: _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0x261)[0x7f3e1c65d741] - 0x7f3ce8ffcd10
> @ /tmp/librocksdbjni6564497922441568920.so: _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0xfa)[0x7f3e1c65d5da] - 0x7f3ce8ffccf0
> @ /tmp/librocksdbjni6564497922441568920.so: _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0x261)[0x7f3e1c65d741] - 0x7f3ce8ffcd10
>
> We also identified that the content of these 64M regions is all zeros, with no data present in them.
> I tried to tune the RocksDB configuration as described in the Kafka Streams developer guide, but it did not help:
> [https://docs.confluent.io/current/streams/developer-guide/config-streams.html#streams-developer-guide-rocksdb-config]
>
> Please let me know if you need any more details.
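
Editor's note for context on the tuning referenced above: the guide linked by the reporter bounds RocksDB's off-heap usage through a custom RocksDBConfigSetter registered in the Streams configuration (the attached CustomRocksDBConfig.java presumably plays this role, but its contents are not shown in this issue). The following is only a minimal sketch along the lines of the bounded-memory example in the Kafka Streams memory-management documentation; the class name and memory budgets are placeholders, not the reporter's actual settings.

import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;
import org.rocksdb.WriteBufferManager;

/**
 * Minimal sketch of a bounded-memory RocksDB config setter, following the
 * pattern in the Kafka Streams memory-management documentation. The budget
 * values below are placeholders and must be sized to the container limit.
 */
public class BoundedMemoryRocksDBConfig implements RocksDBConfigSetter {

    // Placeholder budgets in bytes -- not taken from the reporter's environment.
    private static final long TOTAL_OFF_HEAP_MEMORY = 128 * 1024 * 1024L;
    private static final long TOTAL_MEMTABLE_MEMORY = 32 * 1024 * 1024L;

    // Shared by every store instance so the cap is global rather than per store.
    private static final Cache CACHE =
            new LRUCache(TOTAL_OFF_HEAP_MEMORY, -1, false, 0.1);
    private static final WriteBufferManager WRITE_BUFFER_MANAGER =
            new WriteBufferManager(TOTAL_MEMTABLE_MEMORY, CACHE);

    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        final BlockBasedTableConfig tableConfig =
                (BlockBasedTableConfig) options.tableFormatConfig();

        // Serve data, index and filter blocks from the shared, size-limited cache.
        tableConfig.setBlockCache(CACHE);
        tableConfig.setCacheIndexAndFilterBlocks(true);
        tableConfig.setCacheIndexAndFilterBlocksWithHighPriority(true);
        tableConfig.setPinTopLevelIndexAndFilter(true);

        // Count memtable (write buffer) memory against the same cache.
        options.setWriteBufferManager(WRITE_BUFFER_MANAGER);

        options.setTableFormatConfig(tableConfig);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Do not close CACHE or WRITE_BUFFER_MANAGER here: they are shared
        // across all store instances of this Streams application.
    }
}

Such a setter is registered through the rocksdb.config.setter property, e.g. props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, BoundedMemoryRocksDBConfig.class), which applies it to every RocksDB store of the application, including GlobalKTable stores.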