[ https://issues.apache.org/jira/browse/KAFKA-16086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17803506#comment-17803506 ]
Nicholas Telford edited comment on KAFKA-16086 at 1/5/24 10:45 AM:
-------------------------------------------------------------------

As discussed on Slack:

{{rocksdb::port::cacheline_aligned_alloc}} is called by {{StatisticsImpl}} _once per-core_ to allocate a block of memory for storing stats tickers. The size of this block of memory looks to be _at least_ 2112 bytes (enough to store 199 tickers and 62 histograms, aligned to the cache line size). For example, if the running machine has 16 cores, this would be 16*2112 = 33 KiB per-invocation.

Our temporary {{Options}} object passes the global {{DBOptions}} object in its constructor. This invokes the copy-constructor on {{DBOptions}} copying the {{Statistics}} that was configured on {{DBOptions}}. Since we never {{close()}} the {{Options}}, this copied {{Statistics}} leaks.


was (Author: nicktelford):
As discussed on Slack:

{{rocksdb::port::cacheline_aligned_alloc}} is called by {{StatisticsImpl}} _once per-core_ to allocate a block of memory for storing stats tickers. The size of this block of memory looks to be _at least_ 2112 bytes (enough to store 199 tickers and 62 histograms, aligned to the cache line size). For example, if the running machine has 16 cores, this would be 16*2112 = 33 KiB invocation.

Our temporary {{Options}} object passes the global {{DBOptions}} object in its constructor. This invokes the copy-constructor on {{DBOptions}} copying the {{Statistics}} that was configured on {{DBOptions}}. Since we never {{close()}} the {{Options}}, this copied {{Statistics}} leaks.

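To make the arithmetic and the leak pattern in the comment concrete, here is a minimal sketch in plain Java. It does not use the RocksDB API: {{FakeOptions}} is a hypothetical stand-in for a native-backed object, the byte counter is simulated, and the 2112-byte per-core figure is the lower bound quoted in the comment, taken here as an assumption.

```java
import java.util.concurrent.atomic.AtomicLong;

final class StatisticsLeakSketch {
    // Assumed lower bound from the comment: bytes allocated per core.
    static final long BYTES_PER_CORE = 2112;

    // Simulated count of bytes currently held by stand-in "native" objects.
    static final AtomicLong liveNativeBytes = new AtomicLong();

    // Hypothetical stand-in for a native-backed object; not the RocksDB Options class.
    static final class FakeOptions implements AutoCloseable {
        private final long size;
        private boolean closed;

        FakeOptions(int cores) {
            // One block per core, mirroring the per-core allocation described above.
            this.size = cores * BYTES_PER_CORE;
            liveNativeBytes.addAndGet(size);
        }

        @Override
        public void close() {
            // Release the simulated native memory exactly once.
            if (!closed) {
                closed = true;
                liveNativeBytes.addAndGet(-size);
            }
        }
    }

    // Creates temporary objects and never closes them: the simulated memory stays allocated.
    static long leakyInvocations(int cores, int invocations) {
        long before = liveNativeBytes.get();
        for (int i = 0; i < invocations; i++) {
            new FakeOptions(cores);
        }
        return liveNativeBytes.get() - before;
    }

    // Same work, but try-with-resources closes each temporary object.
    static long closedInvocations(int cores, int invocations) {
        long before = liveNativeBytes.get();
        for (int i = 0; i < invocations; i++) {
            try (FakeOptions options = new FakeOptions(cores)) {
                // use options here
            }
        }
        return liveNativeBytes.get() - before;
    }

    public static void main(String[] args) {
        System.out.println(leakyInvocations(16, 1));     // 16 * 2112 = 33792 bytes (33 KiB)
        System.out.println(closedInvocations(16, 1000)); // 0
    }
}
```

In this sketch the leak grows linearly with the number of unclosed temporary objects, while the try-with-resources variant returns to its starting level after every invocation.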
> Kafka Streams has RocksDB native memory leak
> --------------------------------------------
>
>                 Key: KAFKA-16086
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16086
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 3.7.0
>            Reporter: Lucas Brutschy
>            Assignee: Nicholas Telford
>            Priority: Blocker
>              Labels: streams
>         Attachments: image.png
>
>
> The current 3.7 and trunk versions are leaking native memory while running
> Kafka streams over several hours. This will likely kill any real workload
> over time, so this should be treated as a blocker bug for 3.7.
> This is discovered in a long-running soak test. Attached is the memory
> consumption, which steadily approaches 100% and then the JVM is killed.
> Rerunning the same test with jemalloc native memory profiling, we see these
> allocated objects after a few hours:
>
> {noformat}
> (jeprof) top
> Total: 13283138973 B
> 10296829713  77.5%  77.5% 10296829713  77.5% rocksdb::port::cacheline_aligned_alloc
>  2487325671  18.7%  96.2%  2487325671  18.7% rocksdb::BlockFetcher::ReadBlockContents
>   150937547   1.1%  97.4%   150937547   1.1% rocksdb::lru_cache::LRUHandleTable::LRUHandleTable
>   119591613   0.9%  98.3%   119591613   0.9% prof_backtrace_impl
>    47331433   0.4%  98.6%   105040933   0.8% rocksdb::BlockBasedTable::PutDataBlockToCache
>    32516797   0.2%  98.9%    32516797   0.2% rocksdb::Arena::AllocateNewBlock
>    29796095   0.2%  99.1%    30451535   0.2% Java_org_rocksdb_Options_newOptions
>    18172716   0.1%  99.2%    20008397   0.2% rocksdb::InternalStats::InternalStats
>    16032145   0.1%  99.4%    16032145   0.1% rocksdb::ColumnFamilyDescriptorJni::construct
>    12454120   0.1%  99.5%    12454120   0.1% std::_Rb_tree::_M_insert_unique
> {noformat}
>
>
> The first hypothesis is that this is caused by the leaking `Options` object
> introduced in this line:
>
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java#L312|https://github.com/apache/kafka/pull/14852]
>
> Introduced in this PR:
> [https://github.com/apache/kafka/pull/14852|https://github.com/apache/kafka/pull/14852]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)