Adam Binford created SPARK-41339:
------------------------------------

             Summary: RocksDB State Store WriteBatch doesn't cleanup native 
memory
                 Key: SPARK-41339
                 URL: https://issues.apache.org/jira/browse/SPARK-41339
             Project: Spark
          Issue Type: Bug
          Components: SQL, Structured Streaming
    Affects Versions: 3.3.1
            Reporter: Adam Binford


The RocksDB state store uses a WriteBatch to hold the updates that get written in a 
single transaction on commit. After a successful task, abort is (somewhat 
indirectly) called, which in turn calls writeBatch.clear(). However, the data for a 
WriteBatch is stored in a std::string in the native code (it is unclear why a 
string is used, but it is): [rocksdb/write_batch.h at main · facebook/rocksdb · 
GitHub|https://github.com/facebook/rocksdb/blob/main/include/rocksdb/write_batch.h#L491]

WriteBatch::Clear() simply calls rep_.clear() and rep_.resize() 
([rocksdb/write_batch.cc at main · facebook/rocksdb · 
GitHub|https://github.com/facebook/rocksdb/blob/main/db/write_batch.cc#L246-L247]),
 neither of which actually releases the memory accumulated by a std::string 
instance. The only way to actually release this memory is to delete the 
WriteBatch object itself.

Currently, the memory taken by every write batch remains allocated until the RocksDB 
state store instance is closed, which never happens during normal operation, since 
all partitions stay loaded on an executor after a task completes.


