Hi everyone,

I tested this code to verify the impact of batch size on memory: https://gist.github.com/zymap/19249ab35bb0f64c55cbf7f2e8356cb3

I found that memory keeps increasing with the batch size. If the batch is not flushed into the SST files, it is saved in the WAL file, and the WAL file is not limited by `max_total_wal_size`. If the bookie is OOM killed because of a large batch that was saved in the WAL, the only way to reopen RocksDB is to give the bookie more memory. I also discussed this issue with the RocksDB community, and they said:

> when the batch size is so large (esp if you run multiple batches together) the wal size may reach (limit + batch-size * number of open batches). We have a project opened by our friends from Kafka streams to handle huge batch size. In the meanwhile can you restrict the size of your batch ?

In Pulsar, the compacted ledger has no rollover or retention policy. If a user has tons of keys in the compacted topic, the compacted ledger grows bigger and bigger. In our environment, a compacted ledger reached 200 GB. It contains lots of entries in a single ledger, which makes the delete batch very large. In release 4.14.7 and branch-4.15, we don't limit the number of delete operations in a single batch:
https://github.com/apache/bookkeeper/blob/c22136b03489db1643521f586e9cae2c4a511e10/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java#L241

As a result, when the bookie runs garbage collection and removes such a ledger, it is OOM killed because of the large batch.

Pulsar already has a proposal about configuring retention for the compacted topic ledger: https://github.com/apache/pulsar/issues/19665. But I still think we also need a way to control the batch size, so that we have a way to limit the memory usage.

Here is the PR for this change: https://github.com/apache/bookkeeper/pull/4044

Best,
Yong
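To illustrate the idea (this is only a simplified sketch, not the actual PR code; the class, method, and the cap of 100,000 operations are hypothetical), the fix amounts to flushing the delete batch whenever it reaches a configured cap, instead of accumulating every delete for a 200 GB ledger in one RocksDB batch:

```java
// Hypothetical sketch (not the actual EntryLocationIndex code): bound the
// number of delete operations buffered in a single batch, flushing whenever
// the cap is hit, instead of building one giant batch for the whole ledger.
public class BoundedBatchDemo {

    /**
     * Simulates deleting {@code totalEntries} entry-location keys while
     * flushing the batch every {@code maxBatchOps} operations.
     * Returns the number of flushes issued.
     */
    static int run(long totalEntries, int maxBatchOps) {
        int pending = 0;   // operations buffered in the current batch
        int flushes = 0;   // how many times the batch was written out
        for (long key = 0; key < totalEntries; key++) {
            pending++;     // a real implementation would call batch.delete(key) here
            if (pending >= maxBatchOps) {
                flushes++; // batch.flush() bounds memory to maxBatchOps ops
                pending = 0;
            }
        }
        if (pending > 0) { // flush the final partial batch
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        // 250,003 deletes with a cap of 100,000 ops per batch -> 3 flushes
        System.out.println(run(250_003, 100_000));
    }
}
```

With this pattern, the peak batch (and WAL) footprint is proportional to the cap rather than to the ledger size, which is what keeps the bookie from being OOM killed during garbage collection.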
I found the memory keeps increasing with the batch size. And if the batch is not flushed into the sst, it will save into the WAL file, and the WAL file will not be limited by `max_total_wal_size`. If it was OOM killed because of the large batch and the batch was saved in the WAL. The only way to reopen the rocksDB is to add more memory for the bookie. I also talk this issue with rocksDB community, they said: > when the batch size is so large (esp if you run multiple batches together) the wal size may reach (limit + batch-size * number of open batches). We have a project opened by our friends from Kafka streams to handle huge batch size. In the meanwhile can you restrict the size of your batch ? -- In the Pulsar, the compacted ledger hasn't a rollover policy or retention policy. If the user has tons of keys in the compaction, that would make the compacted ledger bigger and bigger. In our environment, a compacted ledger reached 200G. It contains lots of entries in a single ledger, which makes the batch very large. In release 4.14.7 and branch-4.15, we didn't limit the delete operation numbers in a single batch. https://github.com/apache/bookkeeper/blob/c22136b03489db1643521f586e9cae2c4a511e10/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java#L241 Finally, when bookie runs the garbage collection and removes the ledger, it will be OOM killed because of the large batch. Pulsar already has a proposal about configuring the compacted topic ledger retention, https://github.com/apache/pulsar/issues/19665. But I still think we also need to have a way to control the batch size to make sure we have a way to limit the memory. Here is the PR for this change: https://github.com/apache/bookkeeper/pull/4044 Best, Yong