I was reading a bit about RocksDB, and it seems the Java version is somewhat
particular about how it must be closed to ensure all native resources are
freed:

<https://github.com/facebook/rocksdb/wiki/RocksJava-Basics#memory-management>

   - "Many of the Java Objects used in the RocksJava API will be backed by
   C++ objects for which the Java Objects have ownership. As C++ has no notion
   of automatic garbage collection for its heap in the way that Java does, we
   must explicitly free the memory used by the C++ objects when we are
   finished with them."

Column families also have a specific close procedure:

<https://github.com/facebook/rocksdb/wiki/RocksJava-Basics#opening-a-database-with-column-families>

   - "It is important to note that when working with Column Families in
   RocksJava, there is a very specific order of destruction that must be
   obeyed for the database to correctly free all resources and shutdown."
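
To make the ordering concrete, here is a minimal sketch of how I read the wiki's
example: column family handles are closed first, then the database, then the DB
options, and the column family options last. It deliberately does not use the
real RocksJava classes. `NativeResource` and the resource names are hypothetical
stand-ins for objects such as `ColumnFamilyHandle`, `RocksDB`, `DBOptions` and
`ColumnFamilyOptions`, and the only thing it demonstrates is the order in which
nested try-with-resources blocks release them.

```java
import java.util.ArrayList;
import java.util.List;

class CloseOrderSketch {
    // Records the order in which the stand-in resources are released.
    static final List<String> closed = new ArrayList<>();

    // Hypothetical stand-in for a Java object that owns native (C++) memory.
    static final class NativeResource implements AutoCloseable {
        private final String name;

        NativeResource(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            closed.add(name);
        }
    }

    public static List<String> run() {
        closed.clear();
        // Outermost scope: column family options are released last.
        try (NativeResource cfOptions = new NativeResource("cfOptions")) {
            List<NativeResource> handles = new ArrayList<>();
            // Resources in one try header are closed in reverse order:
            // db first, then dbOptions.
            try (NativeResource dbOptions = new NativeResource("dbOptions");
                 NativeResource db = new NativeResource("db")) {
                // Assumption: opening the database is what yields the handles.
                handles.add(new NativeResource("handle:default"));
                handles.add(new NativeResource("handle:my-cf"));
                try {
                    // ... reads and writes would go here ...
                } finally {
                    // Handles must be released before the database itself.
                    for (NativeResource handle : handles) {
                        handle.close();
                    }
                }
            }
        }
        return new ArrayList<>(closed);
    }

    public static void main(String[] args) {
        // prints [handle:default, handle:my-cf, db, dbOptions, cfOptions]
        System.out.println(run());
    }
}
```

The point of the nesting is that a failure while working with the database
still releases everything in the same order, which is the property I would
want to check for in the restore path.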

When a running job fails and a running TaskManager restores from a
checkpoint, is the old embedded RocksDB instance cleaned up properly? I wasn't
really sure where to look in the Flink source code to verify this.

On Mon, Oct 4, 2021 at 4:56 PM Kevin Lam <kevin....@shopify.com> wrote:

> We tried with 1.14.0, unfortunately we still run into the issue. Any
> thoughts or suggestions?
>
> On Mon, Oct 4, 2021 at 9:09 AM Kevin Lam <kevin....@shopify.com> wrote:
>
>> Hi Fabian,
>>
>> We're using our own image built from the official Flink docker image, so
>> we should have the code to use jemalloc in the docker entrypoint.
>>
>> I'm going to give 1.14 a try and will let you know how it goes.
>>
>> On Mon, Oct 4, 2021 at 8:29 AM Fabian Paul <fabianp...@ververica.com>
>> wrote:
>>
>>> Hi Kevin,
>>>
>>> We bumped the RocksDB version in Flink 1.14, which we thought would improve
>>> memory control [1]. In the past we also saw problems with the allocator
>>> used by the OS. We switched to jemalloc in our docker images, which
>>> causes less memory fragmentation [2]. Are you using the official Flink
>>> docker image or did you build your own?
>>>
>>> I am also pulling in Yun Tang, who is more familiar with Flink's state
>>> backend. Maybe he has an immediate idea about your problem.
>>>
>>> Best,
>>> Fabian
>>>
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-14482
>>> [2]
>>> https://lists.apache.org/thread.html/r596a19f8cf7278bcf9e30c3060cf00562677d4be072050444a5caf99%40%3Cdev.flink.apache.org%3E
>>>
>>>
>>>
