Hello Alexis,

I don't think the data stored in RocksDB resides in the JVM, even during
function calls.

For more details, check the link below:
https://github.com/facebook/rocksdb/wiki/RocksDB-Overview#3-high-level-architecture

RocksDB has three main components: the memtable, SST files, and the WAL (not
used in Flink, because Flink relies on checkpointing). When a TaskManager (TM)
starts with RocksDB as the state backend, the TM has its own RocksDB instance,
and the state is managed as column families by that TM. Any state change goes
memtable --> SST files --> persistent store. On a read, the data goes through
RocksDB's own buffers and cache.
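
To make the write and read path above more concrete, here is a toy sketch in
Java. It is only an illustration under simplifying assumptions: it is not the
RocksDB or Flink API, the class and method names (ToyLsmStore, put, get,
sstCount) are made up, everything lives in ordinary Java collections, and the
final step to the persistent store is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of an LSM-style write/read path. NOT RocksDB's or Flink's API;
// every name here is hypothetical and chosen for illustration only.
class ToyLsmStore {
    private final int memtableLimit; // number of entries that triggers a flush
    private TreeMap<String, String> memtable = new TreeMap<>();
    private final List<TreeMap<String, String>> sstFiles = new ArrayList<>();

    ToyLsmStore(int memtableLimit) {
        this.memtableLimit = memtableLimit;
    }

    // Writes always land in the in-memory memtable first.
    void put(String key, String value) {
        memtable.put(key, value);
        if (memtable.size() >= memtableLimit) {
            flush();
        }
    }

    // A full memtable is set aside as a sorted "SST file" that is never
    // modified again, and a fresh memtable takes over.
    private void flush() {
        sstFiles.add(memtable);
        memtable = new TreeMap<>();
    }

    // Reads check the memtable first, then the SST files from newest to oldest.
    String get(String key) {
        String value = memtable.get(key);
        if (value != null) {
            return value;
        }
        for (int i = sstFiles.size() - 1; i >= 0; i--) {
            value = sstFiles.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    int sstCount() {
        return sstFiles.size();
    }

    int memtableSize() {
        return memtable.size();
    }

    public static void main(String[] args) {
        ToyLsmStore store = new ToyLsmStore(2);
        store.put("k1", "v1");
        store.put("k2", "v2");  // second entry fills the memtable and triggers a flush
        store.put("k1", "v1b"); // newer value in the memtable shadows the flushed one
        System.out.println(store.get("k1") + " " + store.get("k2") + " " + store.sstCount());
    }
}
```

The point of the sketch is only the ordering: new writes sit in the memtable,
older data sits in sorted files, and a read consults the newest location first.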

With RocksDB as the state backend, the JVM still holds the thread stacks:
with a high degree of parallelism there are many stacks, each maintaining the
information of a separate thread.

Hope this helps!

On Thu, Feb 15, 2024 at 11:21 AM Alexis Sarda-Espinosa <
sarda.espin...@gmail.com> wrote:

> Hi Asimansu
>
> The memory RocksDB manages is outside the JVM, yes, but the mentioned
> subsets must be bridged to the JVM somehow so that the data can be exposed
> to the functions running inside Flink, no?
>
> Regards,
> Alexis.
>
>
> On Thu, 15 Feb 2024, 14:06 Asimansu Bera, <asimansu.b...@gmail.com> wrote:
>
>> Hello Alexis,
>>
>> RocksDB resides off-heap, outside of the JVM. The small subset of data
>> ends up in off-heap memory.
>>
>> For more details, check the following link:
>>
>> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/memory/mem_setup_tm/#managed-memory
>>
>> I hope this addresses your inquiry.
>>
>>
>>
>>
>> On Thu, Feb 15, 2024 at 12:52 AM Alexis Sarda-Espinosa <
>> sarda.espin...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> Most info regarding RocksDB memory for Flink focuses on what's needed
>>> independently of the JVM (although the Flink process configures its limits
>>> and so on). I'm wondering if there are additional special considerations
>>> with regards to the JVM heap in the following scenario.
>>>
>>> Assuming a key used to partition a Flink stream and its state has a high
>>> cardinality, but that the state of each key is small, when Flink prepares
>>> the state to expose to a user function during a call (with a given
>>> partition key), I guess it loads only the required subset from RocksDB, but
>>> does this small subset end up (temporarily) on the JVM heap? And if it
>>> does, does it stay "cached" in the JVM for some time or is it immediately
>>> discarded after the user function completes?
>>>
>>> Maybe this isn't even under Flink's control, but I'm curious.
>>>
>>> Regards,
>>> Alexis.
>>>
>>
