Stefan Richter created FLINK-7289:
-------------------------------------

             Summary: Memory allocation of RocksDB can be problematic in 
container environments
                 Key: FLINK-7289
                 URL: https://issues.apache.org/jira/browse/FLINK-7289
             Project: Flink
          Issue Type: Improvement
          Components: State Backends, Checkpointing
    Affects Versions: 1.3.0, 1.2.0, 1.4.0
            Reporter: Stefan Richter


Flink's RocksDB based state backend allocates native memory. The amount of 
allocated memory by RocksDB is not under the control of Flink or the JVM and 
can (theoretically) grow without limits.
In container environments, this can be problematic because the process can 
exceed the memory budget of the container, and the process will get killed. 
Currently, there is no other option than trusting RocksDB to be well behaved 
and to follow its memory configurations. However, limiting RocksDB's memory 
usage is not as easy as setting a single limit parameter. The memory limit is 
determined by an interplay of several configuration parameters, which is almost 
impossible to get right for users. Even worse, multiple RocksDB instances can 
run inside the same process and make reasoning about the configuration also 
dependent on the Flink job.

Some information about the memory management in RocksDB can be found here:
https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB
https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide

We should try to figure out ways to help users in one or more of the following 
ways:
- Some way to autotune or calculate the RocksDB configuration.
- Conservative default values.
- Additional documentation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to