[ https://issues.apache.org/jira/browse/FLINK-21694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298672#comment-17298672 ]
Yun Tang commented on FLINK-21694:
----------------------------------

I think increasing the default value is a good idea. The only open question is whether 8 is a bit too large, since too many concurrent clients reading from the DFS could put real pressure on a non-object-store DFS.

BTW, [~qinjunjerry] I think that person has misunderstood the meaning of {{state.backend.rocksdb.thread.num}}, which sets the number of RocksDB background threads used for flushing and compaction. Increasing this value has no impact on restore speed; it only gives more resources to background flush and compaction.

> Increase default value of "state.backend.rocksdb.checkpoint.transfer.thread.num"
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-21694
>                 URL: https://issues.apache.org/jira/browse/FLINK-21694
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>            Reporter: Stephan Ewen
>            Priority: Critical
>             Fix For: 1.13.0
>
>
> The default value for the number of threads used to download state artifacts
> from checkpoint storage should be increased.
> The increase should not pose risk of regression, but does in many cases speed
> up checkpoint recovery significantly.
> Something similar was reported in this blog post, item (3).
> https://engineering.contentsquare.com/2021/ten-flink-gotchas/
> A default value of 8 (eight) sounds like a good default. It should not result
> in excessive thread explosion, and already speeds up recovery.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)