Jinzhong Li created FLINK-34981:
-----------------------------------

             Summary: FLIP-426: Grouping Remote State Access
                 Key: FLINK-34981
                 URL: https://issues.apache.org/jira/browse/FLINK-34981
             Project: Flink
          Issue Type: New Feature
          Components: Runtime / State Backends
            Reporter: Jinzhong Li
             Fix For: 2.0.0

This is a sub-FLIP for disaggregated state management and its related work. Please read [FLIP-423|https://cwiki.apache.org/confluence/x/R4p3EQ] first for the full context.

I/O speed and latency are critical for overall data throughput, particularly in jobs that manage large states. Issuing multiple asynchronous I/O operations is a proven strategy for raising throughput by increasing the parallelism of I/O execution. However, simply expanding I/O parallelism can quickly hit a ceiling because I/O bandwidth is finite. Additionally, for remote storage access, the time spent on RPC round trips affects the performance of an individual I/O far more than the I/O size does.

A promising optimization is therefore to merge adjacent I/O requests into a single operation and fetch multiple keys with one I/O call. This approach requires a batch of keys prepared in advance for the query, and a way to identify which I/O operations can be combined.

In this FLIP, we focus on the implementation details of batching state requests and processing them in batches.
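
To illustrate the idea of fetching multiple keys with one I/O call, here is a minimal Java sketch. It is not Flink's API and not the design proposed in the FLIP: the class `GroupedStateAccess`, the methods `groupKeys` and `fetchAll`, the `multiGet` callback, and the fixed `batchSize` are all hypothetical names and assumptions chosen for illustration. The remote store is simulated by an in-memory map, and each call to `multiGet` stands in for one RPC round trip.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of grouped state access; not part of Flink. */
public class GroupedStateAccess {

    /** Splits the pending keys into consecutive batches of at most batchSize keys. */
    public static <K> List<List<K>> groupKeys(List<K> pendingKeys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<K>> batches = new ArrayList<>();
        for (int start = 0; start < pendingKeys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, pendingKeys.size());
            batches.add(new ArrayList<>(pendingKeys.subList(start, end)));
        }
        return batches;
    }

    /**
     * Issues one multi-get per batch instead of one get per key.
     * The multiGet function is an assumed stand-in for a remote call.
     */
    public static <K, V> Map<K, V> fetchAll(
            List<K> pendingKeys, int batchSize, Function<List<K>, Map<K, V>> multiGet) {
        Map<K, V> results = new LinkedHashMap<>();
        for (List<K> batch : groupKeys(pendingKeys, batchSize)) {
            results.putAll(multiGet.apply(batch)); // one simulated round trip
        }
        return results;
    }

    public static void main(String[] args) {
        // In-memory map standing in for remote state storage (assumption).
        Map<String, Integer> remote = Map.of("a", 1, "b", 2, "c", 3, "d", 4, "e", 5);
        int[] roundTrips = {0};

        Function<List<String>, Map<String, Integer>> multiGet = batch -> {
            roundTrips[0]++; // count each batched call as one round trip
            Map<String, Integer> out = new LinkedHashMap<>();
            for (String key : batch) {
                out.put(key, remote.get(key));
            }
            return out;
        };

        Map<String, Integer> values =
                fetchAll(List.of("a", "b", "c", "d", "e"), 2, multiGet);
        System.out.println(values.size() + " keys fetched in "
                + roundTrips[0] + " round trips");
    }
}
```

With five keys and a batch size of 2, the sketch issues three batched calls rather than five single-key calls. A real implementation would also need to decide which requests can be combined, which this sketch sidesteps by batching consecutive keys.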