Hi Han,

Thanks for your answers! Please see my comments below:

> 2. The deployment details are omitted in the FLIP to maintain focus on the
> core interaction between Flink subtasks and the compaction service (rather
> than the service implementation details). The proposed design follows a
> distributed deployment model with components communicating through RPC. The
> dispatcher and workers can run in different processes or containers.
> Therefore the service can be independently scalable from the Flink jobs.
> The deployment can be quite flexible as the workers can be placed within or
> separated from the Flink cluster (or even colocated with DFS data nodes).
> The service configs are also customizable, e.g. tailored scheduling
> policies or specialized IO settings for compaction workloads.

If you plan to do `flink-console.sh forst-compaction-service`, I'd suggest
providing more details about how it would be configured, whether the service
would be started through a resource manager such as YARN or Kubernetes, and
similar questions. As I understand it, this Remote Compactor is a Flink
component, even though its implementation seems to be a wrapper around ForSt
capabilities. It deserves a detailed plan.

Best,
Zakelly

On Fri, Sep 5, 2025 at 12:39 PM Han Yin <[email protected]> wrote:

> Hi Zakelly,
>
> Thanks for your feedback! I address the comments in the following:
>
> 1. Good point! I’ve updated the REST API in the FLIP to make the URL more
> specific.
> 2. The deployment details are omitted in the FLIP to maintain focus on the
> core interaction between Flink subtasks and the compaction service (rather
> than the service implementation details). The proposed design follows a
> distributed deployment model with components communicating through RPC. The
> dispatcher and workers can run in different processes or containers.
> Therefore the service can be independently scalable from the Flink jobs.
> The deployment can be quite flexible as the workers can be placed within or
> separated from the Flink cluster (or even colocated with DFS data nodes).
> The service configs are also customizable, e.g. tailored scheduling
> policies or specialized IO settings for compaction workloads.
>
> Best regards,
> Han Yin
>
> > On Aug 28, 2025, at 19:28, Zakelly Lan <[email protected]> wrote:
> >
> > Hi Han,
> >
> > Thanks for your proposal! Remote compaction decouples compaction from
> > computation, which is another great step toward cloud-native architecture.
> > I have a few questions:
> >
> > 1. I’d suggest including 'forst' in the URL used to update the remote
> > compaction service endpoint, since this functionality is specific to
> > ForStStateBackend.
> > 2. What is the deployment model for the compaction service components
> > (e.g., dispatcher and workers)? Do they run in the same process or
> > container? How could we customize the setup of that service?
> >
> > Best,
> > Zakelly
> >
> > On Thu, Aug 28, 2025 at 12:25 PM Han Yin <[email protected]> wrote:
> >
> >> Hi everyone,
> >>
> >> I would like to open a discussion on introducing remote compaction for
> >> disaggregated state[1].
> >>
> >> Flink state backends rely on LSM-Trees for large-scale storage, with file
> >> compaction executed locally in TaskManager background threads. This
> >> co-location creates local resource contention, causing latency spikes and
> >> resource instability.
> >>
> >> Flink 2.0 introduces disaggregated state management through the ForSt
> >> StateBackend[2], employing a shared DFS as primary storage. This allows
> >> ForSt to implement compaction-as-a-service (Remote Compaction) through
> >> dedicated compaction workers.
> >>
> >> This approach can clearly separate the responsibilities between computing
> >> and storage nodes, and therefore further complements Flink's disaggregated
> >> architecture. Introducing a compaction service aligns with the pooling
> >> concept prevalent in the cloud-native era, and can significantly improve
> >> the resource efficiency and elasticity of Flink stateful jobs.
> >>
> >> Looking forward to your comments or feedback.
> >>
> >> Best regards,
> >> Han Yin
> >>
> >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-430%3A+Remote+Compaction+For+Disaggregated+State
> >> [2] https://cwiki.apache.org/confluence/x/R4p3EQ
