Hi everyone,


I would like to start a discussion on FLIP-582: Support RpcOperator Service [1].


AI-oriented workloads such as multimodal data processing and model inference
have grown rapidly in recent years. These workloads are characterized by
expensive resources (GPUs) and high initialization costs (seconds to minutes
for model loading). In today's Flink, embedding them in the data plane couples
their parallelism and failover with the surrounding operators, while deploying
them as external services disconnects their lifecycle from the job and doubles
operational overhead.


This FLIP introduces RpcOperator Service, a framework-level primitive that
runs user-defined compute as RPC services in an independent Pipelined Region
within the Flink job. Because the service is isolated at the scheduling level,
it can achieve fault isolation, independent scaling, and dedicated resource
allocation. As a native Flink primitive, it also lays the foundation for
automatic flow control, flexible load balancing, and coordinated auto-scaling,
all without introducing external infrastructure or additional operational
burden.




Looking forward to your feedback and suggestions!




[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-582%3A+Support+RpcOperator+Service





Best Regards,
Yi Zhang
