Hi,

Thanks for joining the discussion and sharing your ideas.

@ron
> can the Hybrid Shuffle replace the RSS in the future?

Hybrid Shuffle and RSS are distinct solutions to the shuffle problem. Hybrid Shuffle stores shuffle data across tiers of memory and disk, which makes it flexible and easy to use. In particular, it caches intermediate data in memory to reduce disk I/O overhead. In contrast, RSS is a standalone service that can run across multiple servers within a cluster and parallelizes shuffle operations to improve performance. However, it introduces additional deployment and maintenance costs. Each approach has its own benefits and drawbacks, and users should be able to select the one that best suits their needs. So I don't think Hybrid Shuffle can replace RSS in the future.

@ConradJam
> Should we define a data acceleration layer like Alluxio in remote storage?

I'm not entirely clear on the details of the plan you've proposed, but I understand that you want to use Alluxio as a cache layer for the remote storage tier. Alluxio is designed to give applications low-latency data access through a distributed caching layer. However, adopting it introduces additional dependencies and deployment/maintenance costs for users. Our design uses remote storage only as a supplement to local storage, because local storage is generally sufficient. Given the limited usage scenarios, introducing such costs for this optimization may not be worthwhile. Additionally, for users, added dependencies mean increased complexity.

Best,
Yuxin

ConradJam <jam.gz...@gmail.com> wrote on Fri, Mar 17, 2023 at 11:11:

> Thanks for your start this discuss
>
> Here I am a bit confused about the memory layer definition. This refers to
> local memory. Should we define a data acceleration layer like Alluxio [1]
> in remote storage?
>
> Let me cite a scenario: If I use Fluid [2] to mount an AlluxioRuntime [3]
> on K8S, it looks like a local disk (but it is actually a remote memory
> storage), Have we specified this behavior or optimized it for this
> scenario?
>
> [1] What is alluxio :
> https://docs.alluxio.io/os/user/stable/en/Overview.html
> [2] Fluid: https://fluid-cloudnative.github.io/
> [3] Fluid Alluxio Runtime:
> https://fluid-cloudnative.github.io/samples/tieredstore_config.html#prerequisites
>
> liu ron <ron9....@gmail.com> wrote on Fri, Mar 17, 2023 at 10:39:
>
> > Hi, Yuxin,
> >
> > Thanks for creating this FLIP. Adding remote storage capability to Flink's
> > Hybrid Shuffle is a significant improvement that addresses the issue of
> > local disk storage limitations, this also can improve the stability of
> > Flink Batch Job.
> > I just have one question: can the Hybrid Shuffle replace the RSS in the
> > future? Due to the Hybrid Shuffle having remote storage ability, I think
> > maybe we don't need to maintain a standalone RSS, it will simplify our
> > operation work.
> >
>
> --
> Best
>
> ConradJam