[ 
https://issues.apache.org/jira/browse/FLINK-22672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-22672:
-------------------------------

    Assignee: Jin Xing

> Some enhancements for pluggable shuffle service framework
> ---------------------------------------------------------
>
>                 Key: FLINK-22672
>                 URL: https://issues.apache.org/jira/browse/FLINK-22672
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>            Reporter: Jin Xing
>            Assignee: Jin Xing
>            Priority: Major
>             Fix For: 1.14.0
>
>
> "Pluggable shuffle service" in Flink provides an architecture which are 
> unified for both streaming and batch jobs, allowing user to customize the 
> process of data transfer between shuffle stages according to scenarios.
> There are already a number of implementations of "remote shuffle service" on 
> Spark like [1][2][3]. Remote shuffle enables to shuffle data from/to a remote 
> cluster and achieves benefits like :
>  # The lifecycle of computing resource can be decoupled with shuffle data, 
> once computing task is finished, idle computing nodes can be released with 
> its completed shuffle data accommodated on remote shuffle cluster.
>  # There is no need to reserve disk capacity for shuffle on computing nodes. 
> Remote shuffle cluster serves shuffling request with better scaling ability 
> and alleviates the local disk pressure on computing nodes when data skew.
> Based on "pluggable shuffle service", we build our own "remote shuffle 
> service" on Flink –- Lattice, which targets to provide functionalities and 
> improve performance for batch processing jobs. Basically it works as below:
>  # Lattice cluster works as an independent service for shuffling request;
>  # LatticeShuffleMaster extends ShuffleMaster, works inside JM and talks with 
> remote Lattice cluster for shuffle resource application and shuffle data 
> lifecycle management;
>  # LatticeShuffleEnvironment extends ShuffleEnvironment, works inside TM and 
> provides an environment for shuffling data from/to remote Lattice cluster;
> During the process of building Lattice we find some potential enhancements on 
> "pluggable shuffle service". I will enumerate and create some sub JIRAs under 
> this umbrella
>  
> [1] 
> [https://www.alibabacloud.com/blog/emr-remote-shuffle-service-a-powerful-elastic-tool-of-serverless-spark_597728]
> [2] [https://bestoreo.github.io/post/cosco/cosco/]
> [3] [https://github.com/uber/RemoteShuffleService]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to