[ 
https://issues.apache.org/jira/browse/FLINK-13246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Knauf reopened FLINK-13246:
--------------------------------------

Re-opening in accordance with https://issues.apache.org/jira/browse/FLINK-23206.

> Implement external shuffle service for Kubernetes
> -------------------------------------------------
>
>                 Key: FLINK-13246
>                 URL: https://issues.apache.org/jira/browse/FLINK-13246
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / Network
>            Reporter: MalcolmSanders
>            Assignee: MalcolmSanders
>            Priority: Minor
>              Labels: auto-closed, stale-assigned
>
> Flink batch job users could achieve better cluster utilization and job 
> throughput through an external shuffle service, because the producers of 
> intermediate result partitions can be released once the partitions have 
> been persisted to disk. In 
> [FLINK-10653|https://issues.apache.org/jira/browse/FLINK-10653], [~zjwang] 
> introduced a pluggable shuffle manager architecture that abstracts the 
> process of data transfer between stages from the Flink runtime as a shuffle 
> service. I propose a k8s implementation of the Flink external shuffle service.
> There are a few points that need to be discussed:
> (1) How should the external shuffle service be deployed in k8s?
> DaemonSet vs. sidecar mode
> (2) How should the PVs used for storing intermediate result partition data be managed?
> Plan A: Shuffle servers (or other volume provisioners) provision PVs, and 
> producers write to a local PV;
> Plan B: Producers write to the shuffle server over the network, and the shuffle 
> server controls the use of PVs;
> (3) The shuffle server could temporarily use persistent storage backed by cloud 
> storage such as AWSElasticBlockStore, CephFS, etc.
> I'll bring a design document later.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
