[jira] [Commented] (FLINK-4319) Rework Cluster Management (FLIP-6)

Till Rohrmann (JIRA) Sun, 22 Jul 2018 07:04:08 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-4319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552024#comment-16552024
 ]


Till Rohrmann commented on FLINK-4319:
--------------------------------------

Hi [~liyinan926], Flink's Kubernetes support is not yet complete and the 
community is still working on it.

There are actually two operation modes which we would like to support. The 
first one is similar to Flink's Yarn and Mesos integration where the 
{{ResourceManager}} is able to talk to the Kubernetes master to start new pods 
if needed. I actually have a dev branch where I prototyped a 
{{KubernetesResourceManager}} which you can find 
[here|https://github.com/tillrohrmann/flink/tree/nativeKubernetes].

The other mode builds on FLINK-7087 and moves the responsibility of allocating 
new resources/pods to the user. In this mode, the {{ResourceManager}} won't 
start new {{TaskExecutors}} but simply accepts all slots from running 
{{TaskExecutors}} and advertises them to the {{JobMaster}}. The {{JobMaster}} 
will then scale the job to the currently available number of slots. That way 
the user decides whether the job gets more resources, by starting new pods 
(each running a {{TaskExecutor}}), or fewer resources, by stopping pods.
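
To make the intended behaviour of this mode concrete, here is a minimal sketch of how the target parallelism could follow the advertised slots. The class {{PassiveModeSketch}}, its methods and the {{maxParallelism}} cap are hypothetical names chosen for illustration and are not part of Flink's API.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of Flink's API.
final class PassiveModeSketch {

    // Sum of the slots offered by all currently registered TaskExecutors.
    static int totalSlots(List<Integer> slotsPerTaskExecutor) {
        int total = 0;
        for (int slots : slotsPerTaskExecutor) {
            total += slots;
        }
        return total;
    }

    // The job is scaled to whatever is available, capped by an assumed
    // upper bound on the job's parallelism.
    static int targetParallelism(List<Integer> slotsPerTaskExecutor, int maxParallelism) {
        return Math.min(totalSlots(slotsPerTaskExecutor), maxParallelism);
    }
}
```

Under these assumptions, starting a third pod whose {{TaskExecutor}} offers 2 slots next to two existing ones raises the target from 4 to 6, and stopping a pod lowers it again.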

At the moment, the latter mode only works without the adaptive scaling of jobs. 
This means that one has to start the job with a fixed parallelism and then 
make sure that there are enough {{TaskExecutors}} offering enough slots to 
execute the job.
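
As a rough sizing aid for the fixed-parallelism case, the following sketch computes how many pods are needed. It assumes that the job needs as many slots as its parallelism (which holds with default slot sharing) and that every {{TaskExecutor}} offers the same number of slots. {{FixedParallelismSketch}} and its methods are hypothetical names, not Flink API.

```java
// Hypothetical helper for illustration only; not part of Flink's API.
final class FixedParallelismSketch {

    // Number of TaskExecutor pods needed so that the offered slots cover
    // the job's parallelism (ceiling division).
    static int taskExecutorsNeeded(int parallelism, int slotsPerTaskExecutor) {
        if (parallelism <= 0 || slotsPerTaskExecutor <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        return (parallelism + slotsPerTaskExecutor - 1) / slotsPerTaskExecutor;
    }

    // Slots still missing before the job can run; 0 means enough are offered.
    static int missingSlots(int parallelism, int offeredSlots) {
        return Math.max(0, parallelism - offeredSlots);
    }
}
```

For example, a job with parallelism 10 and 4 slots per {{TaskExecutor}} needs 3 pods; with only 2 pods running, 2 slots are still missing.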

> Rework Cluster Management (FLIP-6)
> ----------------------------------
>
>                 Key: FLINK-4319
>                 URL: https://issues.apache.org/jira/browse/FLINK-4319
>             Project: Flink
>          Issue Type: Improvement
>          Components: Cluster Management
>    Affects Versions: 1.1.0
>            Reporter: Stephan Ewen
>            Assignee: Till Rohrmann
>            Priority: Major
>              Labels: flip-6
>             Fix For: 1.5.0
>
>
> This is the root issue to track progress of the rework of cluster management 
> (FLIP-6) 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
