[
https://issues.apache.org/jira/browse/FLINK-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305118#comment-15305118
]
ASF GitHub Bot commented on FLINK-3667:
---------------------------------------
Github user EronWright commented on the pull request:
https://github.com/apache/flink/pull/1978#issuecomment-222285524
This PR dovetails nicely with the Mesos work and I'll be sure to build on
it. Here are a few suggestions to align it even further.
The problem of _managing_ a Flink cluster is mostly independent from
_using_ a cluster to submit and manage jobs. I would like to see the two
concerns be cleanly separated. In this PR, the `ClusterDescriptor` handles
creating the cluster, then produces a `Client` with which to manage jobs and to
handle shutdown. I suggest that a new component - the `YarnDispatcher` - be
introduced to handle all lifecycle operations for a cluster. The
`ClusterDescriptor` would then be a plain entity class that is handed to the dispatcher.
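To make the proposed split concrete, here is a minimal sketch in Java. It is not taken from the PR: the `Dispatcher` interface, the `InMemoryDispatcher` stand-in, the reduced `ClusterClient` interface, the fields on `ClusterDescriptor`, and the `jm://` endpoint strings are all hypothetical and only illustrate the idea that lifecycle operations live on the dispatcher while the client only deals with jobs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity class: it only describes the desired cluster.
final class ClusterDescriptor {
    final String name;
    final int taskManagers;

    ClusterDescriptor(String name, int taskManagers) {
        this.name = name;
        this.taskManagers = taskManagers;
    }
}

// Hypothetical job-facing handle: deliberately has no lifecycle methods.
interface ClusterClient {
    String jobManagerEndpoint();
    String submitJob(String jobName);
}

// Hypothetical lifecycle-facing component (the role suggested for a YarnDispatcher).
interface Dispatcher {
    String startCluster(ClusterDescriptor descriptor); // returns a cluster id
    ClusterClient connect(String clusterId);           // new or pre-existing cluster
    void stopCluster(String clusterId);
}

// In-memory stand-in so the sketch runs without YARN or Mesos.
final class InMemoryDispatcher implements Dispatcher {
    private final Map<String, String> endpoints = new HashMap<>();
    private int nextId = 1;

    @Override
    public String startCluster(ClusterDescriptor descriptor) {
        String clusterId = descriptor.name + "-" + nextId++;
        // Made-up endpoint format, for illustration only.
        endpoints.put(clusterId, "jm://" + clusterId + ":6123");
        return clusterId;
    }

    @Override
    public ClusterClient connect(String clusterId) {
        final String endpoint = endpoints.get(clusterId);
        if (endpoint == null) {
            throw new IllegalArgumentException("Unknown cluster: " + clusterId);
        }
        return new ClusterClient() {
            @Override
            public String jobManagerEndpoint() {
                return endpoint;
            }

            @Override
            public String submitJob(String jobName) {
                return jobName + "@" + endpoint;
            }
        };
    }

    @Override
    public void stopCluster(String clusterId) {
        endpoints.remove(clusterId);
    }
}

final class DispatcherSketch {
    public static void main(String[] args) {
        Dispatcher dispatcher = new InMemoryDispatcher();
        String clusterId = dispatcher.startCluster(new ClusterDescriptor("session", 2));
        ClusterClient client = dispatcher.connect(clusterId);
        System.out.println(client.submitJob("wordcount"));
        dispatcher.stopCluster(clusterId);
    }
}
```

Because `connect` takes only a cluster id, the same code path would serve a freshly started cluster and one that already existed, and the client never needs a shutdown method.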
A related issue is that it's only possible to use the `YarnClusterClient` to
interact with a newly-created YARN session, not a pre-existing one. When
submitting a job to an existing YARN session, it seems the
`StandaloneClusterClient` is used (by supplying a JM endpoint). Is that true?
Eventually the CLI should provide a nice way to discover and use existing
YARN sessions.
The `detached` flags could use clarification. In the `Client` context,
the detached concept seems related to interactivity with the job (tailing the
status messages, etc). I don't think it should imply anything about the
lifecycle of the cluster; leave that to the dispatcher. Accordingly, the
`stopAfterJob` method should move to the dispatcher.
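A rough Java sketch of that distinction, under my own assumptions: `JobClientSketch` and `ClusterLifecycleSketch` are hypothetical names, and the logged strings merely stand in for real status output. The `detached` flag only changes how the client interacts with the job, while the stop-after-job decision is held by the lifecycle side.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client: "detached" only controls interactivity with the job.
final class JobClientSketch {
    final List<String> log = new ArrayList<>();

    String submit(String jobName, boolean detached) {
        log.add("submitted " + jobName);
        if (!detached) {
            // Attached mode: follow the job until it finishes.
            log.add("tailing status of " + jobName);
            log.add(jobName + " finished");
        }
        return jobName;
    }
}

// Hypothetical lifecycle side: owns the decision to shut the cluster down.
final class ClusterLifecycleSketch {
    private boolean stopAfterJob;
    private boolean running = true;

    void stopAfterJob() {
        stopAfterJob = true;
    }

    // Called when a job completes, regardless of how it was submitted.
    void onJobFinished() {
        if (stopAfterJob) {
            running = false;
        }
    }

    boolean isRunning() {
        return running;
    }
}

final class DetachedSketch {
    public static void main(String[] args) {
        JobClientSketch client = new JobClientSketch();
        ClusterLifecycleSketch lifecycle = new ClusterLifecycleSketch();

        lifecycle.stopAfterJob();          // lifecycle policy, set on the dispatcher side
        client.submit("wordcount", true);  // detached: the client returns immediately
        lifecycle.onJobFinished();         // the cluster still shuts down afterwards

        System.out.println("cluster running: " + lifecycle.isRunning());
    }
}
```

In this arrangement a detached submission can still be followed by a cluster shutdown, since the two settings are independent.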
As for how this relates to Mesos: the `MesosDispatcher` component will run in
the Mesos cluster and be accessed remotely by the CLI. The
`ClusterDescriptor` will be passed via REST to it. Everything will fit
nicely. :)
> Generalize client<->cluster communication
> -----------------------------------------
>
> Key: FLINK-3667
> URL: https://issues.apache.org/jira/browse/FLINK-3667
> Project: Flink
> Issue Type: Improvement
> Components: YARN Client
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
>
> Here are some notes I took when inspecting the client<->cluster classes with
> regard to future integration of other resource management frameworks in
> addition to Yarn (e.g. Mesos).
> {noformat}
> 1 Cluster Client Abstraction
> ════════════════════════════
> 1.1 Status Quo
> ──────────────
> 1.1.1 FlinkYarnClient
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Holds the cluster configuration (Flink-specific and Yarn-specific)
> • Contains the deploy() method to deploy the cluster
> • Creates the Hadoop Yarn client
> • Receives the initial job manager address
> • Bootstraps the FlinkYarnCluster
> 1.1.2 FlinkYarnCluster
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Wrapper around the Hadoop Yarn client
> • Queries cluster for status updates
> • Lifecycle methods to start and shut down the cluster
> • Flink-specific features like shutdown after job completion
> 1.1.3 ApplicationClient
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Acts as a middle-man for asynchronous cluster communication
> • Designed to communicate with Yarn, not used in Standalone mode
> 1.1.4 CliFrontend
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Deeply integrated with FlinkYarnClient and FlinkYarnCluster
> • Constantly distinguishes between Yarn and Standalone mode
> • Would be nice to have a general abstraction in place
> 1.1.5 Client
> ╌╌╌╌╌╌╌╌╌╌╌╌
> • Job submission and Job related actions, agnostic of resource framework
> 1.2 Proposal
> ────────────
> 1.2.1 ClusterConfig (before: AbstractFlinkYarnClient)
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Extensible cluster-agnostic config
> • May be extended by specific cluster, e.g. YarnClusterConfig
> 1.2.2 ClusterClient (before: AbstractFlinkYarnClient)
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Deals with cluster (RM) specific communication
> • Exposes framework agnostic information
> • YarnClusterClient, MesosClusterClient, StandaloneClusterClient
> 1.2.3 FlinkCluster (before: AbstractFlinkYarnCluster)
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Basic interface to communicate with a running cluster
> • Receives the ClusterClient for cluster-specific communication
> • Should not have to care about the specific implementations of the
> client
> 1.2.4 ApplicationClient
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Can be changed to work cluster-agnostic (first steps already in
> FLINK-3543)
> 1.2.5 CliFrontend
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • CliFrontend never has to differentiate between different
> cluster types after it has determined which cluster class to load.
> • Base class handles framework-agnostic command line arguments
> • Pluggables for Yarn, Mesos handle specific commands
> {noformat}
> I would like to create/refactor the affected classes to set us up for a more
> flexible client side resource management abstraction.