[
https://issues.apache.org/jira/browse/FLINK-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405723#comment-16405723
]
vinoyang commented on FLINK-8886:
---------------------------------
[~elevy] YARN has had a feature called 'Node Labels' since Apache Hadoop 2.6
that can meet your requirements. In an earlier issue, *FLINK-7836*, I added
support for specifying a YARN node label expression in the Flink client, so you
can run your job on YARN.
Alternatively, in standalone cluster mode, you can split one big cluster into
several smaller clusters based on dimensions such as job, resource, or business.
I think introducing this feature into the standalone cluster would make its
scheduler heavier and more complex, because the standalone cluster's scheduler
is not good at resource assignment and management. YARN and Mesos would be
better choices.
[~till.rohrmann] What's your opinion?
> Job isolation via scheduling in shared cluster
> ----------------------------------------------
>
> Key: FLINK-8886
> URL: https://issues.apache.org/jira/browse/FLINK-8886
> Project: Flink
> Issue Type: Improvement
> Components: Distributed Coordination, Local Runtime, Scheduler
> Affects Versions: 1.5.0
> Reporter: Elias Levy
> Priority: Major
>
> Flink's TaskManager executes tasks from different jobs within the same JVM as
> threads. We prefer to isolate different jobs in their own JVM. Thus, we must
> use different TMs for different jobs. As currently the scheduler will
> allocate task slots within a TM to tasks from different jobs, that means we
> must stand up one cluster per job. This is wasteful, as it requires at least
> two JobManagers per cluster for high-availability, and the JMs have low
> utilization.
> Additionally, different jobs may require different resources. Some jobs are
> compute heavy. Some are IO heavy (lots of state in RocksDB). At the moment
> the scheduler treats all TMs as equivalent, except possibly in their number
> of available task slots. Thus, one is required to stand up multiple clusters
> if there is a need for different types of TMs.
>
> It would be useful if one could specify requirements on a job, such that they
> are only scheduled on a subset of TMs. Properly configured, that would
> permit isolation of jobs in a shared cluster and scheduling of jobs with
> specific resource needs.
>
> One possible implementation is to specify a set of tags on the TM config file
> which the TMs use when registering with the JM, and another set of tags
> configured within the job or supplied when submitting the job. The scheduler
> could then match the tags in the job with the tags in the TMs. In a
> restrictive mode the scheduler would assign a job task to a TM only if all
> tags match. In a relaxed mode the scheduler could assign a job task to a TM
> if there is a partial match, while giving preference to a more accurate match.
>
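For reference, the tag-matching idea in the quoted description could be sketched roughly as below. This is only an illustration and not Flink code: the `TaskManager` class and the `pick_task_manager` function are hypothetical names, and I am assuming that "all tags match" means every job tag is present on the TM, and that "a more accurate match" means a larger number of shared tags.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class TaskManager:
    """Hypothetical stand-in for a registered TM and its configured tags."""
    name: str
    tags: FrozenSet[str]


def pick_task_manager(
    job_tags: Iterable[str],
    task_managers: Iterable[TaskManager],
    strict: bool = True,
) -> Optional[TaskManager]:
    """Return a TM for a job task, or None if no TM qualifies.

    strict=True  (restrictive mode): every job tag must be present on the TM.
    strict=False (relaxed mode): at least one shared tag is required, and the
    TM sharing the most tags with the job wins (assumed preference rule).
    """
    wanted = frozenset(job_tags)
    candidates = list(task_managers)

    if strict:
        for tm in candidates:
            if wanted <= tm.tags:  # subset test: all job tags are on this TM
                return tm
        return None

    if not wanted:
        # A job without tags has nothing to match on; take any TM.
        return candidates[0] if candidates else None

    best: Optional[TaskManager] = None
    best_score = 0
    for tm in candidates:
        score = len(wanted & tm.tags)  # number of shared tags
        if score > best_score:
            best, best_score = tm, score
    return best
```

With one TM tagged `ssd, team-a` and another tagged `cpu, team-a`, a job tagged `cpu, team-b` is rejected in restrictive mode but lands on the second TM in relaxed mode, since that TM shares one tag with the job and the other shares none.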