[ 
https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040197#comment-17040197
 ] 

Till Rohrmann commented on FLINK-15959:
---------------------------------------

I would strongly discourage touching any of the scheduling components on the 
{{JobManager}} side because of the above-mentioned reasons. They are being 
actively worked on and will change quite significantly in the foreseeable 
future.

Instead, I would suggest the following (approximate) approach:

1. Introduce the {{cluster.number-of-slots.min}} and 
{{cluster.number-of-slots.max}} configuration options.
2. Make the {{SlotManager}} respect the min and max number of slots, but do 
not block the scheduling on whether {{min}} has been reached.
3. Have a special {{SlotManager}} implementation which only starts fulfilling 
slot requests once we have acquired at least as many slots as the configured 
minimum.
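As a sketch, the options from point 1 might look like this in {{flink-conf.yaml}} (the key names are taken from the proposal above; the example values and comments are assumptions, not decided defaults):

```yaml
# Hypothetical flink-conf.yaml entries for the proposed options.
# The cluster would eagerly acquire at least `min` slots and never
# request more than `max`; leaving both unset keeps today's behavior.
cluster.number-of-slots.min: 10
cluster.number-of-slots.max: 50
```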

With 1+2 we can already solve the case of a YARN session cluster which has 
enough time to acquire the required slots before jobs are submitted. With 3 
it should also work for the per-job cluster.
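The gating behavior described in point 3 could be sketched as follows. This is a hypothetical, simplified illustration and not Flink's actual {{SlotManager}} interface; the class and method names ({{MinSlotsGatingSlotManager}}, {{registerSlot}}, {{requestSlot}}) are invented for this example:

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Hypothetical sketch of point 3: defer fulfilling slot requests until at
 * least {@code minSlots} task slots have registered with the ResourceManager.
 * Not the real Flink SlotManager API.
 */
public class MinSlotsGatingSlotManager {
    private final int minSlots;
    private int registeredSlots = 0;
    private final Queue<Runnable> pendingRequests = new ArrayDeque<>();

    public MinSlotsGatingSlotManager(int minSlots) {
        this.minSlots = minSlots;
    }

    /** Called whenever a TaskExecutor slot registers. */
    public void registerSlot() {
        registeredSlots++;
        if (registeredSlots >= minSlots) {
            // Minimum reached: release all requests that were held back.
            while (!pendingRequests.isEmpty()) {
                pendingRequests.poll().run();
            }
        }
    }

    /** Fulfil immediately once the minimum is reached; otherwise queue. */
    public void requestSlot(Runnable fulfilAction) {
        if (registeredSlots >= minSlots) {
            fulfilAction.run();
        } else {
            pendingRequests.add(fulfilAction);
        }
    }
}
```

The key design point is that requests arriving before the minimum are only deferred, never rejected, so scheduling resumes automatically as soon as enough slots register.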

The one case where we won't necessarily achieve a perfect schedule is 
failover. But I think this is fine, because one design principle of the 
existing scheduler is that all slots can be treated equally (modulo their 
resource specifications). And when in doubt, it is better to make some 
progress with a suboptimal schedule than to make no progress while waiting 
for the perfect one.

> Add min/max number of slots configuration to limit total number of slots
> ------------------------------------------------------------------------
>
>                 Key: FLINK-15959
>                 URL: https://issues.apache.org/jira/browse/FLINK-15959
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.11.0
>            Reporter: YufeiLiu
>            Priority: Major
>
> Flink removed the `-n` option after FLIP-6, changing to a model where the 
> ResourceManager starts a new worker when required. But I think maintaining a 
> certain number of slots is still necessary. These workers would start 
> immediately when the ResourceManager starts and would not be released even 
> if all their slots are free.
> Here are some reasons:
> # Users actually know how many resources are needed to run a single job; 
> initializing all workers when the cluster starts can speed up the startup 
> process.
> # Jobs are scheduled in topology order; the next operator won't be scheduled 
> until the prior execution's slot is allocated. The TaskExecutors will start 
> in several batches in some cases, which can slow down startup.
> # Flink supports 
> [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out 
> tasks evenly across all available registered TaskManagers], but it only 
> takes effect if all TMs are registered. Starting all TMs at the beginning 
> can solve this problem.
> *Suggestion:*
> * Add config options "taskmanager.minimum.numberOfTotalSlots" and 
> "taskmanager.maximum.numberOfTotalSlots"; the default behavior stays the 
> same as before.
> * Start enough workers to satisfy the minimum number of slots when the 
> ResourceManager acquires leadership (subtracting recovered workers).
> * Don't complete slot requests until the minimum number of slots are 
> registered, and throw an exception when the maximum is exceeded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
