[ 
https://issues.apache.org/jira/browse/FLINK-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039668#comment-17039668
 ] 

Xintong Song commented on FLINK-15959:
--------------------------------------

[~liuyufei]

bq. otherwise scheduler will use slots from SlotPool until available slots are 
empty, this also can cause load balance issue.

Not sure about this. In case of a job failover, all the slots in {{SlotPool}} 
should be free when recovering the job, and the {{evenly-spread-out-slots}} 
strategy introduced in FLINK-12122 applies to those slots in {{SlotPool}} as well.
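For reference, a minimal sketch of turning that strategy on programmatically, 
assuming the {{cluster.evenly-spread-out-slots}} key that FLINK-12122 added 
(setting it in flink-conf.yaml works the same way):

{code:java}
// Minimal sketch: enable the FLINK-12122 spread-out strategy.
// Key name taken from FLINK-12122; verify against the Flink version in use.
import org.apache.flink.configuration.Configuration;

public class EnableSpreadOut {
    public static void main(String[] args) {
        // Ask the scheduler/RM to prefer spreading slots across TMs.
        Configuration conf = new Configuration();
        conf.setString("cluster.evenly-spread-out-slots", "true");
    }
}
{code}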

It's true that we cannot always guarantee to spread slots evenly across all 
the TMs when a TM is lost. We may first spread the slot requests across the 
existing TMs (whose slots are still in {{SlotPool}}), and the remaining 
unfulfilled slot requests will be allocated to new TMs once the minimum number 
of slots is registered to the RM. I think this should provide good enough load 
balancing in most cases, especially for failovers caused by losing only one or 
a few TMs.
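To make the intended heuristic concrete, here is a hypothetical sketch (the 
class and method names are illustrative, not Flink's actual {{SlotPool}} or 
scheduler API): requests go to the registered TM with the lowest slot 
utilization first, so surviving TMs are filled evenly before new workers are 
requested from the RM.

{code:java}
// Illustrative only -- not Flink's actual classes or scheduling code.
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class SpreadOutSketch {

    static final class TmInfo {
        final String id;
        final int freeSlots;
        final int totalSlots;

        TmInfo(String id, int freeSlots, int totalSlots) {
            this.id = id;
            this.freeSlots = freeSlots;
            this.totalSlots = totalSlots;
        }

        double utilization() {
            return 1.0 - (double) freeSlots / totalSlots;
        }
    }

    /** Picks the TM with the most free capacity; empty if no TM has a free slot. */
    static Optional<TmInfo> pickLeastUtilized(List<TmInfo> registeredTms) {
        return registeredTms.stream()
                .filter(tm -> tm.freeSlots > 0)
                .min(Comparator.comparingDouble(TmInfo::utilization));
    }
}
{code}

A request for which {{pickLeastUtilized}} returns empty would then be the 
trigger for allocating a new worker.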

bq. maybe should implement it in scheduler, allocate enough resource before 
execution scheduling.

I think this is a good idea for solving the load balancing problem. My concern 
is that it might conflict with some of the community's roadmap plans:
- One thing I'm aware of is that the community is planning a declarative 
resource management approach, in which a job declares all of its resource 
requirements at once, instead of requesting slots for individual 
tasks/executions separately as it does today. This effort is still in early 
design discussions and may not be finished in the next release, but I would 
try to avoid making many changes to {{Scheduler}} and {{SlotPool}} at this 
time, considering they might be changed again soon.
- Another thing I have heard of (not completely sure about this) is that 
people are considering getting rid of {{SlotPool}}, or at least giving it as 
little responsibility as possible, because currently we do not benefit much 
from caching slots in {{SlotPool}}, but we suffer from the complication of 
resources being managed in two places, the {{SlotPool}} and the 
{{ResourceManager}}. That's also why I do not like the idea of adding more 
responsibility to {{SlotPool}}.

I'm not very familiar with {{Scheduler}} and {{SlotPool}} though. Maybe 
[~trohrmann] or [~zhuzh] could chime in.

> Add min/max number of slots configuration to limit total number of slots
> ------------------------------------------------------------------------
>
>                 Key: FLINK-15959
>                 URL: https://issues.apache.org/jira/browse/FLINK-15959
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.11.0
>            Reporter: YufeiLiu
>            Priority: Major
>
> Flink removed the `-n` option after FLIP-6, changing to a model where the 
> ResourceManager starts a new worker when required. But I think maintaining a 
> certain number of slots is necessary. These workers would start immediately 
> when the ResourceManager starts and would not be released even if all slots 
> are free.
> Here are some reasons:
> # Users actually know how many resources are needed when running a single 
> job; initializing all workers when the cluster starts can speed up the 
> startup process.
> # Jobs are scheduled in topology order; the next operator won't be scheduled 
> until the prior execution's slot is allocated. The TaskExecutors will start 
> in several batches in some cases, which can slow down startup.
> # Flink supports 
> [FLINK-12122|https://issues.apache.org/jira/browse/FLINK-12122] [Spread out 
> tasks evenly across all available registered TaskManagers], but it only 
> takes effect once all TMs are registered. Starting all TMs at the beginning 
> can solve this problem.
> *suggestion:*
> * Add configs "taskmanager.minimum.numberOfTotalSlots" and 
> "taskmanager.maximum.numberOfTotalSlots"; the default behavior remains the 
> same as before.
> * Start enough workers to satisfy the minimum number of slots when the 
> ResourceManager accepts leadership (subtracting recovered workers).
> * Don't complete slot requests until the minimum number of slots is 
> registered, and throw an exception when the maximum is exceeded.
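
For illustration, the suggested options could be declared roughly along these 
lines, using Flink's typed {{ConfigOptions}} builder. The option names come 
from the ticket's suggestion; the defaults shown are hypothetical, chosen only 
so that leaving them unset preserves today's behavior:

{code:java}
// Sketch of the proposed options -- not committed Flink API.
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

public class ProposedSlotOptions {

    public static final ConfigOption<Integer> MIN_TOTAL_SLOTS =
            ConfigOptions.key("taskmanager.minimum.numberOfTotalSlots")
                    .intType()
                    .defaultValue(0); // hypothetical: 0 = pre-start no workers

    public static final ConfigOption<Integer> MAX_TOTAL_SLOTS =
            ConfigOptions.key("taskmanager.maximum.numberOfTotalSlots")
                    .intType()
                    .defaultValue(Integer.MAX_VALUE); // hypothetical: unlimited
}
{code}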



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
