Spreading Tasks across TaskManagers

Maximilian Michels Thu, 11 Oct 2018 11:17:12 -0700

Hi everyone,

I've recently come across a cluster scheduling problem users are facing. Clusters where TaskManagers have more slots than the job parallelism (#tm_slots > job_parallelism) tend to schedule all job tasks on a single TaskManager.

This is not good for spreading load and has been discussed in FLINK-1003 [1] and its duplicate JIRA issues.

I know that this is not really an issue if the cluster is created exclusively for the job, or if the number of slots per TaskManager is smaller than the parallelism. However, this seems like a rather easy improvement to the Scheduler which would have a huge impact on performance.
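
To make the difference concrete, here is a rough sketch in Python. It is not Flink's scheduler code; the function names and the "pick the TaskManager with the most free slots" rule are assumptions made purely for illustration. It compares filling TaskManagers one after another with spreading tasks across them.

```python
from collections import Counter


def assign_fill_first(slots_per_tm, parallelism):
    """Hypothetical fill-first strategy: use up one TaskManager's free
    slots before moving on to the next one."""
    assignment = []
    for tm, free in enumerate(slots_per_tm):
        while free > 0 and len(assignment) < parallelism:
            assignment.append(tm)
            free -= 1
    return assignment


def assign_spread(slots_per_tm, parallelism):
    """Hypothetical spreading strategy: always place the next task on
    the TaskManager that currently has the most free slots."""
    free = list(slots_per_tm)
    assignment = []
    for _ in range(parallelism):
        tm = max(range(len(free)), key=lambda i: free[i])
        if free[tm] == 0:
            raise ValueError("not enough free slots")
        assignment.append(tm)
        free[tm] -= 1
    return assignment


# Three TaskManagers with 8 slots each, job parallelism 4
# (#tm_slots > job_parallelism).
slots = [8, 8, 8]
parallelism = 4

print(Counter(assign_fill_first(slots, parallelism)))  # all 4 tasks on TaskManager 0
print(Counter(assign_spread(slots, parallelism)))      # tasks on all three TaskManagers
```

With these example numbers, the fill-first variant places all four tasks on the first TaskManager, while the spreading variant uses all three.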

On the JIRA issue page it has been mentioned that this was put on hold to work on dynamic scaling first.

Now that the basic building blocks for dynamic scaling are in place, do you think it would be possible to tackle FLINK-1003?


Thanks,
Max


[1] https://issues.apache.org/jira/browse/FLINK-1003
