Greg Hogan created FLINK-3164:
---------------------------------
Summary: Spread out scheduling strategy
Key: FLINK-3164
URL: https://issues.apache.org/jira/browse/FLINK-3164
Project: Flink
Issue Type: Improvement
Components: Distributed Runtime, Java API, Scala API
Affects Versions: 1.0.0
Reporter: Greg Hogan
The size of a Flink cluster is bounded by the amount of memory allocated for
network buffers. The all-to-all distribution of data during a network shuffle
means that doubling the number of TaskManager slots quadruples the required
number of network buffers.
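The quadratic growth can be illustrated with a minimal sketch. It assumes a
simplified model in which one all-to-all shuffle needs one channel per
(sender, receiver) pair and both sides run at the same parallelism. The
function name shuffle_channels is hypothetical and is not part of Flink, and
the real buffer count also depends on configuration not modelled here.

```python
def shuffle_channels(parallelism: int) -> int:
    """Channels in an all-to-all shuffle under the simplified model:
    every sender subtask connects to every receiver subtask."""
    return parallelism * parallelism


# Doubling the total slot count (and hence the parallelism) at each step.
for slots in (256, 512, 1024):
    print(f"{slots} slots: {shuffle_channels(slots)} channels")
```

Under this model, each doubling of the slot count multiplies the channel count
by four.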
A Flink job can be configured to execute operators with lower parallelism, which
reduces the number of network buffers used across the cluster. However, since
the Flink scheduler clusters tasks onto few TaskManagers rather than spreading
them out, the number of network buffers that must be configured cannot be
reduced.
For example, if each TaskManager has 32 slots and the cluster has 32
TaskManagers, the maximum parallelism can be set to 1024. If the preceding
operator has a parallelism of 32, then the per-TaskManager fan-out is between
1*1024 (tasks evenly distributed, one per TaskManager) and 32*1024 (all 32
tasks executed on a single TaskManager).
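The arithmetic in this example can be checked with a short sketch. The helper
taskmanager_fan_out is hypothetical and only restates the calculation above:
each upstream task placed on a TaskManager sends to every downstream subtask.

```python
def taskmanager_fan_out(upstream_tasks_on_tm: int, downstream_parallelism: int) -> int:
    """Outgoing channels from one TaskManager: each co-located upstream
    task connects to every downstream subtask."""
    return upstream_tasks_on_tm * downstream_parallelism


slots_per_tm = 32
task_managers = 32
max_parallelism = slots_per_tm * task_managers  # 32 * 32 = 1024

upstream_parallelism = 32

# Spread out: the 32 upstream tasks land on 32 different TaskManagers.
spread = taskmanager_fan_out(upstream_parallelism // task_managers, max_parallelism)

# Clustered: all 32 upstream tasks fill the slots of a single TaskManager.
clustered = taskmanager_fan_out(upstream_parallelism, max_parallelism)

print(spread, clustered)
```

The clustered placement needs 32 times as many outgoing channels on the busiest
TaskManager, which is the gap a spread-out scheduling strategy would close.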