Fabian Hueske created FLINK-9289:
------------------------------------
Summary: Parallelism of generated operators should have max
parallelism of input
Key: FLINK-9289
URL: https://issues.apache.org/jira/browse/FLINK-9289
Project: Flink
Issue Type: Bug
Components: DataSet API
Affects Versions: 1.4.2, 1.5.0, 1.6.0
Reporter: Fabian Hueske
The DataSet API aims to chain generated operators such as key extraction
mappers to their predecessor. This is done by assigning the same parallelism as
the input operator.
If a generated operator has more than two inputs, the operator cannot be
chained anymore and the operator is generated with default parallelism. This
can lead to a {code}NoResourceAvailableException: Not enough free slots
available to run the job.{code} as reported by a user on the mailing list:
https://lists.apache.org/thread.html/60a8bffcce54717b6273bf3de0f43f1940fbb711590f4b90cd666c9a@%3Cuser.flink.apache.org%3E
To fix this problem, I suggest setting the parallelism of a generated operator
to the maximum parallelism among all of its inputs.
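A minimal sketch of the proposed rule, written in plain Java and not against Flink's actual classes. The class name, the method name, and the local PARALLELISM_DEFAULT constant (-1, standing in for "use the environment default") are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;

public class GeneratedOperatorParallelism {

    // Hypothetical stand-in for "no explicit parallelism set, use the default".
    static final int PARALLELISM_DEFAULT = -1;

    // Returns the largest explicitly set parallelism among the inputs.
    // Falls back to PARALLELISM_DEFAULT only if no input has one set.
    static int forInputs(List<Integer> inputParallelisms) {
        int max = PARALLELISM_DEFAULT;
        for (int p : inputParallelisms) {
            if (p > max) {
                max = p;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // Two inputs with parallelism 2 and 4: the generated operator gets 4.
        System.out.println(forInputs(Arrays.asList(2, 4)));
        // One input left at default, one set to 3: the generated operator gets 3.
        System.out.println(forInputs(Arrays.asList(PARALLELISM_DEFAULT, 3)));
        // No input has an explicit parallelism: the default marker is kept.
        System.out.println(forInputs(Arrays.asList(PARALLELISM_DEFAULT, PARALLELISM_DEFAULT)));
    }
}
```

With this rule, a generated operator never requests more parallel instances than its widest input, so it would not fall back to a default parallelism larger than what the inputs were configured for.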