Nicolas Perrin created FLINK-35210:
--------------------------------------
Summary: Give the option to set automatically the parallelism of
the KafkaSource to the number of kafka partitions
Key: FLINK-35210
URL: https://issues.apache.org/jira/browse/FLINK-35210
Project: Flink
Issue Type: Improvement
Components: Connectors / Kafka
Reporter: Nicolas Perrin
Currently, the parallelism of Flink's `KafkaSource` operator has to be chosen
manually, which can lead to highly skewed tasks if the developer does not
set it carefully.
To avoid this issue, I propose to:
- dynamically retrieve the number of partitions of the topic using
`KafkaConsumer.partitionsFor(topic).size()`,
- set the parallelism of the stream built from the source based on this value.
This way, no source task is left idle without a partition to read.
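The two steps above could be sketched roughly as follows. This is only an illustration of the idea, not an existing Flink API: the class `PartitionParallelism`, its methods, and the `maxParallelism` cap are hypothetical names, and the idle-subtask count assumes that each partition is read by exactly one subtask and that partitions are spread evenly across subtasks.

```java
final class PartitionParallelism {

    /**
     * Picks the source parallelism from the topic's partition count.
     * In a real job, partitionCount would come from
     * KafkaConsumer.partitionsFor(topic).size(), as proposed above.
     * maxParallelism is a hypothetical cap (for example, the available slots).
     */
    public static int sourceParallelism(int partitionCount, int maxParallelism) {
        if (partitionCount < 1 || maxParallelism < 1) {
            throw new IllegalArgumentException("counts must be positive");
        }
        return Math.min(partitionCount, maxParallelism);
    }

    /**
     * Number of subtasks left without any partition, assuming an even
     * spread where each partition goes to exactly one subtask.
     */
    public static int idleSubtasks(int partitionCount, int parallelism) {
        return Math.max(0, parallelism - partitionCount);
    }

    public static void main(String[] args) {
        int partitions = 6; // stand-in for partitionsFor(topic).size()

        // A manually chosen parallelism of 8 leaves two subtasks idle.
        System.out.println("idle with manual choice: " + idleSubtasks(partitions, 8));

        // Deriving the parallelism from the partition count avoids that.
        int parallelism = sourceParallelism(partitions, 128);
        System.out.println("derived parallelism: " + parallelism);
        System.out.println("idle with derived value: " + idleSubtasks(partitions, parallelism));

        // In a Flink job, the derived value would then be applied to the
        // stream built from the source via setParallelism(parallelism).
    }
}
```

The cap is there because a topic may have more partitions than the cluster has slots; in that case several partitions share a subtask, but none of the subtasks is idle.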
--
This message was sent by Atlassian Jira
(v8.20.10#820010)