Thanks, this helps indeed. The one point it does not explicitly answer is "does just creating a Kafka consumer reading from multiple partitions set the parallelism to the number of partitions?". But reading between the lines, the answer is clearly no: you have to set the parallelism yourself, and the partitions will then be assigned round-robin across the consumer subtasks.
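For anyone skimming the thread, the round-robin distribution Gordon describes below can be sketched with a toy model (this is just an illustration of the "parallelism 2, 6 partitions -> 3 each" arithmetic, not Flink's actual assignment code, which also offsets the starting subtask by a hash of the topic name):

```python
def assign_partitions(num_partitions, parallelism):
    """Toy model of round-robin assignment: partition i is handled
    by consumer subtask i % parallelism."""
    assignment = {subtask: [] for subtask in range(parallelism)}
    for partition in range(num_partitions):
        assignment[partition % parallelism].append(partition)
    return assignment

# Gordon's example: parallelism 2, 6 partitions -> 3 partitions per subtask
print(assign_partitions(6, 2))
# → {0: [0, 2, 4], 1: [1, 3, 5]}
```

Note that if parallelism exceeds the partition count, some subtasks simply get no partitions and stay idle, which is why setting the parallelism explicitly matters when the number of partitions behind a pattern can grow.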
Thanks again,
--
Christophe

On Mon, Feb 5, 2018 at 9:52 AM, Tzu-Li (Gordon) Tai <[email protected]> wrote:

> Hi Christophe,
>
> You can set the parallelism of the FlinkKafkaConsumer independently of the
> total number of Kafka partitions (across all subscribed streams, including
> newly created streams that match a subscribed pattern).
>
> The consumer deterministically assigns each partition to a single consumer
> subtask, in a round-robin fashion.
> E.g. if the parallelism of your FlinkKafkaConsumer is 2 and there are 6
> partitions, each consumer subtask will be assigned 3 partitions.
>
> As for topic pattern subscription, FlinkKafkaConsumers starting from
> version 1.4.0 support this feature. You can take a look at [1] on how to
> do that.
>
> Hope this helps!
>
> Cheers,
> Gordon
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/connectors/kafka.html#kafka-consumers-topic-and-partition-discovery
>
> On 3 February 2018 at 6:53:47 PM, Christophe Jolif ([email protected]) wrote:
>
> Hi,
>
> If I'm sourcing from a KafkaConsumer, do I have to explicitly set the
> Flink job parallelism to the number of partitions, or will it adjust
> automatically accordingly? In other words, if I don't call setParallelism,
> will I get 1 or the number of partitions?
>
> The reason I'm asking is that I'm listening to a topic pattern, not a
> single topic, and the number of actual topics (and so partitions) behind
> the pattern can change, so it is not possible to know ahead of time how
> many partitions I will get.
>
> Thanks!
> --
> Christophe
