Hi Sagar,

At the moment, the number of partitions in the Kafka source topics and the 
parallelism of the Flink Kafka source operator are completely independent. 
Flink internally distributes the partitions among however many parallel 
source subtasks you configure. With dynamic partition or topic discovery 
enabled, newly discovered partitions are also assigned automatically while 
the job is running.
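For reference, topic-pattern and partition discovery is enabled by setting a discovery interval in the consumer properties passed to the Flink Kafka consumer (the key is "flink.partition-discovery.interval-millis"). A minimal sketch; the broker address and group id below are placeholders:

```java
import java.util.Properties;

public class DiscoveryProps {
    public static Properties kafkaProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // placeholder broker
        props.setProperty("group.id", "my-consumer-group");    // placeholder group id
        // Re-check Kafka for new partitions (and, when the consumer was created
        // with a topic pattern, for new matching topics) every 30 seconds.
        props.setProperty("flink.partition-discovery.interval-millis", "30000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(
            kafkaProps().getProperty("flink.partition-discovery.interval-millis"));
    }
}
```

These properties would then be passed to the FlinkKafkaConsumer constructor together with the topic pattern.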

The job or source parallelism can be set, e.g., to the total number of Kafka 
partitions over all topics, if that is known in advance. To determine it 
programmatically, you can query the partition counts with the Kafka client.
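Since you mention that the topic <-> partition-count mapping is accessible, the total is just a sum over that map. A sketch, with a hard-coded map standing in for what you would obtain from the Kafka client (the topic names and counts below are made up):

```java
import java.util.Map;

public class TotalParallelism {
    // Sum the partition counts over all matched topics. In practice the map
    // would be built from Kafka metadata (e.g. via the Kafka client);
    // it is hard-coded here purely for illustration.
    public static int totalPartitions(Map<String, Integer> partitionsPerTopic) {
        return partitionsPerTopic.values().stream()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        Map<String, Integer> topics = Map.of(
                "my_topic_a", 4,   // hypothetical topics and partition counts
                "my_topic_b", 8,
                "my_topic_c", 12);
        int parallelism = totalPartitions(topics);
        System.out.println(parallelism); // prints 24
        // The result would then be used as, e.g.:
        // env.addSource(consumer).setParallelism(parallelism);
    }
}
```

Whether matching the partition count is actually the best parallelism depends on your throughput; Flink is happy with fewer subtasks than partitions (subtasks then read several partitions each).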

Cheers,
Andrey

> On 18 Jul 2018, at 07:54, sagar loke <sagar...@gmail.com> wrote:
> 
> Hi,
> 
> We have a use case where we are consuming from more than 100 Kafka 
> topics. Each topic has a different number of partitions. 
> 
> As per the documentation, to parallelize a Kafka topic, we need to set 
> setParallelism() == number of Kafka partitions for that topic. 
> 
> But if we are consuming multiple topics in Flink by providing a pattern, e.g. 
> my_topic_*, and each topic has a different partition configuration, 
> 
> then how should we connect all these together so that we can map Kafka 
> partitions to Flink parallelism correctly and programmatically (so that we 
> don't have to hard-code all the topic names and parallelisms -- considering we 
> can access the kafka topic <-> number of partitions mapping in Flink)?
> 
> Thanks,
