Hi Sriram, A short answer: the interval of polling is adjusted “dynamically” (by blocking the KafkaConsumer#poll call) according to the traffic of data.
I think this line [1] is what you are looking for. Basically KafkaSource fires KafkaPartitionSplitReader.fetch calls repeatedly in a loop, and each call invokes KafkaConsumer.poll(). If the data traffic is quite high on Kafka, the poll request will be returned immediately with new data, so the interval of polls approximately equals to the latency of poll request. If the traffic is relatively low, the poll request will be blocked until new data is available on Kafka or the request times out (which is set to 10 seconds in KafkaSource). Hope this could be helpful! [1] https://github.com/apache/flink/blob/d1cb177c91b41a5387814ad60d1799c08caf3ad9/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/KafkaPartitionSplitReader.java#L101 Best, Qingsheng On Oct 8, 2022, 02:44 +0800, Sriram Ganesh <[email protected]>, wrote: > Hi Everyone, > > I am trying to understand how Flink works in realtime with Kafka. Since > Kafka works on polling, what will be the minimal time for Flink to poll > Kafka?. > > Any explanation or documentation will be helpful. > > Thanks, > Sriram G
