Qingsheng Ren created FLINK-22147:
-------------------------------------

             Summary: Refactor Partition Discovery Logic in 
KafkaSourceEnumerator
                 Key: FLINK-22147
                 URL: https://issues.apache.org/jira/browse/FLINK-22147
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Kafka
    Affects Versions: 1.13.0
            Reporter: Qingsheng Ren


Currently the logic of partition discovery is: the worker thread checks if 
there's new partitions and initialize new splits if so, then coordinator thread 
marks these splits as pending and try to make assignments.

Under current design, the worker thread needs to keep an internal data 
structure tracking already discovered partitions, which is duplicated with 
pending splits + assigned partitions tracked by coordinator thread. Usually 
this kind of double-bookkeeping is fragile. 

Another issue is that the worker thread always fetches descriptions of ALL 
topics at partition discovery, which will comes to a problem working with a 
giant Kafka clusters with millions of topics/partitions. 

In order to fix issues above, a refactor is needed for the partition discovery 
logic in Kafka enumerator. Basically the logic can be changed to:
 # The worker thread fetches descriptions of subscribed topics/partitions, then 
hands over to coordinator thread
 # The coordinator thread filters out already discovered partitions (pending + 
assigned partitions), then invokes worker thread with {{callAsync}} to fetch 
offsets for new partitions
 #  The worker thread fetches offsets and creates splits for new partitions, 
then hands over new splits to coordinator thread
 # The coordinator thread marks these splits as pending and try to make 
assignment. 

Discussion of this issue can be found in 
[https://github.com/apache/flink/pull/15461] .



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to