Hi, Hongshun. It's a good question.

I also noticed this problem before.

Because the behavior of removedPartitions is different from that of
newPartitions, even if removedPartitions is detected or notified to the
task side,
the task cannot be deleted directly and still needs to be read to prevent
data loss.
The high probability of the #TODO here is that there is currently no
elegant way to deal with removedPartitions.

In fact, there are some other problems here, because removedPartitions is
not cleaned up in the memory state of KafkaSourceEnumerator,
so the redundant log of removedPartitions will be printed every time of
getPartitionChange.

You can try to discuss and fix this problem if you are interested.

Best Regards,
Ran Tao


Hongshun Wang <loserwang1...@gmail.com> 于2023年5月10日周三 02:34写道:

> Hi Devs,
>
> There are some to-do comments and variables related to the removed
> partitions handle (such as PartitionSplitChange#removedPartitions) in
> KafkaSourceEnumerator that have been around for years. I am wondering
> whether  we still need to implement it or  no longer necessary to handle
> the removed partitions? If no longer necessary, why not remove it?
>
> Best,
>
> Hongshun
>

Reply via email to