[ 
https://issues.apache.org/jira/browse/FLINK-19934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225572#comment-17225572
 ] 

Steven Zhen Wu commented on FLINK-19934:
----------------------------------------

[~sewen] we thought about that. the difference is that the no-op callable will 
be executed in the IO worker thread pool first, where threads may be tied up 
for a significantly long time (e.g. split planning can take dozens of seconds). 
Then the handler function got executed in the coordinator pool. The potential 
long delay in the IO thread pool step is what we are trying to avoid

> [FLIP-27 source] add new API: SplitEnumeratorContext.execute(Runnable)
> ----------------------------------------------------------------------
>
>                 Key: FLINK-19934
>                 URL: https://issues.apache.org/jira/browse/FLINK-19934
>             Project: Flink
>          Issue Type: New Feature
>          Components: API / DataStream
>    Affects Versions: 1.11.2
>            Reporter: Steven Zhen Wu
>            Priority: Major
>
> Here is the motivation use case. We are implementing event-time alignment 
> across sources in Iceberg source. Basically, each Iceberg source/enumerator 
> tracks its watermark using min/max timestamps captures in the column stats of 
> the data files.
> When the watermark from another source advances, notified source/enumerator 
> can try `assignSplits` as constraints may be satisfied now. This callback is 
> initiated from the coordinator thread from the other source. If we have 
> `SplitEnumeratorContext.execute(Runnable r)` API, we can ensure that all the 
> actions by enumerator and assigner are serialized by the coordinator thread. 
> That can avoid the need of locks.
> [~becket_qin] [~sewen] what do you think? cc [~sundaram]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to