Adamyuanyuan commented on PR #10139:
URL: https://github.com/apache/seatunnel/pull/10139#issuecomment-3611314895

   In the Spark 2 e2e test (`jdbc_mysql_source_and_sink_parallel.conf`), each 
`ParallelSource` instance owns its own `JdbcSourceSplitEnumerator`, but the 
splitter still creates “virtual” pending splits for all subtask ids, 
including subtasks whose readers never register with that enumerator. 
   The initial implementation of `maybeSignalNoMoreSplits` required both 
`pendingTables` and `pendingSplits` to be globally empty, so in the Spark 2 
case the condition was never met and `NoMoreSplitsEvent` was never sent, 
causing the batch job to hang until CI timeout. 
   The logic has been relaxed to only require that there are no pending splits 
for the **currently registered readers** (while all tables have been 
enumerated). This fixes the Spark 2 e2e timeout while preserving the intended 
failover semantics for Flink / SeaTunnel Engine.
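
   A minimal sketch of the difference between the two conditions, not the actual connector code. The class `NoMoreSplitsSketch`, the `registeredReaders` field, the two method names, and the split/table strings are hypothetical; only `pendingTables` and `pendingSplits` are names taken from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative model of an enumerator's bookkeeping (hypothetical, simplified).
class NoMoreSplitsSketch {
    // Tables that have not been split yet.
    final List<String> pendingTables = new ArrayList<>();
    // Splits waiting to be assigned, keyed by subtask id.
    final Map<Integer, List<String>> pendingSplits = new HashMap<>();
    // Subtask ids of readers that actually registered with this enumerator.
    final Set<Integer> registeredReaders = new TreeSet<>();

    // Initial condition: everything must be empty globally.
    boolean strictCondition() {
        return pendingTables.isEmpty()
                && pendingSplits.values().stream().allMatch(List::isEmpty);
    }

    // Relaxed condition: all tables enumerated, and nothing pending
    // for the readers that are currently registered.
    boolean relaxedCondition() {
        if (!pendingTables.isEmpty()) {
            return false;
        }
        for (int reader : registeredReaders) {
            List<String> splits = pendingSplits.get(reader);
            if (splits != null && !splits.isEmpty()) {
                return false;
            }
        }
        return true;
    }

    // Assumed Spark 2-like situation: parallelism 2, but only reader 0
    // registers here; the split for subtask 1 is never picked up.
    static NoMoreSplitsSketch spark2LikeState() {
        NoMoreSplitsSketch s = new NoMoreSplitsSketch();
        s.registeredReaders.add(0);
        s.pendingSplits.put(0, new ArrayList<>());   // already assigned
        List<String> leftover = new ArrayList<>();
        leftover.add("table_a-split-1");             // "virtual" split
        s.pendingSplits.put(1, leftover);
        return s;
    }
}
```

   In this assumed state the strict check stays false indefinitely because of the leftover split for subtask 1, while the relaxed check is true as soon as reader 0 has nothing pending, so the end-of-input signal can be sent.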


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
