zoucao created FLINK-26726:
------------------------------

             Summary: Remove the unregistered task from readersAwaitingSplit
                 Key: FLINK-26726
                 URL: https://issues.apache.org/jira/browse/FLINK-26726
             Project: Flink
          Issue Type: Improvement
          Components: Table SQL / Ecosystem
            Reporter: zoucao
         Attachments: stack.txt

Recently, we hit a problem caused by an unregistered task when using a 
Hive table as a streaming source. 
I think the cause is that we do not remove unregistered tasks from 
`readersAwaitingSplit` in `ContinuousHiveSplitEnumerator` and 
`ContinuousFileSplitEnumerator`.

Assume we have two tasks, 0 and 1. If no new file appears in the path for a 
long time, both of them end up in `readersAwaitingSplit`. Then a new split is 
generated and assigned to task-1. Unfortunately, task-1 cannot consume the 
split, and the resulting exception causes all tasks to restart. The failover 
does not affect `readersAwaitingSplit`, but it clears 
`SourceCoordinatorContext#registeredReaders`.
After the restart, task-0 is still in `readersAwaitingSplit` but not in 
`registeredReaders`. If task-1 registers first and sends a split request, the 
SplitEnumerator tries to assign splits to both task-1 and task-0, even though 
task-0 has not registered yet.
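
A minimal sketch of the proposed fix, using plain Java collections rather than 
the real Flink classes: before handing out a split, drop every entry of 
`readersAwaitingSplit` whose task is no longer registered. The class 
`AwaitingReaderSketch`, the method `assignSplits`, and its parameters are 
hypothetical names for illustration. In the real enumerators the check would 
presumably go against the registered readers exposed by the enumerator context.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical stand-in for the enumerator's assignment loop; not Flink code.
public class AwaitingReaderSketch {

    /**
     * Assigns pending splits to awaiting readers. Readers that are not in
     * registeredReaders (for example, because a failover cleared them) are
     * removed from readersAwaitingSplit instead of receiving a split.
     */
    static Map<Integer, String> assignSplits(
            Map<Integer, String> readersAwaitingSplit, // subtask id -> hostname
            Set<Integer> registeredReaders,            // currently registered subtask ids
            Queue<String> pendingSplits) {             // splits waiting to be assigned
        Map<Integer, String> assignments = new HashMap<>();
        Iterator<Map.Entry<Integer, String>> it =
                readersAwaitingSplit.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, String> awaiting = it.next();
            // Stale entry: the task asked for a split before the restart
            // and has not re-registered since.
            if (!registeredReaders.contains(awaiting.getKey())) {
                it.remove();
                continue;
            }
            String split = pendingSplits.poll();
            if (split == null) {
                break; // no more splits; remaining readers keep waiting
            }
            assignments.put(awaiting.getKey(), split);
            it.remove();
        }
        return assignments;
    }

    public static void main(String[] args) {
        // State after the failover described above: task-0 is a stale entry,
        // task-1 has re-registered and requested a split.
        Map<Integer, String> awaiting = new LinkedHashMap<>();
        awaiting.put(0, "host-0");
        awaiting.put(1, "host-1");
        Queue<String> splits = new ArrayDeque<>(List.of("split-a", "split-b"));

        Map<Integer, String> assigned = assignSplits(awaiting, Set.of(1), splits);
        System.out.println("assigned=" + assigned + ", stillAwaiting=" + awaiting);
    }
}
```

With this check, only task-1 receives a split and the stale task-0 entry is 
discarded. task-0 is added again once it re-registers and sends its own request.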


The stack trace is in the attachment (stack.txt).

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
