[ 
https://issues.apache.org/jira/browse/FLINK-26726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509555#comment-17509555
 ] 

zoucao commented on FLINK-26726:
--------------------------------

Gentle ping [~lzljs3620320], could you help me confirm this? The exception
message is in the attachment; please correct me if I missed something.

> Remove the unregistered  task from readersAwaitingSplit
> -------------------------------------------------------
>
>                 Key: FLINK-26726
>                 URL: https://issues.apache.org/jira/browse/FLINK-26726
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Ecosystem
>            Reporter: zoucao
>            Priority: Major
>         Attachments: stack.txt
>
>
> Recently, we hit a problem caused by an unregistered task when using a
> Hive table as a source for streaming reads.
> I think the problem is that we do not remove unregistered tasks from
> `readersAwaitingSplit` in `ContinuousHiveSplitEnumerator` and
> `ContinuousFileSplitEnumerator`.
> Assume that we have two tasks, 0 and 1, and that both are in
> `readersAwaitingSplit` because no new file has appeared in the path for
> a long time. Then a new split is generated and assigned to task-1.
> Unfortunately, task-1 cannot consume the split successfully, so an
> exception is thrown and causes all tasks to restart. The failover does
> not affect `readersAwaitingSplit`, but it clears
> `SourceCoordinatorContext#registeredReaders`.
> After the restart, task-0 is in `readersAwaitingSplit` but not in
> `registeredReaders`. If task-1 registers first and sends a split request,
> the SplitEnumerator will try to assign splits to both task-1 and task-0,
> but task-0 has not been registered yet.
> The stack trace is in the attachment.
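
The scenario described above can be replayed with a small standalone model.
This is only a sketch of the proposed idea, not the actual Flink code: the
class `AwaitingReaderSketch`, the method names, and the string-based splits
are hypothetical, and only the field names `readersAwaitingSplit` and
`registeredReaders` are taken from the description. It assumes that a reader
dropped from the waiting list will request a split again once it registers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the enumerator state (not a Flink class). */
final class AwaitingReaderSketch {
    // Subtask id -> hostname of readers waiting for a split; survives failover.
    final Map<Integer, String> readersAwaitingSplit = new LinkedHashMap<>();
    // Subtask ids that are currently registered; cleared on failover.
    final Set<Integer> registeredReaders = new LinkedHashSet<>();
    // Splits that have been discovered but not yet assigned.
    final Deque<String> pendingSplits = new ArrayDeque<>();

    /** Assigns pending splits, dropping waiting readers that are not registered. */
    List<String> assignSplits() {
        List<String> assignments = new ArrayList<>();
        Iterator<Map.Entry<Integer, String>> it =
                readersAwaitingSplit.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, String> reader = it.next();
            if (!registeredReaders.contains(reader.getKey())) {
                // Stale entry left over from before the failover: remove it
                // instead of assigning a split to an unregistered task.
                it.remove();
                continue;
            }
            String split = pendingSplits.poll();
            if (split == null) {
                break; // nothing left to assign; remaining readers keep waiting
            }
            assignments.add("task-" + reader.getKey() + " <- " + split);
            it.remove();
        }
        return assignments;
    }

    /** Replays the state after the restart, with task-1 registering first. */
    static List<String> replayScenario() {
        AwaitingReaderSketch enumerator = new AwaitingReaderSketch();
        // Both tasks were waiting before the failover; the map was not cleared.
        enumerator.readersAwaitingSplit.put(0, "host-0");
        enumerator.readersAwaitingSplit.put(1, "host-1");
        // After the restart only task-1 has registered so far.
        enumerator.registeredReaders.add(1);
        enumerator.pendingSplits.add("split-a");
        return enumerator.assignSplits();
    }

    public static void main(String[] args) {
        System.out.println(replayScenario());
    }
}
```

In this model, without the `registeredReaders` check the loop would hand the
split to task-0, which is the situation reported in the issue.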



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
