isunjin commented on issue #6684: [FLINK-10205] Batch Job: InputSplit Fault tolerant for DataSource…
URL: https://github.com/apache/flink/pull/6684#issuecomment-431950619

@tillrohrmann

> The thing I'm questioning is whether the InputSplits of the failed task need to be processed by the same (restarted) task or can be given to any running task.

Agree. I don't think the `InputSplits` of the failed task necessarily need to be processed by the same task (`ExecutionVertex`).

> So far I'm not convinced that something would break if we simply return the InputSplits to the InputSplitAssigner

Agree. I think simply returning the `InputSplits` to the `InputSplitAssigner` would work; the point is how to make it work.

Restarting the entire graph calls `ExecutionJobVertex.resetForNewExecution`, which creates a new `InputSplitAssigner` and thereby "returns" all `InputSplits` to the `InputSplitAssigner`. My point is that for fine-grained failover we might not want to return all `InputSplits`, only the `InputSplits` of the failed task. However, currently not every subclass of `InputSplitAssigner` has the logic to take `InputSplits` back, for example `LocatableInputSplitAssigner` or any customized `InputSplitAssigner`.

Simply returning the `InputSplits` to the `InputSplitAssigner` also implies a transaction between the task and the JobManager (maybe multiple ones): we need to make sure the `InputSplits` are returned to the `InputSplitAssigner` exactly once. What happens if we have speculative execution, where two tasks consume the same set of `InputSplits` but do not fail at the same time? Does every `InputSplitAssigner` need to keep a list to deduplicate? What happens if the TM dies or has a network issue and the `InputSplits` cannot be returned?
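To make the exactly-once concern concrete, here is a minimal sketch of JobManager-side bookkeeping that puts the splits of a failed attempt back into the pool at most once. It is only an illustration: `ReturnableSplitPool`, `getNextSplit(String)` and `returnSplitsOf(String)` are hypothetical names, splits are reduced to integer ids, and none of this is Flink's actual `InputSplitAssigner` interface.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified split pool; not Flink's InputSplitAssigner. */
class ReturnableSplitPool {
    // Splits that have not been handed out yet (or were returned).
    private final Deque<Integer> unassigned = new ArrayDeque<>();
    // Splits handed out so far, grouped by the attempt that received them.
    private final Map<String, List<Integer>> assignedByAttempt = new HashMap<>();
    // Attempts whose splits were already returned; used to deduplicate returns.
    private final Set<String> returnedAttempts = new HashSet<>();

    ReturnableSplitPool(List<Integer> splitIds) {
        unassigned.addAll(splitIds);
    }

    /** Hands out the next split and records which attempt received it. */
    synchronized Integer getNextSplit(String attemptId) {
        Integer split = unassigned.poll();
        if (split != null) {
            assignedByAttempt.computeIfAbsent(attemptId, k -> new ArrayList<>()).add(split);
        }
        return split; // null means no split is left
    }

    /**
     * Puts the splits of a failed attempt back into the pool.
     * A repeated call for the same attempt is a no-op, so a duplicated
     * failure notification cannot return the same splits twice.
     *
     * @return the number of splits put back by this call
     */
    synchronized int returnSplitsOf(String attemptId) {
        if (!returnedAttempts.add(attemptId)) {
            return 0; // already returned once
        }
        List<Integer> splits = assignedByAttempt.remove(attemptId);
        if (splits == null) {
            return 0; // this attempt never received a split
        }
        unassigned.addAll(splits);
        return splits.size();
    }
}
```

Because the pool itself remembers which attempt received which split, the return can be triggered on the JobManager side when a failure is detected, without the TM sending anything back. The sketch gives each split to a single attempt, so it does not cover speculative execution, where two attempts read the same splits.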
Saving the `InputSplits` in the `ExecutionVertex` is one way to "return" them to the `InputSplitAssigner`. The side effect of this implementation is that the `InputSplits` will be handled by the same task (`ExecutionVertex`). But this seems a simple and safe way to implement returning the `InputSplits` to the `InputSplitAssigner` with transactional semantics.

@tillrohrmann, the above is my understanding; let me know if we are on the same page. I would be happy to redo this if you have any other suggestions.
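The idea of saving the splits in the vertex can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in the PR: `SplitReplayingVertex` is a hypothetical class, splits are integer ids, and the shared assigner is modeled as a `Supplier<Integer>` that yields `null` once it is exhausted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical, simplified vertex; not Flink's ExecutionVertex. */
class SplitReplayingVertex {
    // Every split this vertex has ever received, in assignment order.
    private final List<Integer> history = new ArrayList<>();
    // Position of the current attempt within the history.
    private int cursor = 0;
    // Stand-in for the shared assigner; returns null when no split is left.
    private final Supplier<Integer> assigner;

    SplitReplayingVertex(Supplier<Integer> assigner) {
        this.assigner = assigner;
    }

    /** Replays remembered splits first, then asks the shared assigner. */
    synchronized Integer getNextSplit() {
        if (cursor < history.size()) {
            return history.get(cursor++); // replay after a restart
        }
        Integer split = assigner.get();
        if (split != null) {
            history.add(split);
            cursor++;
        }
        return split;
    }

    /** Called when the vertex is restarted; the new attempt replays from the start. */
    synchronized void resetForNewExecution() {
        cursor = 0;
    }
}
```

In this sketch nothing is ever handed back to the shared assigner, so the assigner needs no deduplication logic. The cost is the side effect described above: the remembered splits are always re-read by the same vertex.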
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services