isunjin commented on issue #6684: [FLINK-10205] Batch Job: InputSplit Fault tolerant for DataSource…
URL: https://github.com/apache/flink/pull/6684#issuecomment-427127694
 
 
   @StefanRRichter thanks for the comments.
   - For the inconsistency issue, 
[this](https://github.com/isunjin/flink/commit/b61b58d963ea11d34e2eb7ec6f4fe4bfed4dca4a)
 is the repro. The logic is simple: we throw an exception in the WordCount 
example and use restartRegion as the failover strategy. The job was expected to 
fail, but it succeeded with an incorrect result. The reason is that on restart 
the task calls requestNextSplit again, which returns nothing because the splits 
were already drained. Since there is no split to read, the flatMap method is 
never executed and the exception is never thrown.
   
   - The goal of the general approach is to preserve the assumption of 
"deterministic behavior" as much as possible, since determinism is crucial for 
failover. The code is not aimed at introducing "deterministic" behavior for 
DataSourceTask; right now DataSourceTask is only used in the batch scenario. For 
the streaming scenario, it will work once we treat the splitIndex as state.
   
   - For load balancing, I think the first priority is to make the data 
consistent; we can certainly add more logic to make it more efficient.
   
   - Thanks for letting me know about this. However, this is a bug right now 
and it is actually blocking me from moving forward. We can refactor this code 
if we end up with a fundamentally different design.
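
The failure mode described in the first bullet can be sketched with a toy model. This is only an illustration under stated assumptions, not Flink's actual code: `SplitReplaySketch`, `DrainingAssigner`, `ReplayingAssigner`, `resetTask`, and the split names are all hypothetical, and only the method name `requestNextSplit` is borrowed from the discussion above. The draining assigner hands out each split once, so a restarted task sees nothing and "succeeds"; the replaying assigner remembers what each task received, so the restart hits the same failing split again.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model only: these classes are hypothetical and are not Flink's API.
public class SplitReplaySketch {

    interface Assigner {
        String requestNextSplit(int taskIndex); // null means "no more splits"
    }

    /** Hands out every split exactly once; a restarted task finds the queue drained. */
    static class DrainingAssigner implements Assigner {
        private final Queue<String> pending;
        DrainingAssigner(List<String> splits) { pending = new ArrayDeque<>(splits); }
        public String requestNextSplit(int taskIndex) { return pending.poll(); }
    }

    /** Records the splits given to each task so a restart can replay them in order. */
    static class ReplayingAssigner implements Assigner {
        private final Queue<String> pending;
        private final Map<Integer, List<String>> assigned = new HashMap<>();
        private final Map<Integer, Integer> cursor = new HashMap<>();
        ReplayingAssigner(List<String> splits) { pending = new ArrayDeque<>(splits); }

        public String requestNextSplit(int taskIndex) {
            List<String> history = assigned.computeIfAbsent(taskIndex, k -> new ArrayList<>());
            int pos = cursor.getOrDefault(taskIndex, 0);
            if (pos == history.size()) {          // nothing left to replay: take a fresh split
                String next = pending.poll();
                if (next == null) return null;
                history.add(next);
            }
            cursor.put(taskIndex, pos + 1);
            return history.get(pos);
        }

        /** Called on restart: the task starts again from its first recorded split. */
        void resetTask(int taskIndex) { cursor.put(taskIndex, 0); }
    }

    /** One task attempt; the "user function" always throws on the split named "bad". */
    static int runAttempt(Assigner assigner, int taskIndex) {
        int processed = 0;
        String split;
        while ((split = assigner.requestNextSplit(taskIndex)) != null) {
            if (split.equals("bad")) {
                throw new IllegalStateException("injected failure in split " + split);
            }
            processed++;
        }
        return processed;
    }

    /** Runs task 0, restarts it once on failure, and reports whether the job "succeeded". */
    static boolean jobSucceedsAfterOneRestart(Assigner assigner, Runnable onRestart) {
        try {
            runAttempt(assigner, 0);
            return true;
        } catch (IllegalStateException firstFailure) {
            onRestart.run();
            try {
                runAttempt(assigner, 0);
                return true;
            } catch (IllegalStateException secondFailure) {
                return false;
            }
        }
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList("good", "bad");

        DrainingAssigner draining = new DrainingAssigner(splits);
        // Restarted attempt sees no splits, so the failure is silently lost.
        System.out.println(jobSucceedsAfterOneRestart(draining, () -> {})); // true

        ReplayingAssigner replaying = new ReplayingAssigner(splits);
        // Restarted attempt replays "good" then "bad", so the job fails as expected.
        System.out.println(jobSucceedsAfterOneRestart(replaying, () -> replaying.resetTask(0))); // false
    }
}
```

In this sketch the per-task cursor plays the role that a split index kept as state would play: resetting it on restart is what makes the second attempt see the same input as the first.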
   
   
    

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
