Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1084#issuecomment-137125224
  
    If it fails in the middle of writing, or before sync/flush is called on the writer, then the data can be in an inconsistent state. I see three ways of dealing with this; one of them is a longer-term solution.
    
    The long-term solution is to make the sink exactly-once aware, either by using the truncate() support added in Hadoop 2.7 or by using a custom thread that merges part files and throws away data that was erroneously written.
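
    A rough sketch of what the truncate() approach could look like on restore (this is not code from this PR; the class and method names are made up, and it assumes the sink records the valid length of the in-progress part file in its checkpoint state):

    ```java
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class TruncateOnRestoreSketch {

        /**
         * Cuts the in-progress part file back to the length that was valid at
         * the last successful checkpoint, discarding everything written after it.
         */
        public static void restoreTo(Path partFile, long checkpointedLength) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());

            // truncate() returns false when the last block has to be recovered
            // asynchronously; in that case, wait until the file actually has
            // the requested length before writing to it again.
            boolean truncated = fs.truncate(partFile, checkpointedLength);
            while (!truncated && fs.getFileStatus(partFile).getLen() != checkpointedLength) {
                try {
                    Thread.sleep(500);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("Interrupted while waiting for truncate to finish", e);
                }
            }
        }
    }
    ```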
    
    The two short-term options are:
     - Keep it as it is; consumers need to be able to deal with corrupt records and ignore them. This would give you at-least-once semantics.
     - Write to a temporary file. When rolling, close the current bucket and rename the file to its final filename (see the sketch after this list). This would ensure that the output doesn't contain corrupt records, but you would have neither at-least-once nor exactly-once semantics, because some written records would be lost if a checkpoint restore goes back to a state taken after writing of the current bucket file started.
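
    A rough sketch of the temporary-file option mentioned above (again not code from this PR; all names here are made up for illustration): records go into an ".in-progress" file, and the file only becomes visible under its final name when the bucket is rolled:

    ```java
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RenameOnRollSketch {

        private static final String IN_PROGRESS_SUFFIX = ".in-progress";

        private final FileSystem fs;
        private Path inProgressPath;
        private FSDataOutputStream out;

        public RenameOnRollSketch() throws IOException {
            this.fs = FileSystem.get(new Configuration());
        }

        /** Opens a new in-progress file that will later be renamed to finalPath. */
        public void openPart(Path finalPath) throws IOException {
            inProgressPath = finalPath.suffix(IN_PROGRESS_SUFFIX);
            out = fs.create(inProgressPath, false);
        }

        public void write(byte[] record) throws IOException {
            out.write(record);
        }

        /**
         * Rolls the bucket: flush and close the in-progress file, then rename
         * it to its final name, so readers only ever see complete files.
         */
        public void roll(Path finalPath) throws IOException {
            out.hflush();
            out.close();
            fs.rename(inProgressPath, finalPath);
        }
    }
    ```

    On failure, the in-progress file of the current bucket would simply be discarded on restart, which is where the lost-records problem described above comes from: records that were already written to it but are covered by the restored checkpoint are not replayed.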

