[ 
https://issues.apache.org/jira/browse/FLINK-11116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17328505#comment-17328505
 ] 

Flink Jira Bot commented on FLINK-11116:
----------------------------------------

This major issue is unassigned, and neither it nor any of its Sub-Tasks has 
been updated for 30 days, so it has been labeled "stale-major". If this ticket 
is indeed "major", please either assign yourself or give an update, and then 
remove the label. In 7 days the issue will be deprioritized.

> Overwrite outdated in-progress files in StreamingFileSink.
> ----------------------------------------------------------
>
>                 Key: FLINK-11116
>                 URL: https://issues.apache.org/jira/browse/FLINK-11116
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / FileSystem
>    Affects Versions: 1.7.0
>            Reporter: Kostas Kloudas
>            Priority: Major
>              Labels: pull-request-available, stale-major
>             Fix For: 1.7.3
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In order to guarantee exactly-once semantics, the streaming file sink 
> implements a two-phase commit protocol when writing files to the filesystem.
> Initially, data is written to in-progress files. These files are put into 
> the "pending" state when they are completed (based on the rolling policy), and 
> they are finally committed when the checkpoint that put them in the "pending" 
> state is acknowledged as complete.
> It follows that if we have:
> 1) checkpoints A, B, C taken, 
> 2) checkpoint A acknowledged, and 
> 3) a failure,
> then we may have files that do not belong to any checkpoint (because B and C 
> were not considered successful). These files are currently not cleaned up.
> In order to reduce the number of such files, we removed the random 
> suffix from in-progress temporary files, so that the next in-progress file 
> that is opened for the same part overwrites them.
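
The scenario in the description (checkpoints A, B, C, only A acknowledged, then a failure) can be sketched with a toy in-memory model. This is only an illustration and not Flink's StreamingFileSink code: the class ToySink, its methods, the ".part-N.inprogress" naming, and the use of a counter as a stand-in for the random suffix are all assumptions made for the example. It also assumes the restored job writes the same part indices again, which is the case where overwriting helps.

```python
import itertools


class ToySink:
    """Hypothetical model of in-progress -> pending -> committed files."""

    def __init__(self, random_suffix):
        self.random_suffix = random_suffix
        self.disk = {}          # file name -> contents; survives failures
        self.part = 0           # index of the next part file
        self.rolled = []        # files closed since the last checkpoint
        self.pending = {}       # checkpoint id -> (pending files, part counter)
        self.restore_part = 0   # part counter of the last acknowledged checkpoint
        self._suffix = itertools.count()  # stand-in for a random suffix

    def write_part(self, data):
        name = f".part-{self.part}.inprogress"
        if self.random_suffix:
            name += f".{next(self._suffix)}"
        self.disk[name] = data  # same name: an earlier leftover is overwritten
        self.rolled.append((name, f"part-{self.part}"))
        self.part += 1

    def checkpoint(self, cid):
        # Phase 1: files rolled so far become "pending" for this checkpoint.
        self.pending[cid] = (self.rolled, self.part)
        self.rolled = []

    def acknowledge(self, cid):
        # Phase 2: commit (rename) the files this checkpoint put into pending.
        files, part = self.pending.pop(cid)
        for tmp, final in files:
            self.disk[final] = self.disk.pop(tmp)
        self.restore_part = part

    def fail_and_restore(self):
        # State of unacknowledged checkpoints is lost; files on disk are not.
        self.rolled, self.pending = [], {}
        self.part = self.restore_part

    def orphans(self):
        tracked = {tmp for tmp, _ in self.rolled}
        for files, _ in self.pending.values():
            tracked |= {tmp for tmp, _ in files}
        return sorted(n for n in self.disk
                      if n.startswith(".part-") and n not in tracked)


def run(random_suffix):
    sink = ToySink(random_suffix)
    for cid in "ABC":
        sink.write_part(f"data-{cid}")
        sink.checkpoint(cid)
    sink.acknowledge("A")
    sink.fail_and_restore()
    after_failure = sink.orphans()
    # The job resumes from checkpoint A and writes parts 1 and 2 again.
    for cid in "DE":
        sink.write_part(f"data-{cid}")
        sink.checkpoint(cid)
        sink.acknowledge(cid)
    return after_failure, sink.orphans()
```

In this model, run(True) and run(False) both leave two in-progress files behind right after the failure. With suffixed names they are still present once the job has caught up, whereas with fixed names the rewritten parts replace them. Leftovers for parts that are never reopened would remain in either case, which matches the description's wording of reducing, not eliminating, such files.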



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
