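The streaming file sink does write to S3 while it is processing elements, but what it writes are only temporary files. Only after a successful checkpoint (more precisely, once it receives the notification that a checkpoint has completed) does it commit the temporary files written since the last successful checkpoint. So if a checkpoint fails, those files simply stay uncommitted until the next checkpoint that succeeds.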
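
To make the lifecycle concrete, here is a rough toy model in plain Java. It is only a sketch and not Flink's actual implementation: ToySink and its methods are made-up names, and real part files and S3 multipart uploads are reduced to lists of strings. It assumes data is handed over as "pending" when a checkpoint is triggered and becomes visible only when a completion notification arrives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class CommitOnCheckpointSketch {

    /** Hypothetical stand-in for the sink; not a Flink class. */
    public static final class ToySink {
        // Records written since the last checkpoint was triggered.
        private List<String> inProgress = new ArrayList<>();
        // Temporary data waiting for a completion notification, keyed by checkpoint id.
        private final TreeMap<Long, List<String>> pendingByCheckpoint = new TreeMap<>();
        // Data that has been committed and is visible to readers.
        private final List<String> committed = new ArrayList<>();

        public void write(String record) {
            inProgress.add(record);
        }

        /** A checkpoint is triggered: current data becomes pending under its id. */
        public void snapshot(long checkpointId) {
            pendingByCheckpoint.put(checkpointId, inProgress);
            inProgress = new ArrayList<>();
        }

        /** A checkpoint succeeded: commit everything pending up to and including its id. */
        public void notifyCheckpointComplete(long checkpointId) {
            NavigableMap<Long, List<String>> done = pendingByCheckpoint.headMap(checkpointId, true);
            for (List<String> batch : done.values()) {
                committed.addAll(batch);
            }
            done.clear(); // the view is backed by the map, so this removes the committed entries
        }

        public List<String> committed() {
            return new ArrayList<>(committed);
        }
    }

    public static void main(String[] args) {
        ToySink sink = new ToySink();
        sink.write("a");
        sink.write("b");
        sink.snapshot(1);                     // checkpoint 1 is triggered but fails: no notification
        System.out.println(sink.committed()); // prints []

        sink.write("c");
        sink.snapshot(2);
        sink.write("d");                      // written after checkpoint 2 was triggered
        sink.notifyCheckpointComplete(2);     // checkpoint 2 succeeds
        System.out.println(sink.committed()); // prints [a, b, c]
    }
}
```

In this model the records from the failed checkpoint are not lost. They stay pending and are committed together with the next successful checkpoint, which is the behaviour described above.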

Best regards,
Yuxia

From: "Xin Ma" <kevin.xin...@gmail.com>
To: "User" <user@flink.apache.org>
Sent: Thursday, June 30, 2022, 11:05:51 PM
Subject: StreamingFileSink & checkpoint tuning

Hi,

I recently encountered an issue while using StreamingFileSink.

I have a Flink job that consumes records from various sources and writes them to S3 with the streaming file sink. The job sometimes fails due to a checkpoint timeout, and the root cause is a checkpoint alignment failure, as there is data skew between the different data sources.

I don't want to enable unaligned checkpointing but would prefer to do some checkpoint tuning first. My current checkpoint interval is 1 min and the timeout is also 1 min. I want to increase the tolerable checkpoint failure number to 5, as I believe the unaligned subtasks will definitely update their watermark within 5 minutes.

My question is: will the streaming file sink still write to S3 even if the checkpoint fails, or will it just wait until the next successful checkpoint? (If we don't tolerate checkpoint failures, the job will simply restart from the last successful checkpoint.)

Thanks.

Best,
Kevin