The streaming file sink does write to S3 while it processes elements, but what
it writes at that point are only temporary files. Only after a checkpoint
succeeds (more precisely, once the sink receives the notification that a
checkpoint has completed) does it commit the temporary files written since the
last successful checkpoint.
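
To make the behavior concrete, here is a minimal sketch in plain Java. It is a
toy model and not Flink code: the class ToyFileSink and all of its method names
are made up for illustration, and it only mimics the "write temporary files,
commit on checkpoint notification" idea described above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the commit-on-checkpoint behavior; this is NOT Flink code.
// All names here are hypothetical and exist only for illustration.
class ToyFileSink {
    // Temporary files: already written out, but not yet committed.
    private final List<String> temporary = new ArrayList<>();
    // Committed files: the final output that readers should rely on.
    private final List<String> committed = new ArrayList<>();

    // Called for every record: data is written right away, as a temporary file.
    void processElement(String record) {
        temporary.add(record);
    }

    // Called only when a checkpoint succeeds. A failed or timed-out
    // checkpoint never triggers this, so nothing is committed for it.
    void notifyCheckpointComplete() {
        committed.addAll(temporary);
        temporary.clear();
    }

    // Returns a copy of the committed files.
    List<String> committedFiles() {
        return new ArrayList<>(committed);
    }

    // Number of temporary files still waiting for a successful checkpoint.
    int temporaryCount() {
        return temporary.size();
    }
}
```

In this model, if a checkpoint fails, notifyCheckpointComplete() is simply not
called: the records written so far stay temporary, and they are committed
together with the later ones at the next checkpoint that succeeds.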

Best regards, 
Yuxia 


From: "Xin Ma" <kevin.xin...@gmail.com> 
To: "User" <user@flink.apache.org> 
Sent: Thursday, June 30, 2022 11:05:51 PM 
Subject: StreamingFileSink & checkpoint tuning 

Hi, 

I recently encountered an issue while using StreamingFileSink. 
I have a Flink job that consumes records from various sources and writes them 
to S3 with the streaming file sink. The job sometimes fails due to a checkpoint 
timeout, and the root cause is a checkpoint alignment failure caused by data 
skew between the different data sources. 

I don't want to enable unaligned checkpointing yet and would prefer to do some 
checkpoint tuning first. 

My current checkpoint interval is 1 min and the timeout is also 1 min. I want 
to increase the tolerable checkpoint failure number to 5, as I believe the 
unaligned subtasks will definitely update their watermark within 5 minutes. My 
question is: will the streaming file sink still write to S3 even if the 
checkpoint fails, or will it just wait until the next successful checkpoint? 
(If we don't tolerate checkpoint failures, the job will simply restart from the 
last successful checkpoint.) 


Thanks. 

Best, 
Kevin 
