Re: Re: FLIP-413: Enable unaligned checkpoints by default

2024-01-09 Thread Piotr Nowojski
Hi! I think this warning from the documentation is a bit over the top. Yes, unaligned checkpoints do add an extra source of nondeterminism in that regard. However, please note that Flink doesn't give any guarantees that the results will be the same after a recovery, as the order of the records c

Re: Re: FLIP-413: Enable unaligned checkpoints by default

2024-01-08 Thread Mason Chen
Hi Piotr, I also agree with Zhanghao's assessment of the limitations of unaligned checkpoints. Some of them are already handled properly by Flink, but in the case of the "Interplay with watermarks" limitation, it is quite confusing for a new user to find that their code doesn't generate consistent

Re: FLIP-413: Enable unaligned checkpoints by default

2024-01-08 Thread Piotr Nowojski
Hi, thanks for the responses, and thanks for pointing out the job upgrade issue. Indeed, that had slipped my mind. I mistakenly thought that we support upgrades only via savepoints. Anyway, maybe in that case we should guide users towards that? Using savepoints for upgrades? That wou

Re: FLIP-413: Enable unaligned checkpoints by default

2024-01-08 Thread Zakelly Lan
Hi Piotr, Thanks for driving this! Generally, I support enabling the alignment timeout for aligned checkpoints. And I second Rui's opinion: 30s seems a reasonable value. However, I'm worried that there are some operators that do not support unaligned CP, which may cause data accuracy problems (as

Re: FLIP-413: Enable unaligned checkpoints by default

2024-01-07 Thread Rui Fan
Thanks to Piotr for driving this proposal! Enabling unaligned checkpoints with an aligned checkpoint timeout is fine with me. I'm not sure whether an aligned checkpoint timeout of 5s is too aggressive. If unaligned checkpoints are enabled by default for all jobs, I recommend that the aligned checkpoints timeout

Re: FLIP-413: Enable unaligned checkpoints by default

2024-01-07 Thread Zhanghao Chen
Hi Piotr, As a platform administrator who runs thousands of Flink jobs, I'd be against the idea of enabling unaligned CP by default for our jobs. It may help a significant portion of the users, but the subtle issues around unaligned CP for a few jobs will probably raise a lot more on-calls and incidents