Re: flink checkpoints adjustment strategy

2021-01-29 Thread Marco Villalobos
Do you have advice on how to determine why a checkpoint failed?

1. Timeouts are easy to discover, since the UI logs them.
2. Other errors are not so easy to find.

How can I find the other errors? Are they in the UI, or in good old-fashioned logging?

On Fri, Jan 29, 2021 at 3:11 AM Congxian Qiu wrote:

Re: flink checkpoints adjustment strategy

2021-01-29 Thread Congxian Qiu
Hi Marco,

You need to figure out why the checkpoint timed out (in the UI you can see how much time each phase of a checkpoint consumed). If the checkpoint really does need that long to complete, then you need to configure a longer timeout. If there are some checkpoint errors, we need
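
The same per-checkpoint information shown in the UI can also be pulled from the JobManager REST endpoint `/jobs/<jobid>/checkpoints`. The following is a minimal sketch, not an official tool: it assumes the response carries a `history` list whose entries have `id`, `status`, `end_to_end_duration`, and `failure_message` fields, and the base URL, job id, and sample values are hypothetical. Check the field names against the REST API documentation for your Flink version.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def fetch_checkpoint_stats(base_url: str, job_id: str) -> Dict[str, Any]:
    """Fetch checkpoint statistics for one job from the JobManager REST API.

    base_url is something like "http://localhost:8081" (hypothetical).
    """
    with urlopen(f"{base_url}/jobs/{job_id}/checkpoints") as resp:
        return json.load(resp)


def failed_checkpoints(stats: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return id, duration, and failure message of every failed checkpoint
    found in the (assumed) "history" list of the statistics."""
    failed = []
    for cp in stats.get("history", []):
        if cp.get("status") == "FAILED":
            failed.append({
                "id": cp.get("id"),
                "duration_ms": cp.get("end_to_end_duration"),
                "message": cp.get("failure_message"),
            })
    return failed


# Made-up sample in the assumed response shape, for illustration only.
SAMPLE_STATS = {
    "history": [
        {"id": 12, "status": "COMPLETED", "end_to_end_duration": 42000},
        {"id": 13, "status": "FAILED", "end_to_end_duration": 600000,
         "failure_message": "Checkpoint expired before completing."},
    ]
}

if __name__ == "__main__":
    # With a running cluster, replace SAMPLE_STATS with
    # fetch_checkpoint_stats("http://localhost:8081", "<jobid>").
    for cp in failed_checkpoints(SAMPLE_STATS):
        print(cp["id"], cp["duration_ms"], cp["message"])
```

A failed checkpoint whose duration sits right at the configured timeout points to the timeout case. Any other failure message is a starting point for searching the JobManager and TaskManager logs.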

flink checkpoints adjustment strategy

2021-01-28 Thread Marco Villalobos
I am kind of stuck on determining how large a checkpoint interval should be. Is there a guide for that? If the timeout is 10 minutes and we time out, what is a good strategy for adjusting it? What is a good starting point for checkpoint settings, and how should they be adjusted? We often see checkpoint