Hi Kent,

You can view the checkpoint details in the web UI to see how much checkpointed data each operator uploaded. Comparing the state size over time then shows whether each operator's checkpointed data stays within a stable range.
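The same comparison can be scripted against the JobManager's REST API instead of being read off the web UI. The following is only a minimal sketch: it assumes the checkpoint details response of `/jobs/<jobid>/checkpoints/details/<checkpointid>` carries a `tasks` map keyed by job vertex id with a `state_size` field in bytes, which should be verified against the Flink version in use. The helper names, the base URL, and the sample numbers are hypothetical.

```python
import json
import urllib.request
from typing import Dict, List, Tuple


def fetch_details(base_url: str, job_id: str, checkpoint_id: int) -> dict:
    """Fetch the details of one checkpoint from the JobManager REST API.

    Assumption: the endpoint path and response layout match the running
    Flink version; check the REST API documentation before relying on it.
    """
    url = f"{base_url}/jobs/{job_id}/checkpoints/details/{checkpoint_id}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def state_size_by_task(details: dict) -> Dict[str, int]:
    """Map each task (job vertex) id to its reported state size in bytes."""
    tasks = details.get("tasks", {})  # assumed field name
    return {task_id: int(stats.get("state_size", 0))
            for task_id, stats in tasks.items()}


def growth_by_task(older: dict, newer: dict) -> List[Tuple[str, int]]:
    """Return (task id, growth in bytes) pairs, largest growth first."""
    before = state_size_by_task(older)
    after = state_size_by_task(newer)
    growth = [(task_id, size - before.get(task_id, 0))
              for task_id, size in after.items()]
    return sorted(growth, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data standing in for two real checkpoint details.
    older = {"tasks": {"source": {"state_size": 1_000},
                       "window": {"state_size": 50_000}}}
    newer = {"tasks": {"source": {"state_size": 1_000},
                       "window": {"state_size": 90_000}}}
    for task_id, delta in growth_by_task(older, newer):
        print(f"{task_id}: {delta:+d} bytes")
```

Running this for two checkpoints taken some time apart puts the tasks with the fastest-growing state at the top of the list, which narrows down where to look first.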
Best
Yun Tang
________________________________
From: Kent Murra <ke...@remitly.com>
Sent: Saturday, April 18, 2020 1:47
To: user@flink.apache.org <user@flink.apache.org>
Subject: Checkpoint Space Usage Debugging

I'm looking into a situation where our checkpoint sizes are automatically growing over time. I'm unable to pinpoint exactly why this is happening, and it would be great if there was a way to figure out how much checkpoint space is attributable to each operator so I can narrow it down. Are there any tools or methods for introspecting the checkpoint data so that I can determine where the space is going?

The pipeline in question is consuming from Kinesis and batching up data using windows. I suspected that I was doing something wrong with windowing, but I'm emitting FIRE_AND_PURGE and also setting a max end timestamp. The Kinesis consumer is not emitting watermarks at the moment, but as far as I know that's not necessary for proper checkpointing (only for exactly-once behavior).