Hi Kent

You can view the checkpoint details in the web UI to see how much checkpointed
data each operator uploaded, and compare the state size over time to see
whether each operator's checkpointed data stays within a stable range.
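
If you would rather script this comparison than click through the UI, the same numbers should be available from the REST API under /jobs/<jobid>/checkpoints/details/<checkpointid>. Below is a minimal sketch, not a tested tool. The helper names are made up for illustration, and the "tasks" and "state_size" fields are what I understand the response to contain, so please check them against your Flink version. Note that the breakdown is per task (job vertex), so chained operators are reported together, and with incremental checkpoints the reported size may cover only the newly uploaded part.

```python
import json
from urllib.request import urlopen


def fetch_checkpoint_details(base_url, job_id, checkpoint_id):
    """Fetch one checkpoint's details from the Flink REST API.

    base_url is the JobManager REST address, e.g. "http://localhost:8081"
    (an assumed default; adjust for your cluster).
    """
    url = f"{base_url}/jobs/{job_id}/checkpoints/details/{checkpoint_id}"
    with urlopen(url) as resp:
        return json.load(resp)


def state_size_by_task(details):
    """Map each task (job vertex) ID to its reported state_size in bytes."""
    return {
        task_id: task.get("state_size", 0)
        for task_id, task in details.get("tasks", {}).items()
    }


def growth_by_task(older, newer):
    """Return (task_id, byte_delta) pairs, largest growth first.

    A task missing from the older checkpoint is treated as size 0.
    """
    old_sizes = state_size_by_task(older)
    new_sizes = state_size_by_task(newer)
    deltas = {t: new_sizes[t] - old_sizes.get(t, 0) for t in new_sizes}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)
```

Fetch the details for an early and a recent checkpoint, pass both to growth_by_task, and the task at the top of the list is the one whose state grew the most between them.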

Best
Yun Tang
________________________________
From: Kent Murra <ke...@remitly.com>
Sent: Saturday, April 18, 2020 1:47
To: user@flink.apache.org <user@flink.apache.org>
Subject: Checkpoint Space Usage Debugging

I'm looking into a situation where our checkpoint sizes are automatically 
growing over time.  I'm unable to pinpoint exactly why this is happening, and 
it would be great if there was a way to figure out how much checkpoint space is 
attributable to each operator so I can narrow it down.  Are there any tools or 
methods for introspecting the checkpoint data so that I can determine where the 
space is going?

The pipeline in question is consuming from Kinesis and batching up data using 
windows.  I suspected that I was doing something wrong with windowing, but I'm 
emitting FIRE_AND_PURGE and also setting a max end timestamp.  The Kinesis 
consumer is not emitting watermarks at the moment, but as far as I know that's
not necessary for proper checkpointing (only for exactly-once behavior).
