Yes. I do use RocksDB for (incremental) checkpointing. During each checkpoint 
15-20GB of state gets created (new state added, some expired). I make use of 
FIFO compaction.

I’m a bit surprised you were able to run with 10+TB state without unaligned 
checkpoints because the performance in my application degrades quite a lot. Can 
you share your checkpoint configurations?


Thanks,
Vishal
On 15 Nov 2022, 10:07 PM +0530, Yaroslav Tkachenko <yaros...@goldsky.com>, 
wrote:
> Hi Vishal,
>
> Just wanted to comment on this bit:
>
> > My job has very large amount of state (>100GB) and I have no option but to 
> > use unaligned checkpoints.
>
> I successfully ran Flink jobs with 10+ TB of state and no unaligned 
> checkpoints enabled. Usually, you consider enabling them when there is some 
> kind of skew in the topology, but IMO it's unrelated to the state size.
>
> > Reducing the checkpoint interval is not really an option given the size of 
> > the checkpoint
>
> Do you use RocksDB state backend with incremental checkpointing?
>
> > On Tue, Nov 15, 2022 at 12:07 AM Vishal Surana <vis...@moengage.com> wrote:
> > > I wanted to achieve exactly once semantics in my job and wanted to make 
> > > sure I understood the current behaviour correctly:
> > >
> > > 1. Only one Kafka transaction at a time (no concurrent checkpoints)
> > > 2. Only one transaction per checkpoint
> > >
> > >
> > > My job has very large amount of state (>100GB) and I have no option but 
> > > to use unaligned checkpoints. With the above limitation, it seems to me 
> > > that if checkpoint interval is 1 minute and checkpoint takes about 10 
> > > seconds to complete then only one Kafka transaction can happen in 70 
> > > seconds. All of the output records will not be visible until the 
> > > transaction completes. This way a steady stream of inputs will result in 
> > > an buffered output stream where data is only visible after a minute, 
> > > thereby destroying any sort of real time streaming use cases. Reducing 
> > > the checkpoint interval is not really an option given the size of the 
> > > checkpoint. Only way out would be to allow for multiple transactions per 
> > > checkpoint.
> > >
> > > Thanks,
> > > Vishal

Reply via email to