RocksDB does compact in the background, and incremental checkpoints simply
mirror to S3 the set of RocksDB SST files needed by the current set of
retained checkpoints.
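
For reference, here is a minimal sketch of how incremental checkpointing is
typically enabled (the S3 URI and checkpoint interval are placeholders, and
this uses the RocksDBStateBackend constructor from the pre-1.13 API):

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class IncrementalCheckpointJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();
            // The second constructor argument enables incremental
            // checkpointing, so each checkpoint uploads only SST files
            // created since the previous checkpoint.
            env.setStateBackend(
                    new RocksDBStateBackend("s3://my-bucket/checkpoints", true));
            env.enableCheckpointing(60_000); // checkpoint every 60 seconds
            // ... build and execute the job ...
        }
    }

With this setup, each checkpoint uploads only the new SST files, while S3
continues to hold every file still referenced by a retained checkpoint.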

However, unlike checkpoints, which can be incremental, savepoints are
always full snapshots, so triggering one means writing out the entire state,
not just the changes since the last checkpoint -- which is likely the spike
you are seeing on disk. As for why one host would have much more state than
the others, perhaps you have significant key skew, and one task manager is
ending up with more than its share of state to manage.
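
If you want to sanity-check the skew hypothesis, one rough approach is to
run a sample of your keys through the same key-group assignment Flink uses
internally and see how they spread across subtasks. This is just a
diagnostic sketch; sampleKeys, maxParallelism, and parallelism are
placeholders you would fill in from your job:

    import org.apache.flink.runtime.state.KeyGroupRangeAssignment;

    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    public class SkewCheck {
        public static void main(String[] args) {
            List<String> sampleKeys = List.of("user-1", "user-2", "user-3");
            int maxParallelism = 128; // your job's max parallelism
            int parallelism = 4;      // your job's actual parallelism

            // Count how many sampled keys land on each subtask, using
            // Flink's own key-group hashing.
            Map<Integer, Long> counts = new TreeMap<>();
            for (String key : sampleKeys) {
                int subtask = KeyGroupRangeAssignment
                        .assignKeyToParallelOperator(key, maxParallelism, parallelism);
                counts.merge(subtask, 1L, Long::sum);
            }
            counts.forEach((subtask, n) ->
                    System.out.println("subtask " + subtask + ": " + n + " keys"));
        }
    }

A heavily lopsided distribution here, or a few very hot keys carrying most
of the state, would line up with one task manager holding far more state
than the others.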

Best,
David

On Sat, Dec 12, 2020 at 12:31 AM Rex Fenley <r...@remind101.com> wrote:

> Hi,
>
> We're using the Rocks state backend with incremental checkpoints and
> savepoints setup for S3. We notice that every time we trigger a savepoint,
> one of the local disks on our host explodes in disk usage.
> What is it that savepoints are doing which would cause so much disk to be
> used?
> Our checkpoints are a few GiB in size; is the savepoint combining all the
> checkpoints together at once on disk?
> I figured that incremental checkpoints would compact over time in the
> background; is that correct?
>
> Thanks
>
> Graph here. Parallelism is 1 and volume size is 256 GiB.
> [image: Screen Shot 2020-12-11 at 2.59.59 PM.png]
>
>
> --
>
> Rex Fenley  |  Software Engineer - Mobile and Backend
>
>
>
