Hi Vishal,

I am not sure I understand what you mean by question #2:
>2. SST files get created each time a checkpoint is triggered. At this
point, does the data for a given key get merged in case the initial data
was read from an SST file while the update must have happened in memory?

Could you maybe rephrase it?

Upon the arrival of a checkpoint barrier, Flink flushes all memtables to
SST files synchronously (so any state updates are briefly blocked).
Compaction runs in background thread(s), and Flink deals with the outcome
of RocksDB compaction after the fact. For non-incremental checkpoints,
everything is straightforward: all SST files are copied over. The interplay
between incremental checkpoints and compaction is described in this blog
post [1].

[1]
https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html
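
To make the difference between the two checkpoint modes a bit more
concrete, here is a rough sketch. It is a toy model only: plain Python sets
stand in for the SST files of one RocksDB instance and for the files already
in checkpoint storage. The function names and the "sst-N" file names are
made up for illustration and are not Flink or RocksDB APIs.

```python
# Toy model only: sets of file names stand in for RocksDB SST files and
# for files already held in checkpoint storage. All names are hypothetical.

def full_checkpoint(live_sst_files):
    """Non-incremental: every live SST file is copied over."""
    return set(live_sst_files)

def incremental_checkpoint(live_sst_files, already_uploaded):
    """Incremental: only SST files not uploaded before are copied."""
    return set(live_sst_files) - set(already_uploaded)

uploaded = set()

# Checkpoint 1: memtable flushes have produced sst-1 and sst-2.
live = {"sst-1", "sst-2"}
cp1_upload = incremental_checkpoint(live, uploaded)
uploaded |= cp1_upload

# Checkpoint 2: another flush added sst-3; older files are unchanged.
live = {"sst-1", "sst-2", "sst-3"}
cp2_upload = incremental_checkpoint(live, uploaded)
uploaded |= cp2_upload

# Background compaction merged sst-1..sst-3 into sst-4, then a flush at
# checkpoint 3 added sst-5. The merged file counts as new.
live = {"sst-4", "sst-5"}
cp3_upload = incremental_checkpoint(live, uploaded)
cp3_full = full_checkpoint(live)

print(sorted(cp1_upload))
print(sorted(cp2_upload))
print(sorted(cp3_upload))
print(sorted(cp3_full))
```

In this sketch the file produced by compaction is treated like any other
new file, so it is uploaded again at the next incremental checkpoint even
though its contents were already covered by earlier uploads.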

Best,
Alexander Fedulov

On Mon, Jul 4, 2022 at 4:25 PM Vishal Surana <vis...@moengage.com> wrote:

> In my load tests, I've found FIFO compaction to offer the best performance
> as my job needs state only for so long. However, this particular statement
> in RocksDB documentation concerns me:
>
> "Since we never rewrite the key-value pair, we also don't ever apply the
> compaction filter on the keys."
>
> This is from -
> https://github.com/facebook/rocksdb/wiki/FIFO-compaction-style
>
> I've observed that SST files are getting compacted into larger SST files
> until a configured threshold is reached. Thus I'm not sure what's going on
> anymore.
>
> My questions at this stage are:
>
>    1. If I read a value from RocksDB and then write an updated value
>    back, will that work with FIFO compaction?
>    2. SST files get created each time a checkpoint is triggered. At
>    this point, does the data for a given key get merged in case the initial
>    data was read from an SST file while the update must have happened in
>    memory?
>    3. If the answer to above is yes, then I suppose I can use FIFO
>    compaction. However, then my question is whether the RocksDB documentation
>    is wrong or whether Flink is doing something in addition to what RocksDB
>    does.
>
> Thanks!
>
