Hi Chen

From your description, I understand that you call keyedState.clear() to clean
up keys that have not been seen for several minutes.

  *   For HeapKeyedStateBackend, clear() removes the related entries from
memory immediately, so there is no concern about an ever-increasing checkpoint size.
  *   For RocksDBKeyedStateBackend, clear() records a delete operation for the
key bytes in the DB, but the actual removal (freeing the space occupied by the
deleted key) generally happens only when compaction runs. In other words, after
calling keyedState.clear() for the current key, you should not expect the
checkpoint size to decrease immediately, but it will eventually decrease because
RocksDB runs compactions continuously. If you are still concerned about this,
consider increasing the number of background compaction threads for RocksDB by
calling DBOptions.setMaxBackgroundCompactions or DBOptions.setIncreaseParallelism.
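As a rough sketch, the compaction tuning above can be wired into the RocksDB
state backend through an OptionsFactory (the class name and the thread count 4
below are illustrative, not recommendations):

```java
import org.apache.flink.contrib.streaming.state.OptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

// Illustrative options factory: raises the number of background
// compaction threads so that tombstones left behind by clear()
// are reclaimed sooner.
public class MoreCompactionThreadsFactory implements OptionsFactory {

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions) {
        // 4 is just an example value; size it to your machine.
        return currentOptions.setMaxBackgroundCompactions(4);
        // Alternatively: currentOptions.setIncreaseParallelism(4);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
        return currentOptions;
    }
}
```

You would then register it on the backend, e.g.
rocksDBStateBackend.setOptions(new MoreCompactionThreadsFactory()).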

Best,
Yun
________________________________
From: burgesschen <tchen...@bloomberg.net>
Sent: Monday, July 16, 2018 23:57
To: user@flink.apache.org
Subject: Ever increasing key space

Hi everyone,

We are building a Flink job that keys on a dynamic value. Only a few events
share the same key, and events with new keys are consumed constantly.

For each key, some keyed state is created the first time the key is seen, and
we clean up that keyed state with a timer if the key has not been seen for X
minutes.
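For the archive, a minimal sketch of this timer-based cleanup, assuming a
KeyedProcessFunction (the class name, state field, and 10-minute TTL are
illustrative):

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Illustrative sketch: keyed state is created the first time a key is
// seen and cleared once the key has been idle for "X minutes".
public class IdleKeyCleanup extends KeyedProcessFunction<String, String, String> {

    private static final long IDLE_MS = 10 * 60 * 1000L; // "X minutes"

    private transient ValueState<Long> lastSeen;

    @Override
    public void open(Configuration parameters) {
        lastSeen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("lastSeen", Long.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out)
            throws Exception {
        // (Re)arm a cleanup timer every time the key is seen.
        long now = ctx.timerService().currentProcessingTime();
        lastSeen.update(now);
        ctx.timerService().registerProcessingTimeTimer(now + IDLE_MS);
        out.collect(value);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out)
            throws Exception {
        Long seen = lastSeen.value();
        // Only clear if the key really stayed idle for the full window;
        // a newer event would have re-armed a later timer.
        if (seen != null && timestamp >= seen + IDLE_MS) {
            lastSeen.clear();
        }
    }
}
```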

My question is:
If the key space is ever increasing, does that result in an ever-increasing
checkpoint size even though I clean up the keyed state?

Thank you!


Best,
-Chen



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
