[ 
https://issues.apache.org/jira/browse/FLINK-9070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16415609#comment-16415609
 ] 

Truong Duc Kien commented on FLINK-9070:
----------------------------------------

So here's my idea for RocksDBMapState.clear() with DeleteRange.

[https://github.com/dikei/flink/commit/dc887af5b48e8f9d2f1e1f5b612efd2ebae5a801]

It is based on the assumption that every key Ki with prefix P satisfies 
P < Ki < Q, where Q is the byte array of the same length as P that immediately 
follows P lexicographically, so no byte array of that length lies between 
them. Therefore, we can safely call deleteRange(P, Q) to delete all Ki without 
touching anything else.
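
As an illustration only, and not the code from the linked commit, here is a minimal sketch of how Q could be derived from P. It treats P as a big-endian unsigned number and adds one. The class and method names (PrefixRange, nextPrefix, compareUnsigned) are hypothetical, and the sketch assumes keys are compared byte-wise as unsigned values. If P consists only of 0xFF bytes, no Q of the same length exists, and the sketch returns null so the caller can fall back to iteration.

```java
import java.util.Arrays;

public class PrefixRange {

    /**
     * Returns the byte array of the same length that immediately follows
     * the prefix in unsigned lexicographic order, or null if the prefix is
     * empty or made up entirely of 0xFF bytes (no such successor exists).
     */
    static byte[] nextPrefix(byte[] prefix) {
        byte[] end = Arrays.copyOf(prefix, prefix.length);
        // Add one to the array as if it were a big-endian unsigned integer.
        for (int i = end.length - 1; i >= 0; i--) {
            if (end[i] != (byte) 0xFF) {
                end[i]++;
                return end;
            }
            end[i] = 0; // carry into the next more significant byte
        }
        return null; // overflow: the prefix has no same-length successor
    }

    /** Byte-wise unsigned lexicographic comparison; a shorter array sorts first on a tie. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] p = {0x01, (byte) 0xFF};
        byte[] q = nextPrefix(p);
        byte[] key = {0x01, (byte) 0xFF, 0x42}; // a key that starts with P

        // A key with prefix P falls inside the range bounded by P and Q.
        boolean inRange = compareUnsigned(p, key) < 0 && compareUnsigned(key, q) < 0;
        System.out.println("Q = " + Arrays.toString(q) + ", key in range: " + inRange);
    }
}
```

Since the end of a RocksDB range deletion is exclusive, a key equal to Q itself, which belongs to the next prefix, is left untouched.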

> Improve performance of RocksDBMapState.clear()
> ----------------------------------------------
>
>                 Key: FLINK-9070
>                 URL: https://issues.apache.org/jira/browse/FLINK-9070
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.6.0
>            Reporter: Truong Duc Kien
>            Priority: Minor
>
> Currently, RocksDBMapState.clear() is implemented by iterating over all the 
> keys and dropping them one by one. This iteration can be quite slow with: 
>  * Large maps
>  * High-churn maps with a lot of tombstones
> There are a few methods to speed up deletion for a range of keys, each with 
> its own caveats:
>  * DeleteRange: still experimental, likely buggy
>  * DeleteFilesInRange + CompactRange: only good for large ranges
>  
> Flink can also keep a list of inserted keys in memory, then directly delete 
> them without having to iterate over the RocksDB database again. 
>  
> Reference:
>  * [RocksDB article about range 
> deletion|https://github.com/facebook/rocksdb/wiki/Delete-A-Range-Of-Keys]
>  * [Bug in DeleteRange|https://pingcap.com/blog/2017-09-08-rocksdbbug]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
