[jira] [Commented] (FLINK-29913) Shared state would be discarded by mistake when maxConcurrentCheckpoint>1

Feifan Wang (Jira) Tue, 23 May 2023 05:22:25 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-29913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725387#comment-17725387
 ]


Feifan Wang commented on FLINK-29913:
-------------------------------------

One overhead I can see is the extra memory needed to store the next 
pointer. On a 64-bit system this is about 7.63 MB for every one million 
entries (8 bytes per pointer), which I think is acceptable. Is there any 
other runtime overhead I missed?
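
The estimate above can be checked with a small calculation. This is only a 
sketch: the class and method names are hypothetical, and the 8-byte figure 
assumes uncompressed 64-bit references (a JVM running with compressed 
ordinary object pointers would use 4 bytes per reference instead).

```java
import java.util.Locale;

class NextPointerOverhead {
    /** Extra bytes needed when every entry stores one additional reference. */
    static long extraBytes(long entries, int bytesPerReference) {
        return entries * bytesPerReference;
    }

    /** Converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes). */
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // Assumption: 8 bytes per reference, i.e. no compressed pointers.
        long bytes = extraBytes(1_000_000L, 8);
        System.out.println(
                String.format(Locale.ROOT, "%d bytes = %.2f MiB", bytes, toMiB(bytes)));
    }
}
```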

As for the complexity, this approach does add linked-list operations to the 
_registerReference()_ and _unregisterUnusedState()_ methods. But given that 
it is easy to implement and the implementation is cohesive, I think the 
complexity is acceptable.
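
To make the linked-list idea concrete, here is a toy model of what I have in 
mind. It is not Flink's SharedStateRegistry: the class, the String-typed 
keys and handles, and the simplified method signatures are all assumptions 
for illustration. Handles registered under the same key are chained through 
a next pointer instead of replacing each other, and a handle is discarded 
only once no retained checkpoint uses it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class ToySharedRegistry {
    /** One registered handle; next chains other handles under the same key. */
    static final class Entry {
        final String handle;
        long lastUsedCheckpointId;
        Entry next;

        Entry(String handle, long checkpointId) {
            this.handle = handle;
            this.lastUsedCheckpointId = checkpointId;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Registers a handle; a different handle under the same key is chained, not replaced. */
    void registerReference(String key, String handle, long checkpointId) {
        for (Entry e = entries.get(key); e != null; e = e.next) {
            if (e.handle.equals(handle)) {
                e.lastUsedCheckpointId = Math.max(e.lastUsedCheckpointId, checkpointId);
                return;
            }
        }
        Entry head = new Entry(handle, checkpointId);
        head.next = entries.get(key); // prepend to the existing chain
        entries.put(key, head);
    }

    /** Drops handles last used before lowestCheckpointId and returns them. */
    List<String> unregisterUnusedState(long lowestCheckpointId) {
        List<String> discarded = new ArrayList<>();
        Iterator<Map.Entry<String, Entry>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Entry> slot = it.next();
            Entry kept = null; // head of the surviving chain
            Entry tail = null; // tail of the surviving chain
            Entry e = slot.getValue();
            while (e != null) {
                Entry following = e.next;
                e.next = null;
                if (e.lastUsedCheckpointId < lowestCheckpointId) {
                    discarded.add(e.handle);
                } else if (kept == null) {
                    kept = e;
                    tail = e;
                } else {
                    tail.next = e;
                    tail = e;
                }
                e = following;
            }
            if (kept == null) {
                it.remove();
            } else {
                slot.setValue(kept);
            }
        }
        return discarded;
    }

    public static void main(String[] args) {
        ToySharedRegistry registry = new ToySharedRegistry();
        // Two concurrent checkpoints upload different files under the same name.
        registry.registerReference("000042.sst", "handle-from-cp1", 1L);
        registry.registerReference("000042.sst", "handle-from-cp2", 2L);
        // Only the handle that no retained checkpoint uses is discarded.
        System.out.println(registry.unregisterUnusedState(2L));
    }
}
```

The cost is one extra reference per entry plus a short chain walk in both 
methods; in the common case the chain has a single element.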

 

Just to clarify, I think using a unique ID is also a valid approach, but I 
want to learn how you would choose between the two. Further, regarding the 
approach of using a unique registry key, I agree with [~klion26]: we can 
simply choose a stable registry key generation method based on the remote 
file name (such as the MD5 digest of the remote file name), which can replace 
IncrementalRemoteKeyedStateHandle#createSharedStateRegistryKeyFromFileName(). 
The mapping from local sst file name to StreamStateHandle never changes, so 
the RocksDB recovery part does not need to change.
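
A stable key derived from the remote file name could look roughly like the 
sketch below. The class name, the method name, and the example path are 
hypothetical, and the result is returned as a plain hex string rather than 
Flink's registry key type; only the JDK's MessageDigest API is used.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StableRegistryKey {
    /** Returns the MD5 digest of the remote file name as 32 lowercase hex characters. */
    static String fromRemoteFileName(String remoteFileName) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(remoteFileName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical remote path, for illustration only.
        String remote = "hdfs://namenode/flink/checkpoints/shared/example-file";
        System.out.println(fromRemoteFileName(remote));
    }
}
```

Because the key depends only on the remote file name, two different uploads 
never share a key, while the same remote file always maps to the same key 
across checkpoints and restarts.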

 

Whichever approach is chosen, I am happy to implement it. Can you assign 
this ticket to me, [~roman]? Looking forward to hearing from you.

> Shared state would be discarded by mistake when maxConcurrentCheckpoint>1
> -------------------------------------------------------------------------
>
>                 Key: FLINK-29913
>                 URL: https://issues.apache.org/jira/browse/FLINK-29913
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.15.0, 1.16.0
>            Reporter: Yanfei Lei
>            Priority: Minor
>
> When maxConcurrentCheckpoint>1, the shared state of the incremental RocksDB 
> state backend can be discarded when a handle with the same name is 
> registered. See 
> [https://github.com/apache/flink/pull/21050#discussion_r1011061072]
> cc [~roman] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
