[ https://issues.apache.org/jira/browse/FLINK-29913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725418#comment-17725418 ]
Roman Khachatryan commented on FLINK-29913: ------------------------------------------- {quote}On a 64-bit system, about 7.63MB more memory will be used for every one million entries Is there any other runtime overhead I missed ? {quote} I was more concerned about the additional time required to traverse the (mostly single-element) lists. When a checkpoint is subsumed, *all* entries need to be scanned. Adding pointer dereference(s) might break any optimizations that JVM and CPU would otherwise employ. {quote}As for the complexity, this approach will indeed increase the operation of the linked list in the registerReference() method and unregisterUnusedState() method. But given that this is easy to implement, and the implementation is cohesive, I think the complexity is acceptable. {quote} In my view, simplicity in this is part is worth the efforts. The problem that SharedStateRegistry solves is already tricky, and we shouldn't complicate it further (higher complexity potentially leads to more bugs and more maintanance efforts). {quote}Further, regarding the approach of using unique registry key, I agree with Congxian Qiu , we can just choose a stable register key generation method based on remote file name (such as use md5 digest of remote file name) , which can replace of IncrementalRemoteKeyedStateHandle#createSharedStateRegistryKeyFromFileName() . The mapping of local sst file name to StreamStateHandle never changed , so the part of RocksDB recovery does not need to be changed. {quote} I don't fully understand what does "never changed" means here. [Here|https://github.com/apache/flink/blob/fbf7b91424ec626ae56dd2477347a7759db6d5fe/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBStateDownloader.java#L105], the ID is used to create local path. If we change ID to remote path, local will change too. Per my understanding, we can not change the local path without updating metadata files. Or am I missing something? {quote}Whichever approach will be chosen, I am happy to implement it. Can you assign this ticket to me Roman Khachatryan ? looking forward to hearing from you. {quote} Sure, thanks for volounteering! > Shared state would be discarded by mistake when maxConcurrentCheckpoint>1 > ------------------------------------------------------------------------- > > Key: FLINK-29913 > URL: https://issues.apache.org/jira/browse/FLINK-29913 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing > Affects Versions: 1.15.0, 1.16.0 > Reporter: Yanfei Lei > Priority: Minor > > When maxConcurrentCheckpoint>1, the shared state of Incremental rocksdb state > backend would be discarded by registering the same name handle. See > [https://github.com/apache/flink/pull/21050#discussion_r1011061072] > cc [~roman] -- This message was sent by Atlassian Jira (v8.20.10#820010)