[ 
https://issues.apache.org/jira/browse/FLINK-29913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725418#comment-17725418
 ] 

Roman Khachatryan commented on FLINK-29913:
-------------------------------------------

{quote}On a 64-bit system, about 7.63MB more memory will be used for every one 
million entries
Is there any other runtime overhead I missed ?
{quote}
I was more concerned about the additional time required to traverse the (mostly 
single-element) lists. When a checkpoint is subsumed, *all* entries need to be 
scanned. Adding pointer dereference(s) might break any optimizations that JVM 
and CPU would otherwise employ.

 
{quote}As for the complexity, this approach will indeed increase the operation 
of the linked list in the registerReference() method and 
unregisterUnusedState() method.
But given that this is easy to implement, and the implementation is cohesive, I 
think the complexity is acceptable.
{quote}
In my view, simplicity in this is part is worth the efforts. The problem that 
SharedStateRegistry solves is already tricky, and we shouldn't complicate it 
further (higher complexity potentially leads to more bugs and more maintanance 
efforts). 

 
{quote}Further, regarding the approach of using unique registry key, I agree 
with Congxian Qiu , we can just choose a stable register key generation method 
based on remote file name (such as use md5 digest of remote file name) , which 
can replace of 
IncrementalRemoteKeyedStateHandle#createSharedStateRegistryKeyFromFileName() .
The mapping of local sst file name to StreamStateHandle never changed , so the 
part of RocksDB recovery does not need to be changed.
{quote}
I don't fully understand what does "never changed" means here.
[Here|https://github.com/apache/flink/blob/fbf7b91424ec626ae56dd2477347a7759db6d5fe/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBStateDownloader.java#L105],
 the ID is used to create local path. If we change ID to remote path, local 
will change too. Per my understanding, we can not change the local path without 
updating metadata files.
Or am I missing something?

 
{quote}Whichever approach will be chosen, I am happy to implement it. Can you 
assign this ticket to me Roman Khachatryan ? looking forward to hearing from 
you.
{quote}
Sure, thanks for volounteering!

> Shared state would be discarded by mistake when maxConcurrentCheckpoint>1
> -------------------------------------------------------------------------
>
>                 Key: FLINK-29913
>                 URL: https://issues.apache.org/jira/browse/FLINK-29913
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.15.0, 1.16.0
>            Reporter: Yanfei Lei
>            Priority: Minor
>
> When maxConcurrentCheckpoint>1, the shared state of Incremental rocksdb state 
> backend would be discarded by registering the same name handle. See 
> [https://github.com/apache/flink/pull/21050#discussion_r1011061072]
> cc [~roman] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to