[ https://issues.apache.org/jira/browse/FLINK-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529826#comment-17529826 ]
Roman Khachatryan commented on FLINK-27155:
-------------------------------------------

From my perspective,
# by timer from the 1st access - can be unstable
# reference count per file - without timer, can clear cache too early
# switching all tasks of a job in a TM to RUNNING - same, plus requires binding the caching component to TaskExecutor
# switching all tasks of a job in all TMs to RUNNING - requires a new type of notification from JM to TM, and probably still not enough with task-local recovery

I'm leaning toward ref. counting + additional delay.

> Reduce multiple reads to the same Changelog file in the same taskmanager during restore
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-27155
>                 URL: https://issues.apache.org/jira/browse/FLINK-27155
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>            Reporter: Feifan Wang
>            Assignee: Feifan Wang
>            Priority: Major
>             Fix For: 1.16.0
>
>
> h3. Background
> In the current implementation, state changes of different operators in the
> same taskmanager may be written to the same changelog file, which effectively
> reduces the number of files and requests to DFS.
> On the other hand, the current implementation also reads the same
> changelog file multiple times on recovery. More specifically, the number of
> times the same changelog file is accessed is related to the number of
> ChangeSets contained in it. And since each read needs to skip the preceding
> bytes, this network traffic is also wasted.
> The result is a lot of unnecessary requests to DFS when there are multiple
> slots and keyed state in the same taskmanager.
> h3. Proposal
> We can reduce multiple reads to the same changelog file in the same
> taskmanager during restore.
> One possible approach is to read the changelog file all at once and cache it
> in memory or in a local file for a period of time.
> I think this could be a subtask of [v2 FLIP-158: Generalized incremental
> checkpoints|https://issues.apache.org/jira/browse/FLINK-25842].
> Hi [~ym], [~roman], what do you think?
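
For illustration only, the "ref. counting + additional delay" option from the comment could look roughly like the sketch below. This is not Flink code: the class name `RefCountedChangelogCache`, its methods, the `loader` function (standing in for a single full read of a changelog file from DFS), and the injected clock are all hypothetical names chosen for the example. Eviction is triggered by an explicit call rather than a real timer, to keep the sketch small.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a per-file ref-counted cache with delayed eviction. */
final class RefCountedChangelogCache {

    private static final class Entry {
        final byte[] data;
        int refCount;
        long releasedAtMillis = -1; // -1 while at least one reader holds the entry

        Entry(byte[] data) {
            this.data = data;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final Function<String, byte[]> loader; // assumed: reads the whole file once
    private final LongSupplier clock;              // assumed: current time in millis
    private final long delayMillis;                // extra time to keep unreferenced files
    private int loads;

    RefCountedChangelogCache(
            Function<String, byte[]> loader, LongSupplier clock, long delayMillis) {
        this.loader = loader;
        this.clock = clock;
        this.delayMillis = delayMillis;
    }

    /** Returns the cached file content, loading it only if it is not cached yet. */
    synchronized byte[] acquire(String path) {
        Entry e = entries.get(path);
        if (e == null) {
            e = new Entry(loader.apply(path));
            entries.put(path, e);
            loads++;
        }
        e.refCount++;
        e.releasedAtMillis = -1;
        return e.data;
    }

    /** Drops one reference; the entry stays cached until the delay has passed. */
    synchronized void release(String path) {
        Entry e = entries.get(path);
        if (e == null || e.refCount == 0) {
            return;
        }
        if (--e.refCount == 0) {
            e.releasedAtMillis = clock.getAsLong();
        }
    }

    /** Removes entries that have had no readers for at least delayMillis. */
    synchronized void evictExpired() {
        long now = clock.getAsLong();
        entries.values()
                .removeIf(e -> e.refCount == 0 && now - e.releasedAtMillis >= delayMillis);
    }

    synchronized int loadCount() {
        return loads;
    }

    synchronized boolean isCached(String path) {
        return entries.containsKey(path);
    }
}
```

In this sketch the delay covers the gap between one task releasing a file and another task in the same taskmanager starting to restore from it, which is the case where plain ref. counting would clear the cache too early. A periodic task would be expected to call `evictExpired()`.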