[ 
https://issues.apache.org/jira/browse/FLINK-25458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan updated FLINK-25458:
--------------------------------------
        Parent: FLINK-25842
    Issue Type: Sub-task  (was: Technical Debt)

> Support local recovery
> ----------------------
>
>                 Key: FLINK-25458
>                 URL: https://issues.apache.org/jira/browse/FLINK-25458
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Checkpointing, Runtime / State Backends
>            Reporter: Yun Tang
>            Priority: Major
>
> Currently, the changelog state backend doesn't support local recovery, so 
> recovery times might be sub-optimal.
>  
> Materialized state issues:
> Periodic materialization currently calls the state backend's snapshot method 
> with a materialization id. However, local state management relies on the 
> checkpoint id for storing, confirming, and discarding state. This mismatch 
> between the two ids would break local recovery.
>  
> Non-materialized state issues:
>  * non-materialized state (i.e. changelog) is shared across checkpoints, and 
> therefore needs some tracking (in TM or hard-linking in FS)
>  * the writer does not enforce a boundary between checkpoints (when writing 
> to DFS); if the local stream simply duplicates the DFS stream, then it would 
> break on cleanup
>  * files can be shared across tasks, which will also break on cleanup
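
One way to approach the tracking mentioned in the list above is per-file reference counting on the TaskManager side. The following is only a minimal sketch in plain Java, not Flink code: the class LocalChangelogRegistry and its methods register and discard are hypothetical names, and local files are represented as plain strings. A file is reported as safe to delete only once no registered checkpoint refers to it, which is the property that naive per-checkpoint cleanup lacks.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not a Flink class): reference-counts local changelog files. */
class LocalChangelogRegistry {
    // checkpoint id -> local files that this checkpoint refers to
    private final Map<Long, Set<String>> filesByCheckpoint = new HashMap<>();
    // local file -> number of registered checkpoints that still refer to it
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** Records that the given checkpoint refers to the given local files. */
    void register(long checkpointId, Set<String> files) {
        Set<String> known =
                filesByCheckpoint.computeIfAbsent(checkpointId, id -> new HashSet<>());
        for (String file : files) {
            // Count each (checkpoint, file) pair only once.
            if (known.add(file)) {
                refCounts.merge(file, 1, Integer::sum);
            }
        }
    }

    /**
     * Forgets a checkpoint and returns the files that no remaining checkpoint
     * refers to; only those are safe to delete locally.
     */
    Set<String> discard(long checkpointId) {
        Set<String> unused = new HashSet<>();
        Set<String> files = filesByCheckpoint.remove(checkpointId);
        if (files == null) {
            return unused; // unknown or already discarded checkpoint
        }
        for (String file : files) {
            // Returning null from the remapping function removes the entry.
            Integer remaining =
                    refCounts.computeIfPresent(file, (f, count) -> count == 1 ? null : count - 1);
            if (remaining == null) {
                unused.add(file);
            }
        }
        return unused;
    }
}
```

In this sketch, discarding an older checkpoint leaves any file that a newer checkpoint still uses in place. Files shared across tasks would need the same registry to be shared at the TaskManager level rather than kept per task, which is an assumption beyond what the issue specifies.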



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
