[ 
https://issues.apache.org/jira/browse/FLINK-22684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346027#comment-17346027
 ] 

Anton Kalashnikov edited comment on FLINK-22684 at 5/17/21, 9:54 AM:
---------------------------------------------------------------------

[~pnowojski], [~roman_khachatryan] A couple of questions:
 * Should the new parameter be part of CheckpointConfig or SavepointConfig? 
According to the initial problem, it should be SavepointConfig, but I see a 
naming problem here: a savepoint in fact doesn't contain any in-flight data, so 
it would be strange for SavepointConfig to have an ignoreInFlightData property.
 * Should this new property be something more complicated than just a boolean? 
For example, it could be a richer property that allows ignoring in-flight 
data only for a specific operator/subtask. Initially, though, we could 
implement only two options: NONE and ALL.
 * How expensive is it in general to load the metadata of the in-flight data? 
Initially, I thought it would make sense to load all the metadata as usual 
and then, inside the CheckpointCoordinator, do some transformations as 
needed. But now I think that might be expensive, and it might be better to 
move this logic deeper and not even load the metadata from storage.
 * Should we provide the ability to ignore in-flight data when the job 
restarts after a failure? Or should it only be possible during a manual start 
from the recovery directory?
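
To make the NONE/ALL idea and the "load everything, then transform inside the coordinator" variant more concrete, here is a minimal sketch. All names in it (IgnoreInFlightData, SubtaskSnapshot, applyOnRecovery) are hypothetical and only for illustration; they are not existing Flink classes, and the real checkpoint metadata is more involved than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these types are real Flink classes.
public class InFlightDataSketch {

    // Hypothetical option from the second bullet: start with NONE and ALL only.
    enum IgnoreInFlightData { NONE, ALL }

    // Simplified stand-in for the checkpoint metadata of one subtask.
    static final class SubtaskSnapshot {
        final String operatorId;
        final String stateHandle;           // regular operator/keyed state
        final List<String> inFlightHandles; // channel data persisted by an unaligned checkpoint

        SubtaskSnapshot(String operatorId, String stateHandle, List<String> inFlightHandles) {
            this.operatorId = operatorId;
            this.stateHandle = stateHandle;
            this.inFlightHandles = inFlightHandles;
        }
    }

    // Returns the snapshots to restore from. With ALL, every in-flight handle is
    // dropped while the regular state handle is kept untouched.
    static List<SubtaskSnapshot> applyOnRecovery(
            List<SubtaskSnapshot> restored, IgnoreInFlightData mode) {
        if (mode == IgnoreInFlightData.NONE) {
            return restored;
        }
        List<SubtaskSnapshot> result = new ArrayList<>();
        for (SubtaskSnapshot s : restored) {
            result.add(new SubtaskSnapshot(
                    s.operatorId, s.stateHandle, Collections.<String>emptyList()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<SubtaskSnapshot> restored = Arrays.asList(
                new SubtaskSnapshot("op-1", "state-1", Arrays.asList("channel-0", "channel-1")));
        List<SubtaskSnapshot> cleaned = applyOnRecovery(restored, IgnoreInFlightData.ALL);
        System.out.println(cleaned.get(0).stateHandle + " " + cleaned.get(0).inFlightHandles.size());
    }
}
```

Note that this sketch filters after the metadata has already been loaded. The alternative from the third bullet, not loading the in-flight metadata from storage at all, would have to live deeper than this and is not shown here. A per-operator/subtask variant could later replace the enum with a predicate on operatorId.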



> Add the ability to ignore in-flight data on recovery
> ----------------------------------------------------
>
>                 Key: FLINK-22684
>                 URL: https://issues.apache.org/jira/browse/FLINK-22684
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Anton Kalashnikov
>            Priority: Major
>
> The main case:
>  * We want to restore the last unaligned checkpoint.
>  * In-flight data of this checkpoint is corrupted.
>  * We want to ignore this corrupted data and restore only the state.
> The idea is to add a new configuration parameter ('ignoreInFlightDataOnRecovery' 
> or similar). If it is set to true, the metadata of the in-flight data is 
> ignored on the Checkpoint Coordinator side.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
