[jira] [Updated] (FLINK-23553) Trigger global failover for synchronous savepoints

Dawid Wysakowicz (Jira) Mon, 02 Aug 2021 08:01:40 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-23553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Dawid Wysakowicz updated FLINK-23553:
-------------------------------------
    Description: 
We should trigger a global job failover if a {{stop-with-savepoint 
--drain}} fails.

The situation is clear-cut in drain mode. If the savepoint fails, we cannot 
continue, because we have already flushed all data and prepared the state for 
finishing. We cannot simply resume processing records.

It is more debatable without drain, where we could theoretically continue 
processing records. However, unifying the behavior of the two modes is also a 
good approach.

We can issue a global failover on the {{CheckpointCoordinator}}.

  was:
We should trigger a global job failover if a {{stop-with-savepoint 
--drain}} fails.

The situation is clear-cut in drain mode. If the savepoint fails, we cannot 
continue, because we have already flushed all data and prepared the state for 
finishing. We cannot simply resume processing records.

It is more debatable without drain, where we could theoretically continue 
processing records. However, unifying the behavior of the two modes is also a 
good approach.


> Trigger global failover for synchronous savepoints
> --------------------------------------------------
>
>                 Key: FLINK-23553
>                 URL: https://issues.apache.org/jira/browse/FLINK-23553
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.11.3, 1.13.1, 1.12.4
>            Reporter: Dawid Wysakowicz
>            Priority: Major
>             Fix For: 1.14.0
>
>
> We should trigger a global job failover if a {{stop-with-savepoint 
> --drain}} fails.
> The situation is clear-cut in drain mode. If the savepoint fails, we cannot 
> continue, because we have already flushed all data and prepared the state 
> for finishing. We cannot simply resume processing records.
> It is more debatable without drain, where we could theoretically continue 
> processing records. However, unifying the behavior of the two modes is also 
> a good approach.
> We can issue a global failover on the {{CheckpointCoordinator}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
