Feifan Wang created FLINK-24384:
-----------------------------------

             Summary: Count checkpoints failed in trigger phase into numberOfFailedCheckpoints
                 Key: FLINK-24384
                 URL: https://issues.apache.org/jira/browse/FLINK-24384
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Checkpointing
            Reporter: Feifan Wang
h1. *Problem*

In the current implementation, checkpoints that fail in the trigger phase are not counted in the metric 'numberOfFailedCheckpoints'. As a result, users cannot tell from this metric that checkpointing has stopped.

Users can instead alert on a rule such as _*'numberOfCompletedCheckpoints' has not increased in the past few minutes*_ (perhaps checkpoint interval + timeout), but I think this is roundabout and does not alert in a timely manner.

h1. *Proposal*

As the title says, count checkpoints that fail in the trigger phase in 'numberOfFailedCheckpoints'.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)