[ 
https://issues.apache.org/jira/browse/FLINK-30558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-30558:
----------------------------
    Fix Version/s: 1.16.1

> The metric 'numRestarts' reported in SchedulerBase will be overridden by 
> metric 'fullRestarts'
> ----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-30558
>                 URL: https://issues.apache.org/jira/browse/FLINK-30558
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Metrics
>    Affects Versions: 1.17.0
>            Reporter: xingbe
>            Assignee: xingbe
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.17.0, 1.16.1
>
>
> The method SchedulerBase#registerJobMetrics registers the metrics 'numRestarts' 
> and 'fullRestarts' with the same metric object. As discussed in FLINK-30246, 
> this results in the loss of the metric 'numRestarts'.
> {code:java}
> metrics.gauge(MetricNames.NUM_RESTARTS, numberOfRestarts); 
> metrics.gauge(MetricNames.FULL_RESTARTS, numberOfRestarts);{code}
> I have verified this problem via the REST API /jobs/:jobid/metrics. In the 
> response shown below, the metric 'numRestarts' is missing.
> {noformat}
> [{"id":"numberOfFailedCheckpoints"},{"id":"cancellingTime"},{"id":"lastCheckpointSize"},{"id":"totalNumberOfCheckpoints"},{"id":"lastCheckpointExternalPath"},{"id":"lastCheckpointRestoreTimestamp"},{"id":"failingTime"},{"id":"runningTime"},{"id":"uptime"},{"id":"restartingTime"},{"id":"initializingTime"},{"id":"numberOfInProgressCheckpoints"},{"id":"downtime"},{"id":"lastCheckpointProcessedData"},{"id":"numberOfCompletedCheckpoints"},{"id":"deployingTime"},{"id":"lastCheckpointFullSize"},{"id":"fullRestarts"},{"id":"createdTime"},{"id":"lastCheckpointDuration"},{"id":"lastCheckpointPersistedData"}]{noformat}
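
The effect described above can be reproduced in isolation. The following is a minimal sketch, not Flink code: ToyRegistry and SharedGaugeDemo are hypothetical names, and the sketch assumes that the registry keeps one name per metric object, which is what the description and FLINK-30246 suggest. Under that assumption, registering the same object a second time replaces the first name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a metric registry; these are not Flink's classes.
class SharedGaugeDemo {

    // Assumption: the registry maps each metric object to a single name.
    static final class ToyRegistry {
        private final Map<Supplier<Long>, String> namesByGauge = new LinkedHashMap<>();

        void gauge(String name, Supplier<Long> gauge) {
            // put() replaces the earlier name when the same object is registered again.
            namesByGauge.put(gauge, name);
        }

        List<String> reportedNames() {
            return new ArrayList<>(namesByGauge.values());
        }
    }

    static List<String> reportedNames(boolean shareGaugeObject) {
        ToyRegistry registry = new ToyRegistry();
        Supplier<Long> numberOfRestarts = () -> 0L;
        registry.gauge("numRestarts", numberOfRestarts);
        // Wrapping the gauge in a new lambda gives the second name its own object.
        Supplier<Long> second =
                shareGaugeObject ? numberOfRestarts : () -> numberOfRestarts.get();
        registry.gauge("fullRestarts", second);
        return registry.reportedNames();
    }

    public static void main(String[] args) {
        System.out.println(reportedNames(true));  // prints [fullRestarts]
        System.out.println(reportedNames(false)); // prints [numRestarts, fullRestarts]
    }
}
```

In this toy model, giving each name its own wrapper object keeps both entries. Whether the actual fix for this issue takes the same approach is not stated in this message.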



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
