[ https://issues.apache.org/jira/browse/FLINK-19009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Liu updated FLINK-19009:
------------------------------
    Comment:     was deleted

(was: I think I can fix this bug, but first we need to reach a consensus on the definition of 'a failing/recovering situation' in the Flink docs.

There are 10 types of JobStatus. 'FAILING' is clearly one such case. But what about the others, for example 'RECONCILING'?

([~jark] What do you think, or do you know someone familiar with this part?))

> wrong way to calculate the "downtime" metric
> --------------------------------------------
>
>                 Key: FLINK-19009
>                 URL: https://issues.apache.org/jira/browse/FLINK-19009
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination, Runtime / Metrics
>    Affects Versions: 1.7.2, 1.8.0
>            Reporter: Zhinan Cheng
>            Priority: Trivial
>             Fix For: 1.12.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently, the way the Flink system metric "downtime" is calculated is not
> consistent with its description in the docs: the downtime is actually the
> current timestamp minus the timestamp when the job started.
>
> But the Flink doc (https://flink.apache.org/gettinghelp.html) describes
> it as the current timestamp minus the timestamp when the job failed.
>
> I believe we should update the code for this metric to match the Flink doc. The
> easy way to solve this is to subtract the latest uptime timestamp from the
> current timestamp.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
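
The discrepancy described in the issue can be sketched in a few lines of Java. This is a minimal illustration, not Flink's actual code: the class `DowntimeSketch`, both method names, and all timestamp values are hypothetical and chosen only to contrast the two calculations mentioned in the report.

```java
// Hypothetical sketch; none of these names come from the Flink code base.
class DowntimeSketch {

    // Behavior the issue reports: current timestamp minus the job start timestamp.
    static long downtimeAsReported(long nowMillis, long jobStartedAtMillis) {
        return nowMillis - jobStartedAtMillis;
    }

    // Behavior the issue says the doc describes: current timestamp minus
    // the timestamp at which the job failed.
    static long downtimeAsDocumented(long nowMillis, long jobFailedAtMillis) {
        return nowMillis - jobFailedAtMillis;
    }

    public static void main(String[] args) {
        // Made-up timestamps in milliseconds, for illustration only.
        long jobStartedAt = 1_000L;
        long jobFailedAt = 9_000L;
        long now = 10_000L;

        System.out.println("as reported:   " + downtimeAsReported(now, jobStartedAt));
        System.out.println("as documented: " + downtimeAsDocumented(now, jobFailedAt));
    }
}
```

With these made-up values, the reported calculation counts the whole period since the job started, while the documented one counts only the period since the failure.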