[ https://issues.apache.org/jira/browse/FLINK-8506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16339560#comment-16339560 ]
Steven Zhen Wu edited comment on FLINK-8506 at 1/25/18 6:45 PM:
----------------------------------------------------------------

Till, thanks for the explanation. Looks like we should clarify the doc, which says "since job submitted".

[https://ci.apache.org/projects/flink/flink-docs-master/monitoring/metrics.html]

fullRestarts    The total number of full restarts since this job was submitted (in milliseconds).    Gauge

So it seems that we don't have any metric to capture jobmanager failover.


was (Author: stevenz3wu):
Till, thanks for the explanation. Looks like we should clarify the doc, which says "since job submitted".

[https://ci.apache.org/projects/flink/flink-docs-master/monitoring/metrics.html]

fullRestarts    The total number of full restarts since this job was submitted (in milliseconds).    Gauge

So it seems that we don't any metric to capture jobmanager failover.

> fullRestarts Gauge not incremented when jobmanager got killed
> -------------------------------------------------------------
>
>                 Key: FLINK-8506
>                 URL: https://issues.apache.org/jira/browse/FLINK-8506
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Steven Zhen Wu
>            Priority: Major
>
> [~till.rohrmann] When the jobmanager node got killed, it caused a job restart.
> But in this case, we didn't see the _fullRestarts_ gauge get incremented. Is this
> expected or a bug?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)