You could use “flink_jobmanager_numRunningJobs” to check the number of running 
jobs.
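
As a rough sketch, an alerting rule built on that metric might look like the
following. The group name, alert names, "for" durations, and severity label are
placeholders, and the rule assumes you expect at least one job to be running at
all times. The second rule is an assumption on my side to cover the case where
the JobManager itself stops reporting the metric.

```yaml
groups:
  - name: flink-alerts            # hypothetical group name
    rules:
      # Fires when the JobManager reports no running jobs for 1 minute.
      - alert: FlinkNoRunningJobs
        expr: flink_jobmanager_numRunningJobs == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "No running Flink jobs on {{ $labels.instance }}"
      # Fires when the metric disappears entirely (e.g. JobManager is down).
      - alert: FlinkJobManagerMetricMissing
        expr: absent(flink_jobmanager_numRunningJobs)
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "flink_jobmanager_numRunningJobs is not being reported"
```

If you normally run more than one job, you could compare against the expected
count instead, e.g. "flink_jobmanager_numRunningJobs < 3".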

Thanks

From: Jesús Vásquez <jesusvasquezr1...@gmail.com>
Date: Monday, December 16, 2019 at 12:47 PM
To: "user@flink.apache.org" <user@flink.apache.org>
Subject: [EXTERNAL] Flink and Prometheus monitoring question

Hi,
I want to monitor Flink streaming jobs using Prometheus.
My first goal is to send alerts when a Flink job has failed.
The thing is that, looking at the documentation, I haven't found a metric that 
helps me define an alerting rule.
As a starting point I thought that the metric flink_jobmanager_job_downtime 
could help, since the docs say this metric emits -1 for a completed job.
But when I tested this, I found that it doesn't work: the metric always emits 0, 
and after the job completes the metric is no longer reported.
Has anyone managed to alert with Prometheus when a Flink job has failed?
Thanks for your help.
