Hello Kelly,

I thought that Flink scheduler only starts a job if all requested
containers/TMs are available and allotted to that job.

How can I reproduce your issue on Flink with YARN?

Thank you,

Piper


On Thu, Nov 21, 2019, 1:48 PM Kelly Smith <kell...@zillowgroup.com> wrote:

> I’ve been running Flink in production on EMR (YARN) for some time and have
> found the metrics system to be quite useful, but there is one specific case
> where I’m missing a signal for this scenario:
>
>
>
>    - When a job has been submitted, but YARN does not have enough
>    resources to provide
>
>
>
> Observed:
>
>    - Job is in RUNNING state
>    - All of the tasks for the job are in the (I believe) DEPLOYING state
>
>
>
> Is there a way to access these as metrics for monitoring the number of
> tasks in each state for a given job (image below)? The metric I’m currently
> using is the number of running jobs, but it misses this “unhealthy”
> scenario. I realize that I could use application-level metrics (record
> counts, etc) as a proxy for this, but I’m working on providing a streaming
> platform and need all of my monitoring to be application agnostic.
>
>
>
> I can’t find anything on it in the documentation.
>
>
>
> Thanks,
>
> Kelly
>

Reply via email to