Re: Flink job server with HA

Xintong Song Mon, 03 Jun 2019 19:24:18 -0700

Hi Boris,

I think what you described, that putJobGraph is not invoked in a Flink job
cluster, is by design and should not cause job recovery to fail. A Flink job
cluster has only one job graph to execute. Instead of uploading the job graph
to an already running cluster (as in a session cluster), the job graph in a
Flink job cluster is uploaded before the cluster is started, together with
the Flink framework jars. Please refer to MiniDispatcher and
SingleJobSubmittedJobGraphStore for the details.
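
To illustrate the difference, here is a minimal sketch of the two storage
styles. It is a simplified model, not Flink's actual code: GraphStore,
SessionStore and SingleJobStore are hypothetical names, and job graphs are
represented as plain strings. It only shows why recovery in the single-job
case does not depend on putJobGraph ever being called.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; none of these types are Flink classes.
class JobGraphStoreSketch {

    interface GraphStore {
        void putJobGraph(String jobId, String jobGraph);
        String recoverJobGraph(String jobId);
        Collection<String> getJobIds();
    }

    // Session-cluster style: jobs arrive while the cluster runs, so each
    // graph has to be stored on submission to be recoverable later.
    static final class SessionStore implements GraphStore {
        private final Map<String, String> stored = new HashMap<>();

        @Override
        public void putJobGraph(String jobId, String jobGraph) {
            stored.put(jobId, jobGraph);
        }

        @Override
        public String recoverJobGraph(String jobId) {
            return stored.get(jobId); // null if the job was never submitted
        }

        @Override
        public Collection<String> getJobIds() {
            return stored.keySet();
        }
    }

    // Job-cluster style: the single graph is handed over at construction,
    // standing in for a graph shipped together with the cluster's jars.
    static final class SingleJobStore implements GraphStore {
        private final String jobId;
        private final String jobGraph;

        SingleJobStore(String jobId, String jobGraph) {
            this.jobId = jobId;
            this.jobGraph = jobGraph;
        }

        @Override
        public void putJobGraph(String id, String graph) {
            // Nothing to store: the only valid graph is already known.
            if (!jobId.equals(id)) {
                throw new IllegalArgumentException("Unexpected job id: " + id);
            }
        }

        @Override
        public String recoverJobGraph(String id) {
            if (!jobId.equals(id)) {
                throw new IllegalArgumentException("Unexpected job id: " + id);
            }
            return jobGraph;
        }

        @Override
        public Collection<String> getJobIds() {
            return Collections.singleton(jobId);
        }
    }

    public static void main(String[] args) {
        // No putJobGraph call is made, yet the job is listed and recoverable.
        GraphStore single = new SingleJobStore("job-1", "graph-of-job-1");
        for (String id : single.getJobIds()) {
            System.out.println(id + " -> " + single.recoverJobGraph(id));
        }
    }
}
```

In this model a restarted master would ask the store for its job ids and
recover each one; with SingleJobStore that works without any earlier
putJobGraph call, whereas SessionStore has nothing to recover until a job
has been submitted.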


I think we need more information to find the root cause of your problem.
For example, can you describe the detailed steps you perform when you say
"trying to restart a Job Master"?

Thank you~

Xintong Song



On Mon, Jun 3, 2019 at 10:05 PM Boris Lublinsky <
boris.lublin...@lightbend.com> wrote:

> I am experimenting with Flink Job server with HA, and I am noticing that
> in this case the method putJobGraph in the class SubmittedJobGraphStore is
> never invoked.
> (I can see that it is invoked in the case of a session cluster when a job
> is added.)
> As a result, when I try to restart a Job Master, it finds no running
> jobs and does not try to restore them.
> Am I missing something?
>
>
>
> Boris Lublinsky
> FDP Architect
> boris.lublin...@lightbend.com
> https://www.lightbend.com/
>
>
