Hi, thanks for your reply, it was very helpful.

We tried to go with the second approach, enabling HA mode, and added these configuration values:

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk-noa-edge-infra:2181
    high-availability.zookeeper.path.root: /flink
    high-availability.cluster-id: /flink
    high-availability.storageDir: /flink_state
    high-availability.jobmanager.port: 6150
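For reference, this is roughly how the storage volume is wired up on our side — a simplified sketch with made-up claim and volume names, not our actual manifests:

```yaml
# Hypothetical PVC backing high-availability.storageDir (names are illustrative)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flink-ha-state
spec:
  accessModes:
    - ReadWriteOnce        # mountable read-write by pods on a single node only
  resources:
    requests:
      storage: 10Gi
---
# The job-manager pod spec then mounts the claim at the storageDir path,
# along these lines (fragment of the pod template):
#
#   volumes:
#     - name: flink-state
#       persistentVolumeClaim:
#         claimName: flink-ha-state
#   ...
#   volumeMounts:
#     - name: flink-state
#       mountPath: /flink_state   # matches high-availability.storageDir
```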
For the storageDir, we are using a k8s persistent volume with ReadWriteOnce.

Recovery from job-manager failure is working now, but it looks like there are issues with the task managers. The same configuration file is used in the task managers as well, and there are a lot of errors in the task managers' logs:

    java.io.FileNotFoundException: /flink_state/flink/blob/job_9f4be579c7ab79817e25ed56762b7623/blob_p-5cf39313e388d9120c235528672fd267105be0e0-938e4347a98aa6166dc2625926fdab56 (No such file or directory)

It seems that the task managers are trying to access the job manager's storage dir — can this be avoided? The task managers do not have access to the job manager's persistent volume — is this mandatory?
If we don't have the option to use shared storage, is there a way to make ZooKeeper hold and manage the job state instead of using the shared storage?

Thanks,
Noa

From: bastien dine <bastien.d...@gmail.com>
Date: Friday, 4 February 2022 at 10:56
To: Koffman, Noa (Nokia - IL/Kfar Sava) <noa.koff...@nokia.com>
Cc: user@flink.apache.org <user@flink.apache.org>
Subject: Re: Flink High-Availability and Job-Manager recovery

Hello,

On k8s the current recommendation is to set up one job manager with HA enabled, so that the cluster does not lose state upon a crash.

1. The storage dir can for sure be on a kube PV. The directory should be shared among all JMs; you will need to map the volume to the same local directory (e.g. /data) so that the configuration is the same across all JMs.
2. You can have only one JM, but you still need to enable HA, since HA is what writes the cluster state into ZK & the storage dir.
3. I don't know anything about Beam, so I cannot help you with that. But per-job mode is not available on k8s (neither native nor standalone kube), see https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#per-job-mode & https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/#per-job-mode — you would need YARN to do so (I think Mesos is deprecated). Application mode can be a bit tricky to understand: it "moves" the submission of the job inside the JM. The chosen solution will depend on your deployment needs; I can't tell you without knowing more. But session mode + streaming job deployment is pretty standard, and you can easily emulate "one cluster per job" with it (better for ops & tuning than one cluster with multiple jobs).

Hope this can help,
Regards,
Bastien

On Thu, 3 Feb 2022 at 15:12, Koffman, Noa (Nokia - IL/Kfar Sava) <noa.koff...@nokia.com> wrote:

Hi all,
We are currently deploying Flink on a k8s 3-node cluster, with 1 job manager and 3 task managers.
We are trying to understand the recommendation for deployment, more specifically for recovery from job-manager failure, and have some questions about that:

1. If we use a Flink HA solution (either Kubernetes HA or ZooKeeper), the documentation states we should define 'high-availability.storageDir'. In the examples we found, the storage is mostly HDFS or S3. We were wondering if we could use Kubernetes PersistentVolumes and PersistentVolumeClaims; if we do use those, can each job manager have its own volume, or must it be shared?
2. Is there a solution for job-manager recovery without HA? With the way our Flink is currently configured, killing the job-manager pod loses all the jobs. Is there a way to configure the job manager so that if it goes down and k8s restarts it, it will continue from the same state (restart all the tasks, etc.)?
For this, can a Persistent Volume be used, without HDFS or external solutions?
3. Regarding the deployment mode: we are working with Beam + Flink, and Flink is running in session mode. We have a few long-running streaming pipelines deployed (fewer than 10). Is 'session' mode the right deployment mode for our type of deployment, or should we consider switching to something different (per-job/application)?

Thanks
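For context, these are the kinds of HA configuration fragments we are weighing, based on the Flink 1.13 docs — values are placeholders, not our real config:

```yaml
# Illustrative flink-conf.yaml fragments (placeholder values)

# Option A: ZooKeeper-based HA
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-host:2181
high-availability.storageDir: /flink_state    # must be reachable by every JM

# Option B: Kubernetes HA services (Flink 1.12+), no ZooKeeper needed
# high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
# kubernetes.cluster-id: my-flink-cluster
# high-availability.storageDir: s3://bucket/flink-ha   # or a shared volume path
```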