Re: Job manager - Zookeeper HA

Till Rohrmann Mon, 19 Mar 2018 08:21:06 -0700

Hi Sampath,

it is correct that `high-availability.zookeeper.storageDir` is used to
persist the job data needed to recover a job. This includes the JobGraph
itself, its jars, and the checkpoint metadata.

The TaskManagers do not need access to this directory; only the
JobManagers do. However, there is one optimization and one special case.

The optimization applies when transferring blobs around in your cluster.
Instead of opening a Java socket and sending the job jars around, the
TaskManagers first check whether they can access the directory in which the
job jars are stored. Usually this is a DFS to which they also have access.
If this is the case, they read the files directly from the DFS instead of
taking a detour via the JobManager.

The special case is the MemoryStateBackend. If you use this state backend
and haven't specified a checkpointing directory, it will create a randomly
named folder inside this directory and store the checkpoints there. This
assumes that the ZooKeeper storage directory is accessible by every Flink
component.
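
As a rough sketch of a setup that sidesteps both points, the storage
directory could point at a shared file system and the checkpointing
directory could be set explicitly. The hdfs:// paths below are hypothetical
placeholders, and I'm assuming `state.checkpoints.dir` is the option that
sets the checkpointing directory in your Flink version, so please check it
against the configuration docs for your release:

```yaml
# Sketch only: the hdfs:// paths are made-up placeholders.
high-availability: zookeeper
high-availability.cluster-id: flink
high-availability.zookeeper.quorum: localhost:2181
# Shared storage, so that every JobManager can read the recovery data
# and TaskManagers can fetch the job jars directly.
high-availability.zookeeper.storageDir: hdfs:///flink/ha/
# Assumed option name: an explicit checkpointing directory, so that the
# MemoryStateBackend does not fall back to the HA storage directory.
state.checkpoints.dir: hdfs:///flink/checkpoints/
```

With a `file:///` path, as in your current configuration, this only works
if that path refers to the same shared location on every machine.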

Cheers,
Till

On Mon, Mar 19, 2018 at 2:36 PM, Sampath Bhat <[email protected]>
wrote:

> Hello
>
> I'm configuring a Flink cluster with ZooKeeper for high availability of
> the job managers.
> These are the HA-related configuration settings:
>
> high-availability: zookeeper
> high-availability.cluster-id: flink
> high-availability.zookeeper.quorum: localhost:2181
> high-availability.zookeeper.storageDir: file:///home/HA/
>
> My question is whether the high-availability.zookeeper.storageDir file
> location is accessed only by the job managers, or whether the task
> managers access it as well.
>
> My understanding was that the shared file location is used for storing
> job manager metadata such as execution states, and that there is no need
> for the task managers to access it.
> Correct me if I'm wrong.
>
