Thanks Stephan – we will go with a central ZooKeeper instance and hopefully 
have it started through a CloudFormation script as part of EMR startup.

Is ZooKeeper also used to keep track of checkpoint metadata and the execution 
graph of the running job, so that it can recover from an ApplicationMaster 
failure, as Aljoscha was guessing below? Or is it only used for leader election 
in case multiple ApplicationMasters are accidentally running?

Thanks
Ankit

From: Stephan Ewen <se...@apache.org>
Date: Monday, May 8, 2017 at 9:00 AM
To: "user@flink.apache.org" <user@flink.apache.org>, "Jain, Ankit" 
<ankit.j...@here.com>
Subject: Re: High Availability on Yarn

@Ankit:

ZooKeeper is still required in YARN setups. Even if there is only one 
JobManager in the normal case, YARN can accidentally create a second one when 
there is a network partition.
To prevent this from leading to inconsistencies, we use ZooKeeper.

Flink uses ZooKeeper very little, so you can just let Flink attach to any 
existing ZooKeeper, or use one ZooKeeper cluster for many Flink 
clusters/jobs.
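
A minimal sketch of what pointing several Flink-on-YARN clusters at one shared 
ZooKeeper could look like. The helper name render_ha_config, the host name and 
the HDFS path are hypothetical. The key names are the ones I believe the Flink 
HA documentation uses; some older releases document the storage directory 
under a different key, so please check them against the docs for your version.

```python
def render_ha_config(zk_quorum, storage_dir, zk_root="/flink", app_attempts=10):
    """Return flink-conf.yaml lines for ZooKeeper-based HA on YARN.

    Hypothetical helper: every Flink cluster gets the same quorum, so a
    single ZooKeeper installation can serve all of them.
    """
    settings = [
        ("high-availability", "zookeeper"),
        # Shared ZooKeeper quorum (assumed address, comma-separated host:port).
        ("high-availability.zookeeper.quorum", zk_quorum),
        # Root znode under which Flink keeps its coordination data.
        ("high-availability.zookeeper.path.root", zk_root),
        # Durable storage for recovery metadata (assumed key name, see above).
        ("high-availability.storageDir", storage_dir),
        # Lets YARN restart the ApplicationMaster after a failure.
        ("yarn.application-attempts", str(app_attempts)),
    ]
    return "\n".join(f"{key}: {value}" for key, value in settings)


if __name__ == "__main__":
    print(render_ha_config("zk-1.example.internal:2181", "hdfs:///flink/recovery"))
```

The output would be appended to the flink-conf.yaml used when starting each 
Flink cluster on EMR.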

Stephan


On Mon, May 8, 2017 at 2:11 PM, Aljoscha Krettek 
<aljos...@apache.org> wrote:
Hi,
Yes, it’s recommended to use one ZooKeeper cluster for all Flink clusters.

Best,
Aljoscha

On 5. May 2017, at 16:56, Jain, Ankit 
<ankit.j...@here.com> wrote:

Thanks for the update Aljoscha.

@Till Rohrmann <trohrm...@apache.org>,
Can you please chime in?

Also, we currently have a long-running EMR cluster where we create one Flink 
cluster per job – can we just choose to install ZooKeeper when creating the EMR 
cluster and use one ZooKeeper instance for ALL of our Flink jobs?
Or is the recommendation to have a dedicated ZooKeeper instance per Flink job?

Thanks
Ankit

From: Aljoscha Krettek <aljos...@apache.org>
Date: Thursday, May 4, 2017 at 1:19 AM
To: "Jain, Ankit" <ankit.j...@here.com>
Cc: "user@flink.apache.org" <user@flink.apache.org>, Till Rohrmann 
<trohrm...@apache.org>
Subject: Re: High Availability on Yarn

Hi,
Yes, for YARN there is only one running JobManager. As far as I know, in this 
case ZooKeeper is only used to keep track of checkpoint metadata and the 
execution graph of the running job, so that a restoring JobManager can pick 
up the data again.

I’m not 100 % sure on this, though, so maybe Till can shed some light on this.

Best,
Aljoscha
On 3. May 2017, at 16:58, Jain, Ankit 
<ankit.j...@here.com> wrote:

Thanks for your reply Aljoscha.

After building a better understanding of YARN and spending a copious amount of 
time in the Flink codebase, I think I now get how Flink and YARN interact – I 
plan to document this soon in case it could help somebody starting afresh with 
Flink on YARN.

Regarding ZooKeeper: in YARN mode there is only one JobManager running, so do 
we still need leader election?

If the ApplicationMaster (where the JM runs) goes down, it is restarted by the 
YARN RM, and while restarting, the Flink AM will bring back the previously 
running containers. So, where does ZooKeeper sit in this setup?

Thanks
Ankit

From: Aljoscha Krettek <aljos...@apache.org>
Date: Wednesday, May 3, 2017 at 2:05 AM
To: "Jain, Ankit" <ankit.j...@here.com>
Cc: "user@flink.apache.org" <user@flink.apache.org>, Till Rohrmann 
<trohrm...@apache.org>
Subject: Re: High Availability on Yarn

Hi,
As a first comment, the work mentioned in the FLIP-6 doc you linked is still 
work-in-progress. You cannot use these abstractions yet without going into the 
code and setting up a cluster “by hand”.

The documentation for one-step deployment of a Job to YARN is available here: 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/yarn_setup.html#run-a-single-flink-job-on-yarn

Regarding your third question, ZooKeeper is mostly used for discovery and 
leader election. That is, JobManagers use it to decide who is the main JM and 
who are standby JMs. TaskManagers use it to discover the leading JobManager 
that they should connect to.
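
To make the two roles concrete, here is a toy in-memory model of election plus 
discovery. It is not the ZooKeeper or Flink API: ToyCoordinator and the 
addresses are made up for illustration, and the "lowest ticket wins" rule is 
an assumption borrowed from the common ZooKeeper leader-election recipe.

```python
import itertools


class ToyCoordinator:
    """In-memory stand-in for a coordination service (hypothetical)."""

    def __init__(self):
        self._tickets = itertools.count()
        self._candidates = {}  # ticket number -> JobManager address

    def register(self, address):
        """A JobManager announces itself and receives an increasing ticket."""
        ticket = next(self._tickets)
        self._candidates[ticket] = address
        return ticket

    def unregister(self, ticket):
        """Models a JobManager failing or shutting down."""
        self._candidates.pop(ticket, None)

    def leader(self):
        """Discovery: the address TaskManagers should connect to, or None."""
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


if __name__ == "__main__":
    zk = ToyCoordinator()
    main_jm = zk.register("jm-a:6123")
    standby_jm = zk.register("jm-b:6123")
    print(zk.leader())        # jm-a:6123
    zk.unregister(main_jm)    # the main JM goes away
    print(zk.leader())        # jm-b:6123
```

The point of the sketch is only that both sides ask the same service: 
JobManagers to agree on a leader, TaskManagers to look that leader up.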

I’m also cc’ing Till, who should know this stuff better and can maybe explain 
it in a bit more detail.

Best,
Aljoscha
On 1. May 2017, at 18:59, Jain, Ankit 
<ankit.j...@here.com> wrote:

Hi fellow users,
We are trying to straighten out the high availability story for Flink.

Our setup includes a long-running EMR cluster. Job submission is a two-step 
process: 1) a Flink cluster is first created using the Flink YARN client on the 
already running EMR cluster, and 2) the Flink job is submitted.

I also saw references saying that with 1.2 these two steps have been combined 
into one – is that change in FlinkYarnSessionCli.java? Can somebody point me to 
the documentation, please?

Without worrying about YARN RM failure for now (not the Flink RM for YARN that 
seems to be newly introduced), I first want to understand how TaskManager and 
JobManager failures are handled.

My questions:
1) https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077 
suggests a new RM has been added and that there is now one JobManager for each 
job. Since the YARN RM will now talk to the Flink RM (instead of the JobManager, 
as before), will YARN automatically restart a failing Flink RM?
2) Is there any documentation on the behavior of the new Flink RM that will come 
up? How will previously running JobManagers and TaskManagers find out about the 
new RM?
3) https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html#configuration 
requires configuring ZooKeeper even for YARN. Is this needed for handling 
TaskManager failures, JM failures, or both? Will YARN not take care of JM 
failures?

It may sound like I am a little confused about the roles of the YARN and Flink 
components: who carries most of the burden of HA? The documentation in its 
current state lacks clarity, though I know it is still evolving.

Please let me know if somebody can help clear the confusion.

Thanks
Ankit





