I had gotten everything set up with HDFS and ZooKeeper. The problem I was 
having was that I was hard-coding the active NameNode, so everything fell 
apart in the event of a failover. I needed to configure Flink to use the 
HDFS config file so that failover is handled correctly. I could then remove 
the NameNode from the Flink config file entirely.
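
For anyone who finds this later, the change amounted to something like the 
following. This is only a sketch; "/etc/hadoop/conf" and the "mycluster" 
nameservice ID are placeholders for your own values:

    # flink-conf.yaml
    env.hadoop.conf.dir: /etc/hadoop/conf
    # The URIs name the HA nameservice defined in hdfs-site.xml
    # (dfs.nameservices) rather than a NameNode host, so the HDFS
    # client handles failover itself.
    state.checkpoints.dir: hdfs://mycluster/flink/checkpoints
    state.savepoints.dir: hdfs://mycluster/flink/savepoints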

Sent from my iPhone

> On Feb 11, 2019, at 2:58 AM, Konstantin Knauf <konstan...@ververica.com> 
> wrote:
> 
> Hi Steve, 
> 
> HDFS can be used as checkpoint storage and plays a crucial role in Flink's 
> fault tolerance mechanism. HDFS alone will not suffice for a Flink HA 
> setup, though: you also need ZooKeeper for JobManager HA. 
> 
> Flink configuration:
> 
> high-availability: zookeeper
> high-availability.zookeeper.quorum: <host>:<port>
> high-availability.cluster-id: /cluster_one # important: customize per cluster 
> high-availability.storageDir: hdfs:///flink/recovery 
> 
> Regarding the steps to set up checkpointing to HDFS: you don't need to set 
> "fs.default-scheme". For "state.backend" you can use either "rocksdb" or 
> "filesystem"; both use HDFS for checkpoints. You can also set 
> "env.hadoop.conf.dir" in the Flink configuration instead of the environment 
> variable; the Flink scripts will then set the environment variable for you.
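> 
> For example, a minimal checkpointing section could look like this (the 
> paths are only examples):
> 
> state.backend: rocksdb
> state.checkpoints.dir: hdfs:///flink/checkpoints
> state.savepoints.dir: hdfs:///flink/savepoints
> env.hadoop.conf.dir: /etc/hadoop/conf # instead of exporting HADOOP_CONF_DIR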
> 
> Of course, in a basic Kubernetes setup the TaskManager and JobManager pods 
> should both be controlled by a Deployment, so that they are brought up 
> again after a fault.
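> 
> A minimal JobManager Deployment might look roughly like this. It is only a 
> sketch: the names, labels, and ports are placeholders to adapt, and the 
> image is the one from your mail.
> 
> apiVersion: apps/v1
> kind: Deployment
> metadata:
>   name: flink-jobmanager
> spec:
>   replicas: 1
>   selector:
>     matchLabels:
>       app: flink
>       component: jobmanager
>   template:
>     metadata:
>       labels:
>         app: flink
>         component: jobmanager
>     spec:
>       containers:
>       - name: jobmanager
>         image: flink:1.7.0-hadoop27-scala_2.11
>         args: ["jobmanager"]
>         ports:
>         - containerPort: 6123 # RPC
>         - containerPort: 8081 # web UI
> 
> The same pattern applies to the TaskManagers, with args: ["taskmanager"] 
> and as many replicas as you need.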
> 
> Hope this helps, and please let me know if you run into any issues.
> 
> Best, 
> 
> Konstantin
> 
> 
>> On Wed, Feb 6, 2019 at 7:47 PM Steven Nelson <snel...@sourceallies.com> 
>> wrote:
>> I am working on a POC High Availability installation of Flink on top of 
>> Kubernetes with HDFS as the data storage location. I am not finding much 
>> documentation on doing this, or I am finding it in pieces and may not be 
>> putting it together correctly. I think it falls between being an HDFS 
>> thing and a Flink thing. 
>> 
>> I am deploying to Kubernetes using the flink:1.7.0-hadoop27-scala_2.11 
>> image from Docker Hub. 
>> 
>> I think these are the things I need to do:
>> 1) Setup an hdfs-site.xml file per 
>> https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#Deployment
>> 2) Set the HADOOP_CONF_DIR environment variable to the location of that 
>> file per 
>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#hdfs 
>> (see the pod-spec sketch after this list)
>> 3) Create a flink-conf.yaml file that looks something like
>>         fs.default-scheme: hdfs://
>>         state.backend: rocksdb
>>         state.savepoints.dir: hdfs:///flink/savepoints
>>         state.checkpoints.dir: hdfs:///flink/checkpoints
>> 4) Dance a little jig when it works.
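>> 
>> Pod-spec sketch for step 2 (a fragment of the Deployment spec; the 
>> ConfigMap name and mount path are placeholders):
>> 
>>       containers:
>>       - name: jobmanager
>>         image: flink:1.7.0-hadoop27-scala_2.11
>>         env:
>>         - name: HADOOP_CONF_DIR
>>           value: /etc/hadoop/conf
>>         volumeMounts:
>>         - name: hadoop-conf
>>           mountPath: /etc/hadoop/conf
>>       volumes:
>>       - name: hadoop-conf
>>         configMap:
>>           name: hadoop-conf # holds hdfs-site.xml (and core-site.xml)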
>> 
>> Has anyone set this up? If so, am I missing anything?
>> 
>> -Steve
>> 
>> 
> 
> 
> -- 
> Konstantin Knauf | Solutions Architect
