HA HDFS

2019-02-06 Thread Steven Nelson
I am working on a POC High Availability installation of Flink on top of
Kubernetes with HDFS as a data storage location. I am not finding much
documentation on doing this, or I am finding the documentation in pieces and
may not be putting it together correctly. I think it falls between being an
HDFS thing and a Flink thing.

I am deploying to Kubernetes using the flink:1.7.0-hadoop27-scala_2.11
image from Docker Hub.

I think these are the things I need to do:
1) Set up an hdfs-site.xml file per
https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#Deployment
(sketched below)
2) Set the HADOOP_CONF_DIR environment variable to the location of that
file per
https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#hdfs
3) Create a flink-conf.yaml file that looks something like
fs.default-scheme: hdfs://
state.backend: rocksdb
state.savepoints.dir: hdfs://flink/savepoints
state.checkpoints.dir: hdfs://flink/checkpoints
4) Dance a little jig when it works.
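
For concreteness, here is a rough sketch of step 1 as a Kubernetes ConfigMap
that I would mount into the Flink pods. The nameservice name "hdfscluster",
the namenode hostnames, and the ConfigMap name are placeholders I made up;
the property names come from the HA guide linked above:

apiVersion: v1
kind: ConfigMap
metadata:
  name: hadoop-conf  # mount this at the directory HADOOP_CONF_DIR points to
data:
  hdfs-site.xml: |
    <configuration>
      <!-- logical name for the HA namenode pair -->
      <property><name>dfs.nameservices</name><value>hdfscluster</value></property>
      <property><name>dfs.ha.namenodes.hdfscluster</name><value>nn1,nn2</value></property>
      <property><name>dfs.namenode.rpc-address.hdfscluster.nn1</name><value>namenode-0:8020</value></property>
      <property><name>dfs.namenode.rpc-address.hdfscluster.nn2</name><value>namenode-1:8020</value></property>
      <!-- client-side failover between nn1 and nn2 -->
      <property><name>dfs.client.failover.proxy.provider.hdfscluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
    </configuration>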

Has anyone set this up? If so, am I missing anything?

-Steve


Re: HA HDFS

2019-02-11 Thread Konstantin Knauf
Hi Steve,

HDFS can be used as checkpoint storage and plays a crucial role in Flink's
fault tolerance mechanism. HDFS alone will not suffice to get a Flink HA
setup, though. You also need Zookeeper for JobManager HA.

Flink configuration:

high-availability: zookeeper
high-availability.zookeeper.quorum: <zk-host>:<zk-port>
high-availability.cluster-id: /cluster_one # important: customize per cluster
high-availability.storageDir: hdfs:///flink/recovery


Regarding the steps to set up checkpointing to HDFS: you don't need to set
"fs.default-scheme". As "state.backend" you use either "rocksdb" or
"filesystem"; both use HDFS for checkpoints. You can also set
"env.hadoop.conf.dir" in the Flink configuration instead of the
environment variable. The environment variable will then be set by the
Flink scripts.
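
Putting that together, a minimal flink-conf.yaml sketch; the paths and the
Hadoop conf directory here are just examples, adjust them to your setup:

state.backend: rocksdb
state.checkpoints.dir: hdfs:///flink/checkpoints
state.savepoints.dir: hdfs:///flink/savepoints
# alternative to the HADOOP_CONF_DIR environment variable:
env.hadoop.conf.dir: /etc/hadoop/conf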

Of course, in a basic Kubernetes setup TaskManagers and JobManager pods
should both be controlled by a Deployment, so that they are brought up
again after a fault.
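
As a sketch of the JobManager side, assuming the image you mentioned and
resource names of my own choosing:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 1
  selector:
    matchLabels: {app: flink, component: jobmanager}
  template:
    metadata:
      labels: {app: flink, component: jobmanager}
    spec:
      containers:
      - name: jobmanager
        image: flink:1.7.0-hadoop27-scala_2.11
        args: ["jobmanager"]
        env:
        - name: HADOOP_CONF_DIR  # points Flink at the mounted HDFS client config
          value: /etc/hadoop/conf
        volumeMounts:
        - name: hadoop-conf
          mountPath: /etc/hadoop/conf
      volumes:
      - name: hadoop-conf
        configMap:
          name: hadoop-conf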

Hope this helps, and please let me know if you run into any issues.

Best,

Konstantin



-- 

Konstantin Knauf | Solutions Architect

+49 160 91394525



Follow us @VervericaData

--

Join Flink Forward - The Apache Flink Conference

Stream Processing | Event Driven | Real Time

--

Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Data Artisans GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen


Re: HA HDFS

2019-02-11 Thread Vishal Santoshi
One more thing I had to do.

In an HA setup, the TMs are not able to resolve the JobManager's random RPC
port through the jobmanager.rpc.port setting. Set
high-availability.jobmanager.port to a predefined port in your
flink-conf.yaml and expose that port via the JobManager Deployment and
Service resources as well.
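
Roughly, with a port number and resource names that are just my examples:

# flink-conf.yaml: pin the HA RPC port instead of a random one
high-availability.jobmanager.port: 50010

# matching Service (the JobManager container must expose the port too)
apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager
spec:
  selector: {app: flink, component: jobmanager}
  ports:
  - name: ha-rpc
    port: 50010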



Re: HA HDFS

2019-02-12 Thread Steven Nelson

I had gotten everything set up with HDFS and Zookeeper. The problem I was
having is that I was hard-coding the active namenode, which would cause
everything to fall apart in the event of a failover. I needed to configure
Flink to use the HDFS config file in order to allow for correct failover; I
could then remove the namenode from the Flink config file entirely.
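
In other words, the Flink config ends up referencing only the HDFS
nameservice; "hdfscluster" below is just my example and has to match
dfs.nameservices in hdfs-site.xml:

env.hadoop.conf.dir: /etc/hadoop/conf  # Flink reads hdfs-site.xml from here
state.backend: rocksdb
state.checkpoints.dir: hdfs://hdfscluster/flink/checkpoints  # nameservice, not a namenode host
state.savepoints.dir: hdfs://hdfscluster/flink/savepoints
high-availability.storageDir: hdfs://hdfscluster/flink/recovery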

Sent from my iPhone
