Re: Multiple hdfs

2018-05-23 Thread Rong Rong
+1 on viewfs, I was going to add that :-)

To add to this, viewfs can be used as a federation layer for supporting
multiple HDFS clusters for checkpoint/savepoint HA purposes as well.
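
As a rough sketch, assuming a viewfs mount table named "federation" is
already defined in core-site.xml (the key names below are from Flink 1.x,
check the docs for your version), the checkpoint/savepoint/HA paths in
flink-conf.yaml could then all point at the federated namespace:

    # flink-conf.yaml (sketch)
    state.checkpoints.dir: viewfs://federation/flink/checkpoints
    state.savepoints.dir: viewfs://federation/flink/savepoints
    high-availability.storageDir: viewfs://federation/flink/ha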

--
Rong


Re: Multiple hdfs

2018-05-23 Thread Stephan Ewen
I think that Hadoop recommends solving such setups with a viewfs:// file
system that spans both HDFS clusters, so that the two clusters look like
different paths within one file system. This is similar to mounting
different file systems into one directory tree in Unix.
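
A minimal, untested sketch of such a mount table in core-site.xml (the
mount table name "federation" and the namenode addresses are placeholders):

    <property>
      <name>fs.defaultFS</name>
      <value>viewfs://federation</value>
    </property>
    <property>
      <name>fs.viewfs.mounttable.federation.link./data</name>
      <value>hdfs://namenode2_ip:port/data</value>
    </property>
    <property>
      <name>fs.viewfs.mounttable.federation.link./checkpoints</name>
      <value>hdfs://namenode1_ip:port/checkpoint</value>
    </property>

With that in place, /data and /checkpoints under viewfs://federation/
resolve to the two different HDFS clusters.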


Re: Multiple hdfs

2018-05-22 Thread Kien Truong

You only need to modify the core-site.xml and hdfs-site.xml read by Flink.
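
For reference (this depends on your Flink version), Flink typically picks
these files up from the directory pointed to by the HADOOP_CONF_DIR
environment variable; Flink 1.x also lets you point at a config directory
from flink-conf.yaml, roughly:

    # flink-conf.yaml (sketch)
    fs.hdfs.hadoopconf: /path/to/hadoop/conf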

Regards,

Kiên


Re: Multiple hdfs

2018-05-22 Thread Deepak Sharma
Wouldn't 2 core-site and hdfs-site XMLs need to be provided in this case,
then?

Thanks
Deepak


Re: Multiple hdfs

2018-05-22 Thread Raul Valdoleiros
Hi Kien,

Thanks for your reply.

My goal is to store the checkpoints in one HDFS cluster and the data in
another HDFS cluster.

So Flink should be able to connect to two different HDFS clusters.

Thanks


Re: Multiple hdfs

2018-05-22 Thread Kien Truong

Hi,

If your clusters are not high-availability clusters, then just use the
full path to the cluster.


For example, to refer to directory /checkpoint on cluster1, use 
hdfs://namenode1_ip:port/checkpoint


Likewise, /data on cluster2 will be hdfs://namenode2_ip:port/data
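
Putting it together, a rough sketch with Flink 1.x key names: checkpoints
go to cluster1 while the job reads its input from cluster2 by
fully-qualified URI.

    # flink-conf.yaml (sketch)
    state.checkpoints.dir: hdfs://namenode1_ip:port/checkpoint

    // in the job (Java, sketch)
    env.readTextFile("hdfs://namenode2_ip:port/data");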


If your cluster is an HA cluster, then you need to modify the
hdfs-site.xml as described in section 1 of this guide:


https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_administration/content/distcp_between_ha_clusters.html

Then use the full paths to the clusters: hdfs://cluster1ha/checkpoint &
hdfs://cluster2ha/data
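
Concretely, the client-side hdfs-site.xml that Flink reads ends up listing
both nameservices; a rough sketch (the nameservice names, namenode
hostnames, and ports are placeholders):

    <property>
      <name>dfs.nameservices</name>
      <value>cluster1ha,cluster2ha</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.cluster2ha</name>
      <value>nn1,nn2</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.cluster2ha.nn1</name>
      <value>nn1.cluster2.example.com:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.cluster2ha.nn2</name>
      <value>nn2.cluster2.example.com:8020</value>
    </property>
    <property>
      <name>dfs.client.failover.proxy.provider.cluster2ha</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

The remote cluster's entries mirror what its own hdfs-site.xml already
contains; cluster1ha would need the equivalent dfs.ha.namenodes,
rpc-address, and failover proxy entries as well.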


Regards,
Kien

On 5/21/2018 9:19 PM, Raul Valdoleiros wrote:

Hi,

I want to store my data in one HDFS and the Flink checkpoints in
another HDFS. I didn't find a way to do it; can anyone point me in a
direction?


Thanks in advance,
Raul