RE: NameNode HA from a client perspective

2016-05-04 Thread Cecile, Adam
Sandeep, Brahma,


Thanks for your answers, that helps a lot! We'll see whether we go with the custom
StandByException handling or with the third-party module.


Regards, Adam.


From: Sandeep Nemuri <nhsande...@gmail.com>
Sent: Wednesday, 4 May 2016 11:18
To: Brahma Reddy Battula
Cc: Cecile, Adam; user@hadoop.apache.org
Subject: Re: NameNode HA from a client perspective

This could help you: https://pypi.python.org/pypi/PyHDFS/0.1.0
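
For illustration, a minimal sketch of using it (assuming this version of PyHDFS
accepts a comma-separated list of NameNode hosts and fails over between them;
the host names below are placeholders, so check the package docs for the exact API):

    import pyhdfs

    # Give the client both NameNodes; it is expected to retry against the
    # other host when the first one answers as a standby.
    client = pyhdfs.HdfsClient(hosts='namenode1:50070,namenode2:50070',
                               user_name='hdfs')
    print(client.list_status('/'))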

Thanks,
Sandeep

On Wed, May 4, 2016 at 2:40 PM, Brahma Reddy Battula
<brahmareddy.batt...@huawei.com> wrote:
1. Have a list of namenodes, built from the configuration.
2. Execute the op on each namenode until it succeeds.
3. Treat the successful namenode URL as the active namenode, and use it for
subsequent operations.
4. Whenever a StandByException or some network exception (other than remote
exceptions) occurs, repeat #2 and #3, starting from the next namenode URL
in the list (see the sketch below).
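
A minimal sketch of that loop against WebHDFS (the host names are placeholders,
the requests library is an assumption, and the standby test is simplified; adapt
the error handling to your own client):

    import requests

    NAMENODES = ['namenode1:50070', 'namenode2:50070']  # hypothetical hosts

    class ActiveNameNodeClient(object):
        def __init__(self, namenodes):
            self._namenodes = list(namenodes)
            self._active = self._namenodes[0]  # optimistic first guess

        def request(self, path, op, user):
            # Try the remembered active first, then the others (steps 2 and 3).
            candidates = [self._active] + [n for n in self._namenodes
                                           if n != self._active]
            last_error = None
            for host in candidates:
                url = ('http://%s/webhdfs/v1%s?user.name=%s&op=%s'
                       % (host, path, user, op))
                try:
                    response = requests.get(url, timeout=10)
                    if 'StandbyException' not in response.text:
                        self._active = host  # remember the active (step 3)
                        return response
                    last_error = IOError(host + ' is in standby')
                except requests.ConnectionError as exc:  # network error (step 4)
                    last_error = exc
            raise last_error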


--Brahma Reddy Battula

From: Cecile, Adam [mailto:adam.cec...@hitec.lu]
Sent: 04 May 2016 16:26
To: Sandeep Nemuri
Cc: user@hadoop.apache.org
Subject: RE: NameNode HA from a client perspective


Hello,



I'm not sure I understand your answer; let me add a little piece of code:



def _build_hdfs_url(self, hdfs_path, hdfs_operation, opt_query_param_tuples=[]):
    """
    :type hdfs_path: str
    :type hdfs_operation: str
    """
    if not hdfs_path.startswith("/"):
        raise WebHdfsException("The web hdfs path must start with / but found "
                               + hdfs_path, None, None)

    url = ('http://' + self.host + ':' + str(self.port) + '/webhdfs/v1'
           + hdfs_path + '?user.name=' + self.user + '&op=' + hdfs_operation)
    len_param = len(opt_query_param_tuples)
    for index in range(len_param):
        key_value = opt_query_param_tuples[index]
        url += "&{}={}".format(key_value[0], str(key_value[1]))
    return url



Here is a plain Python function (standard distribution only) extracted from an
app: the problem here is "self.host"; it has to be the IP address or DNS name of
the NameNode, but I'd like to turn this into something dynamic that resolves to
the current active master.



Regards, Adam.




From: Sandeep Nemuri <nhsande...@gmail.com>
Sent: Wednesday, 4 May 2016 09:15
To: Cecile, Adam
Cc: user@hadoop.apache.org
Subject: Re: NameNode HA from a client perspective

I think you can simply use the nameservice (dfs.nameservices) defined in
hdfs-site.xml (see the configuration sketch below).
The HDFS client should be able to resolve the current active namenode and get
the necessary information.
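
For the native HDFS client this is driven by the HA client settings in
hdfs-site.xml, roughly as below (the nameservice and host names are placeholders;
note this covers the RPC client, while a raw WebHDFS client over HTTP still has
to pick a host itself):

    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>namenode1:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>namenode2:8020</value>
    </property>
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

Clients then address the cluster as hdfs://mycluster/ instead of a single host.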

Thanks,
Sandeep Nemuri

On Wed, May 4, 2016 at 12:04 PM, Cecile, Adam
<adam.cec...@hitec.lu> wrote:

Hello All,


I'd like some advice on how my HDFS clients should handle
the NameNode high-availability feature.
I have a complete setup running with ZKFC, and I can see one active and one
standby NameNode. When I kill the active one, the standby becomes active, and
when the original one gets back online it turns into a standby node. Perfect.

However, I'm not sure how my client apps should handle this. A couple of ideas:
* Handle the error HTTP code from the standby node and switch to the other one
  (see the sketch below)
* Integrate a ZooKeeper client to query for the current active node
* Hack something like a shared IP linked to the active node

Then I'll have to handle a switch that may occur during the execution of a
client app: should I just crash and rely on the cluster to restart the job?
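
For the first idea, a sketch of what detecting the standby over WebHDFS might
look like (it assumes the standby answers with an HTTP error whose JSON body is
a RemoteException naming StandbyException, as WebHDFS does for errors; verify
the exact status code and payload against your cluster):

    import json
    import urllib2  # Python 2, matching the code elsewhere in this thread

    def is_standby_response(body):
        # WebHDFS errors arrive as {"RemoteException": {"exception": ..., ...}}
        try:
            remote = json.loads(body).get('RemoteException', {})
            return remote.get('exception') == 'StandbyException'
        except (ValueError, AttributeError):
            return False

    def fetch(host, path, user):
        url = ('http://%s/webhdfs/v1%s?user.name=%s&op=GETFILESTATUS'
               % (host, path, user))
        try:
            return urllib2.urlopen(url).read()
        except urllib2.HTTPError as err:
            if is_standby_response(err.read()):
                raise IOError('%s is the standby NameNode' % host)
            raise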


Thanks in advance,

Best regards from Luxembourg.



--
  Regards
  Sandeep Nemuri


Re: NameNode HA from a client perspective

2016-05-04 Thread Sandeep Nemuri
This could help you: https://pypi.python.org/pypi/PyHDFS/0.1.0

Thanks,
Sandeep

On Wed, May 4, 2016 at 2:40 PM, Brahma Reddy Battula <
brahmareddy.batt...@huawei.com> wrote:

> 1. Have a list of namenodes, built from the configuration.
> 2. Execute the op on each namenode until it succeeds.
> 3. Treat the successful namenode URL as the active namenode, and use it
> for subsequent operations.
> 4. Whenever a StandByException or some network exception (other than
> remote exceptions) occurs, repeat #2 and #3, starting from the next
> namenode URL in the list.
>
>
>
>
>
> --Brahma Reddy Battula
>
>
>
> From: Cecile, Adam [mailto:adam.cec...@hitec.lu]
> Sent: 04 May 2016 16:26
> To: Sandeep Nemuri
> Cc: user@hadoop.apache.org
> Subject: RE: NameNode HA from a client perspective
>
>
>
> Hello,
>
>
>
> I'm not sure I understand your answer; let me add a little piece of code:
>
>
>
> def _build_hdfs_url(self, hdfs_path, hdfs_operation, opt_query_param_tuples=[]):
>     """
>     :type hdfs_path: str
>     :type hdfs_operation: str
>     """
>     if not hdfs_path.startswith("/"):
>         raise WebHdfsException("The web hdfs path must start with / but found "
>                                + hdfs_path, None, None)
>
>     url = ('http://' + self.host + ':' + str(self.port) + '/webhdfs/v1'
>            + hdfs_path + '?user.name=' + self.user + '&op=' + hdfs_operation)
>     len_param = len(opt_query_param_tuples)
>     for index in range(len_param):
>         key_value = opt_query_param_tuples[index]
>         url += "&{}={}".format(key_value[0], str(key_value[1]))
>     return url
>
>
>
> Here is a plain Python function (standard distribution only) extracted from
> an app: the problem here is "self.host"; it has to be the IP address or DNS
> name of the NameNode, but I'd like to turn this into something dynamic that
> resolves to the current active master.
>
>
>
> Regards, Adam.
>
>
> --
>
> From: Sandeep Nemuri <nhsande...@gmail.com>
> Sent: Wednesday, 4 May 2016 09:15
> To: Cecile, Adam
> Cc: user@hadoop.apache.org
> Subject: Re: NameNode HA from a client perspective
>
>
>
> I think you can simply use the nameservice (dfs.nameservices) defined in
> hdfs-site.xml.
>
> The HDFS client should be able to resolve the current active namenode and
> get the necessary information.
>
>
>
> Thanks,
>
> Sandeep Nemuri
>
>
>
>
> On Wed, May 4, 2016 at 12:04 PM, Cecile, Adam <adam.cec...@hitec.lu>
> wrote:
>
> Hello All,
>
>
> I'd like some advice on how my HDFS clients should
> handle the NameNode high-availability feature.
> I have a complete setup running with ZKFC, and I can see one active and one
> standby NameNode. When I kill the active one, the standby becomes active and
> when the original one gets back online it turns into a standby node. Perfect.
>
> However, I'm not sure how my client apps should handle this. A couple of
> ideas:
> * Handle the error HTTP code from the standby node and switch to the other one
> * Integrate a ZooKeeper client to query for the current active node
> * Hack something like a shared IP linked to the active node
>
> Then I'll have to handle a switch that may occur during the execution of
> a client app: should I just crash and rely on the cluster to restart the
> job?
>
>
> Thanks in advance,
>
> Best regards from Luxembourg.
>
>
>
>
>
> --
>
>   Regards
>
>   Sandeep Nemuri
>



-- 
  Regards
  Sandeep Nemuri


RE: NameNode HA from a client perspective

2016-05-04 Thread Brahma Reddy Battula
1. Have a list of namenodes, built from the configuration.
2. Execute the op on each namenode until it succeeds.
3. Treat the successful namenode URL as the active namenode, and use it for
subsequent operations.
4. Whenever a StandByException or some network exception (other than remote
exceptions) occurs, repeat #2 and #3, starting from the next namenode URL
in the list.


--Brahma Reddy Battula

From: Cecile, Adam [mailto:adam.cec...@hitec.lu]
Sent: 04 May 2016 16:26
To: Sandeep Nemuri
Cc: user@hadoop.apache.org
Subject: RE: NameNode HA from a client perspective


Hello,



I'm not sure I understand your answer; let me add a little piece of code:



def _build_hdfs_url(self, hdfs_path, hdfs_operation, opt_query_param_tuples=[]):
    """
    :type hdfs_path: str
    :type hdfs_operation: str
    """
    if not hdfs_path.startswith("/"):
        raise WebHdfsException("The web hdfs path must start with / but found "
                               + hdfs_path, None, None)

    url = ('http://' + self.host + ':' + str(self.port) + '/webhdfs/v1'
           + hdfs_path + '?user.name=' + self.user + '&op=' + hdfs_operation)
    len_param = len(opt_query_param_tuples)
    for index in range(len_param):
        key_value = opt_query_param_tuples[index]
        url += "&{}={}".format(key_value[0], str(key_value[1]))
    return url



Here is a plain Python function (standard distribution only) extracted from an
app: the problem here is "self.host"; it has to be the IP address or DNS name of
the NameNode, but I'd like to turn this into something dynamic that resolves to
the current active master.



Regards, Adam.




From: Sandeep Nemuri <nhsande...@gmail.com>
Sent: Wednesday, 4 May 2016 09:15
To: Cecile, Adam
Cc: user@hadoop.apache.org
Subject: Re: NameNode HA from a client perspective

I think you can simply use the nameservice (dfs.nameservices) defined in
hdfs-site.xml.
The HDFS client should be able to resolve the current active namenode and get
the necessary information.

Thanks,
Sandeep Nemuri

On Wed, May 4, 2016 at 12:04 PM, Cecile, Adam
<adam.cec...@hitec.lu> wrote:

Hello All,


I'd like some advice on how my HDFS clients should handle
the NameNode high-availability feature.
I have a complete setup running with ZKFC, and I can see one active and one
standby NameNode. When I kill the active one, the standby becomes active, and
when the original one gets back online it turns into a standby node. Perfect.

However, I'm not sure how my client apps should handle this. A couple of ideas:
* Handle the error HTTP code from the standby node and switch to the other one
* Integrate a ZooKeeper client to query for the current active node
* Hack something like a shared IP linked to the active node

Then I'll have to handle a switch that may occur during the execution of a
client app: should I just crash and rely on the cluster to restart the job?


Thanks in advance,

Best regards from Luxembourg.



--
  Regards
  Sandeep Nemuri


RE: NameNode HA from a client perspective

2016-05-04 Thread Cecile, Adam
Hello,


I'm not sure I understand your answer; let me add a little piece of code:


def _build_hdfs_url(self, hdfs_path, hdfs_operation, opt_query_param_tuples=[]):
    """
    :type hdfs_path: str
    :type hdfs_operation: str
    """
    if not hdfs_path.startswith("/"):
        raise WebHdfsException("The web hdfs path must start with / but found "
                               + hdfs_path, None, None)

    url = ('http://' + self.host + ':' + str(self.port) + '/webhdfs/v1'
           + hdfs_path + '?user.name=' + self.user + '&op=' + hdfs_operation)
    len_param = len(opt_query_param_tuples)
    for index in range(len_param):
        key_value = opt_query_param_tuples[index]
        url += "&{}={}".format(key_value[0], str(key_value[1]))
    return url


Here is a plain Python function (standard distribution only) extracted from an
app: the problem here is "self.host"; it has to be the IP address or DNS name of
the NameNode, but I'd like to turn this into something dynamic that resolves to
the current active master.
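
For example (host, port, and user here are hypothetical values), with
self.host = 'namenode1', self.port = 50070 and self.user = 'hdfs', a call like

    client._build_hdfs_url('/tmp/test.txt', 'OPEN', [('offset', 0)])

builds:

    http://namenode1:50070/webhdfs/v1/tmp/test.txt?user.name=hdfs&op=OPEN&offset=0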


Regards, Adam.



From: Sandeep Nemuri <nhsande...@gmail.com>
Sent: Wednesday, 4 May 2016 09:15
To: Cecile, Adam
Cc: user@hadoop.apache.org
Subject: Re: NameNode HA from a client perspective

I think you can simply use the nameservice (dfs.nameservices) defined in
hdfs-site.xml.
The HDFS client should be able to resolve the current active namenode and get
the necessary information.

Thanks,
Sandeep Nemuri

On Wed, May 4, 2016 at 12:04 PM, Cecile, Adam
<adam.cec...@hitec.lu> wrote:

Hello All,


I'd like some advice on how my HDFS clients should handle
the NameNode high-availability feature.
I have a complete setup running with ZKFC, and I can see one active and one
standby NameNode. When I kill the active one, the standby becomes active, and
when the original one gets back online it turns into a standby node. Perfect.

However, I'm not sure how my client apps should handle this. A couple of ideas:
* Handle the error HTTP code from the standby node and switch to the other one
* Integrate a ZooKeeper client to query for the current active node
* Hack something like a shared IP linked to the active node

Then I'll have to handle a switch that may occur during the execution of a
client app: should I just crash and rely on the cluster to restart the job?


Thanks in advance,

Best regards from Luxembourg.



--
  Regards
  Sandeep Nemuri


Re: NameNode HA from a client perspective

2016-05-04 Thread Sandeep Nemuri
I think you can simply use the nameservice (dfs.nameservices) defined in
hdfs-site.xml.
The HDFS client should be able to resolve the current active namenode and
get the necessary information.

Thanks,
Sandeep Nemuri

On Wed, May 4, 2016 at 12:04 PM, Cecile, Adam <adam.cec...@hitec.lu> wrote:

> Hello All,
>
>
> I'd like some advice on how my HDFS clients should
> handle the NameNode high-availability feature.
> I have a complete setup running with ZKFC, and I can see one active and one
> standby NameNode. When I kill the active one, the standby becomes active and
> when the original one gets back online it turns into a standby node. Perfect.
>
> However, I'm not sure how my client apps should handle this. A couple of
> ideas:
> * Handle the error HTTP code from the standby node and switch to the other one
> * Integrate a ZooKeeper client to query for the current active node
> * Hack something like a shared IP linked to the active node
>
> Then I'll have to handle a switch that may occur during the execution of
> a client app: should I just crash and rely on the cluster to restart the
> job?
>
>
> Thanks in advance,
>
> Best regards from Luxembourg.
>



-- 
  Regards
  Sandeep Nemuri