Question about native shared libraries distributed in release 2.6.4

2016-06-30 Thread Michael Wall
Hello

I have a question about the native libraries distributed with the 2.6.4
release.  The documentation from
http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-common/NativeLibraries.html
says

"The pre-built 32-bit i386-Linux native hadoop library is available as part
of the hadoop distribution and is located in the lib/native directory. You
can download the hadoop distribution from Hadoop Common Releases."

However, when I run the 'file' command on libhdfs and libhadoop included in
the binary distribution, they appear to be 64-bit:

file libhadoop.so.1.0.0
libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV),
dynamically linked, BuildID[sha1]=b3b5e812c2a91fa4b28aa33eb76dc6889d3b91e9,
not stripped

file libhdfs.so.0.0.0
libhdfs.so.0.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV),
dynamically linked, BuildID[sha1]=c08b0e042e563903a30c02d3605115421948f790,
not stripped
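
For completeness, the distribution can also report which native libraries it
finds and loads; something like the following should show it (assuming
HADOOP_HOME points at the untarred 2.6.4 tarball; output not pasted here):

hadoop checknative -a
file $HADOOP_HOME/lib/native/*.so*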

Is this a documentation issue or am I missing something?

Thanks

Mike


Re: datanode is unable to connect to namenode

2016-06-30 Thread Aneela Saleem
Thanks Vinayakumar and Gurmukh,

I have got it working successfully through the auth_to_local configs, but I
faced quite a few issues along the way.

Actually I have a two-node cluster, one node acting as both namenode and
datanode, and the second as a datanode only. I faced some keytab-related
authentication issues, for example:

I added nn/hadoop-master to both nn.keytab and dn.keytab and did the same
with dn/hadoop-slave (following your github dn.keytab file), but when I
started the cluster I got the following error:

*Login failure for nn/hadoop-master@platalyticsrealm from keytab
/etc/hadoop/conf/hdfs.keytab: javax.security.auth.login.LoginException:
Checksum failed*

I tried to verify authentication of nn/hadoop-master against the keytab
through kinit, but could not, because of an error like *could not verify
credentials*.

I then removed nn/hadoop-master from dn.keytab and it authenticated
successfully. I also removed all the hadoop-master principals from dn.keytab
and the hadoop-slave principals from nn.keytab. So does this mean that a
principal can't belong to more than one keytab? Please also make some time
to review the attached hdfs-site.xml for both namenode and datanode, and the
keytab files, and point out if anything is wrong.
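
For reference, the keytabs were extracted with commands roughly like these
(paths here are just examples). I now suspect that, because I did not pass
-norandkey, exporting nn/hadoop-master a second time bumped its key version
on the KDC, which would make the copy in the first keytab stale and could
explain the checksum failure:

kadmin.local -q "xst -k /etc/hadoop/conf/nn.keytab nn/hadoop-master@platalyticsrealm"
kadmin.local -q "xst -k /etc/hadoop/conf/dn.keytab nn/hadoop-master@platalyticsrealm dn/hadoop-slave@platalyticsrealm"
klist -kt /etc/hadoop/conf/nn.keytab   # if the kvno here no longer matches the KDC, kinit fails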

Thanks


On Thu, Jun 30, 2016 at 1:21 PM, Vinayakumar B wrote:

> Please note, there are two different configs.
>
>
>
> “dfs.datanode.kerberos.principal” and “dfs.namenode.kerberos.principal”
>
>
>
> Following configs can be set, as required.
>
>
>
> dfs.datanode.kerberos.principal --> dn/_HOST
>
> dfs.namenode.kerberos.principal --> nn/_HOST
>
>
>
> “nn/_HOST” will be used only on the namenode side.
>
>
>
> -Vinay
>
> *From:* Aneela Saleem [mailto:ane...@platalytics.com]
> *Sent:* 30 June 2016 13:24
> *To:* Vinayakumar B 
> *Cc:* user@hadoop.apache.org
>
> *Subject:* Re: datanode is unable to connect to namenode
>
>
>
> Thanks Vinayakumar
>
>
>
> Yes, you got it right: I was using different principal names, i.e.,
> *nn/_HOST* for the namenode and *dn/_HOST* for the datanode. Setting the
> same principal name for both, i.e., hdfs/_HOST@platalyticsrealm, solved the
> issue. Now the datanode can connect to the namenode successfully.
>
>
>
> So my question is: is it mandatory to have the same principal name on all
> hosts, i.e., hdfs/_HOST@platalyticsrealm? I found in many tutorials that
> the convention is to have different principals for the different services,
> like:
>
> dn/_HOST for the datanode
>
> nn/_HOST for the namenode
>
> sn/_HOST for the secondarynamenode, etc.
>
>
>
> Secondly, for MapReduce and YARN, should mapred-site.xml and
> yarn-site.xml be the same on all cluster nodes, just like hdfs-site.xml?
>
>
>
> Thanks
>
>
>
> On Thu, Jun 30, 2016 at 10:51 AM, Vinayakumar B wrote:
>
> Hi Aneela,
>
>
>
> 1. Looks like you have attached the hdfs-site.xml from the 'hadoop-master'
> node. For this node the datanode connection is successful, as shown in the
> logs below.
>
>
>
>  2016-06-29 10:01:35,700 INFO
> SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for
> nn/hadoop-master@platalyticsrealm (auth:KERBEROS)
>
> 2016-06-29 10:01:35,744 INFO
> SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager:
> Authorization successful for nn/hadoop-master@platalyticsrealm
> (auth:KERBEROS) for protocol=interface
> org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol
>
>
>
>  2016-06-29 10:01:36,845 INFO
> org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/
> 192.168.23.206:1004
>
>
>
>
>
> 2. For the other node, 'hadoop-slave', Kerberos authentication is
> successful, but the ServiceAuthorizationManager check failed.
>
>
>
> 2016-06-29 10:01:37,474 INFO SecurityLogger.org.apache.hadoop.ipc.Server:
> Auth successful for dn/hadoop-slave@platalyticsrealm (auth:KERBEROS)
>
> 2016-06-29 10:01:37,512 WARN
> SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager:
> Authorization failed for dn/hadoop-slave@platalyticsrealm (auth:KERBEROS)
> for protocol=interface
> org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol, expected client
> Kerberos principal is nn/hadoop-slave@platalyticsrealm
>
> 2016-06-29 10:01:37,514 INFO org.apache.hadoop.ipc.Server: Connection from
> 192.168.23.207:32807 for protocol
> org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol is unauthorized for
> user dn/hadoop-slave@platalyticsrealm (auth:KERBEROS)
>
>
>
> The reason is most likely that the "dfs.datanode.kerberos.principal"
> configuration differs between the two nodes. I can see that in hadoop-master's
> hdfs-site.xml this configuration is set to 'nn/_HOST@platalyticsrealm', but it
> might have been set to 'dn/_HOST@platalyticsrealm' in the hadoop-slave node's
> configuration.
>
>
>
> Please change this configuration in all nodes to 'dn/_HOST@platalyticsrealm'
> and restart all NNs and DNs, and check again.
>
>
>
> If this does not help, then please share the hdfs-site.xml of hadoop-slave
> node too.
>
>
>
> 

RE: datanode is unable to connect to namenode

2016-06-30 Thread Vinayakumar B
Please note, there are two different configs.

“dfs.datanode.kerberos.principal” and “dfs.namenode.kerberos.principal”

Following configs can be set, as required.

dfs.datanode.kerberos.principal --> dn/_HOST
dfs.namenode.kerberos.principal  --> nn/_HOST

“nn/_HOST” will be used only on the namenode side.
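
A quick way to double-check what each node actually resolves is to run, on
that node (so that its own hdfs-site.xml is picked up), something like:

hdfs getconf -confKey dfs.datanode.kerberos.principal
hdfs getconf -confKey dfs.namenode.kerberos.principal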

-Vinay
From: Aneela Saleem [mailto:ane...@platalytics.com]
Sent: 30 June 2016 13:24
To: Vinayakumar B 
Cc: user@hadoop.apache.org
Subject: Re: datanode is unable to connect to namenode

Thanks Vinayakumar

Yes, you got it right: I was using different principal names, i.e., nn/_HOST
for the namenode and dn/_HOST for the datanode. Setting the same principal
name for both, i.e., hdfs/_HOST@platalyticsrealm, solved the issue. Now the
datanode can connect to the namenode successfully.

So my question is: is it mandatory to have the same principal name on all
hosts, i.e., hdfs/_HOST@platalyticsrealm? I found in many tutorials that the
convention is to have different principals for the different services, like:
dn/_HOST for the datanode
nn/_HOST for the namenode
sn/_HOST for the secondarynamenode, etc.

Secondly, for MapReduce and YARN, should mapred-site.xml and yarn-site.xml
be the same on all cluster nodes, just like hdfs-site.xml?

Thanks

On Thu, Jun 30, 2016 at 10:51 AM, Vinayakumar B wrote:
Hi Aneela,

1. Looks like you have attached the hdfs-site.xml from the 'hadoop-master' node.
For this node the datanode connection is successful, as shown in the logs below.

 2016-06-29 10:01:35,700 INFO 
SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for 
nn/hadoop-master@platalyticsrealm (auth:KERBEROS)
2016-06-29 10:01:35,744 INFO 
SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager:
 Authorization successful for nn/hadoop-master@platalyticsrealm (auth:KERBEROS) 
for protocol=interface org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol

 2016-06-29 10:01:36,845 INFO org.apache.hadoop.net.NetworkTopology: 
Adding a new node: /default-rack/192.168.23.206:1004


2. For the other node, 'hadoop-slave', Kerberos authentication is successful,
but the ServiceAuthorizationManager check failed.

2016-06-29 10:01:37,474 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth 
successful for dn/hadoop-slave@platalyticsrealm (auth:KERBEROS)
2016-06-29 10:01:37,512 WARN 
SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager:
 Authorization failed for dn/hadoop-slave@platalyticsrealm (auth:KERBEROS) for 
protocol=interface org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol, 
expected client Kerberos principal is nn/hadoop-slave@platalyticsrealm
2016-06-29 10:01:37,514 INFO org.apache.hadoop.ipc.Server: Connection from 
192.168.23.207:32807 for protocol 
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol is unauthorized for 
user dn/hadoop-slave@platalyticsrealm (auth:KERBEROS)

The reason is most likely that the "dfs.datanode.kerberos.principal" configuration
differs between the two nodes. I can see that in hadoop-master's hdfs-site.xml
this configuration is set to 'nn/_HOST@platalyticsrealm', but it might have been
set to 'dn/_HOST@platalyticsrealm' in the hadoop-slave node's configuration.

Please change this configuration in all nodes to 'dn/_HOST@platalyticsrealm' 
and restart all NNs and DNs, and check again.
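
For example, something along these lines on each node should show the value
the daemons will pick up (assuming the usual /etc/hadoop/conf layout; adjust
the path if yours differs):

grep -A1 'dfs.datanode.kerberos.principal' /etc/hadoop/conf/hdfs-site.xml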

If this does not help, then please share the hdfs-site.xml of hadoop-slave node 
too.

-Vinay

From: Aneela Saleem [mailto:ane...@platalytics.com]
Sent: 29 June 2016 21:35
To: user@hadoop.apache.org
Subject: Fwd: datanode is unable to connect to namenode



Sent from my iPhone

Begin forwarded message:
From: Aneela Saleem
Date: 29 June 2016 at 10:16:36 GMT+5
To: "sreebalineni ."
Subject: Re: datanode is unable to connect to namenode
Attached are the log files for the datanode and namenode. I have also attached
hdfs-site.xml for the namenode; please check whether there are any issues in
the configuration file.

I have following two Kerberos Principals:

nn/hadoop-master
dn/hadoop-slave

I have copied kdc.conf and krb5.conf to both nodes. I also copied the keytab
file to the datanode, and I am starting the services with the principal
nn/hadoop-master.
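
To verify the principals from the keytab I run something like this (substitute
the actual keytab path):

kinit -kt /path/to/hdfs.keytab nn/hadoop-master
klist   # should show a ticket for nn/hadoop-master if the keytab and KDC agree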

On Wed, Jun 29, 2016 at 9:35 AM, sreebalineni . wrote:
Probably sharing both Name node and datanode logs may help.

On Wed, Jun 29, 2016 at 10:02 AM, Aneela Saleem wrote:
Following is the result of telnet

Trying 192.168.23.206...
Connected to hadoop-master.
Escape character is '^]'.

On Wed, Jun 29, 2016 at 3:57 AM, Aneela Saleem wrote:
Thanks Sreebalineni for the 

Re: datanode is unable to connect to namenode

2016-06-30 Thread Gurmukh Singh
What you are doing is correct, but the datanodes, in addition to the
dn/_HOST principal, also need the nn/_HOST principal.


See my github for configs from a working cluster:
https://github.com/netxillon/hadoop/tree/master/kerberos
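
A quick sanity check on each node is to list what is actually inside the
keytab the daemons use, for example (adjust the keytab path to yours):

klist -kt /etc/hadoop/conf/hdfs.keytab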




On 30/06/16 5:54 PM, Aneela Saleem wrote:

Thanks Vinayakumar

Yes, you got it right: I was using different principal names, i.e.,
*nn/_HOST* for the namenode and *dn/_HOST* for the datanode. Setting the
same principal name for both, i.e., hdfs/_HOST@platalyticsrealm, solved the
issue. Now the datanode can connect to the namenode successfully.

So my question is: is it mandatory to have the same principal name on all
hosts, i.e., hdfs/_HOST@platalyticsrealm? I found in many tutorials that the
convention is to have different principals for the different services, like:

dn/_HOST for the datanode
nn/_HOST for the namenode
sn/_HOST for the secondarynamenode, etc.

Secondly, for MapReduce and YARN, should mapred-site.xml and
yarn-site.xml be the same on all cluster nodes, just like hdfs-site.xml?


Thanks

On Thu, Jun 30, 2016 at 10:51 AM, Vinayakumar B wrote:


Hi Aneela,

1. Looks like you have attached the hdfs-site.xml from the
'hadoop-master' node. For this node the datanode connection is
successful, as shown in the logs below.

 2016-06-29 10:01:35,700 INFO
SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for
nn/hadoop-master@platalyticsrealm (auth:KERBEROS)

2016-06-29 10:01:35,744 INFO

SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager:
Authorization successful for nn/hadoop-master@platalyticsrealm
(auth:KERBEROS) for protocol=interface
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol

 2016-06-29 10:01:36,845 INFO
org.apache.hadoop.net.NetworkTopology: Adding a new node:
/default-rack/192.168.23.206:1004 

2. For the other node, 'hadoop-slave', Kerberos authentication is
successful, but the ServiceAuthorizationManager check failed.

2016-06-29 10:01:37,474 INFO
SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for
dn/hadoop-slave@platalyticsrealm (auth:KERBEROS)

2016-06-29 10:01:37,512 WARN

SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager:
Authorization failed for dn/hadoop-slave@platalyticsrealm
(auth:KERBEROS) for protocol=interface
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol, expected
client Kerberos principal is nn/hadoop-slave@platalyticsrealm

2016-06-29 10:01:37,514 INFO org.apache.hadoop.ipc.Server:
Connection from 192.168.23.207:32807 
for protocol
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol is
unauthorized for user dn/hadoop-slave@platalyticsrealm (auth:KERBEROS)

The reason is most likely that the "dfs.datanode.kerberos.principal"
configuration differs between the two nodes. I can see that this
configuration in hadoop-master's hdfs-site.xml is set to
'nn/_HOST@platalyticsrealm', but it might have been set to
'dn/_HOST@platalyticsrealm' in the hadoop-slave node's configuration.

Please change this configuration in all nodes to
'dn/_HOST@platalyticsrealm' and restart all NNs and DNs, and check
again.

If this does not help, then please share the hdfs-site.xml of
hadoop-slave node too.

-Vinay

*From:*Aneela Saleem [mailto:ane...@platalytics.com]
*Sent:* 29 June 2016 21:35
*To:* user@hadoop.apache.org 
*Subject:* Fwd: datanode is unable to connect to namenode



Sent from my iPhone


Begin forwarded message:

*From:*Aneela Saleem
*Date:* 29 June 2016 at 10:16:36 GMT+5
*To:* "sreebalineni ."
*Subject:* *Re: datanode is unable to connect to namenode*

Attached are the log files for the datanode and namenode. I have
also attached hdfs-site.xml for the namenode; please check whether
there are any issues in the configuration file.

I have following two Kerberos Principals:

nn/hadoop-master

dn/hadoop-slave

I have copied kdc.conf and krb5.conf to both nodes. I also
copied the keytab file to the datanode, and I am starting the
services with the principal nn/hadoop-master.

On Wed, Jun 29, 2016 at 9:35 AM, sreebalineni . wrote:

Probably sharing both Name node and datanode logs may help.

On Wed, Jun 29, 2016 at 10:02 AM, Aneela Saleem wrote:

Following is the result of telnet

Trying 192.168.23.206...

 

Re: cp command in webhdfs (and Filesystem Java Object)

2016-06-30 Thread Jérôme BAROTIN
Thanks for your response Chris. So I understand that there is no standard
implementation of cp in the REST API?

You mention that cp is a combination of "open, create and rename"; all of
these methods are available through webhdfs. Do you think we can reproduce
it remotely by executing several REST calls? (I mean without transferring
the data through the client side.)
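
Something like this is the composition I have in mind, although I realize the
bytes would still pass through the machine running curl (untested sketch; host
names, paths, the user name and the default 2.x WebHDFS port 50070 are just
placeholders):

# 1) read the source file; the NameNode redirects to a DataNode and -L follows it
curl -L -o /tmp/copy.tmp "http://namenode:50070/webhdfs/v1/src/file.txt?op=OPEN&user.name=hdfs"
# 2) ask the NameNode where to write; the answer is a 307 with a Location header
curl -i -X PUT "http://namenode:50070/webhdfs/v1/dst/file.txt?op=CREATE&user.name=hdfs&overwrite=false"
# 3) upload the bytes to the Location URL returned by step 2
curl -i -X PUT -T /tmp/copy.tmp "<Location header from step 2>"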

Otherwise, if I want to build my own HDFS cp REST API (in Java), do you
think I should use the copy method of the FileUtil class (
https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileUtil.html)?

Best regards,

Jérôme

2016-06-29 17:36 GMT+02:00 Chris Nauroth :

> Hello Jérôme,
>
> WebHDFS provides an HTTP binding to the FileSystem API, which defines the
> primitive operations offered by the file system.  The FileSystem Shell
> builds on top of the FileSystem API to provide higher-level workflows,
> implemented using the FileSystem primitives.  In the case of "cp", copy is
> not a primitive operation defined by the FileSystem API.  Instead, the
> FileSystem Shell implements it by composing a few different FileSystem API
> primitives: open, create and rename.
>
> Due to this separation, you won't find a "cp" operation directly in the
> WebHDFS REST API (or HTTPFS).  However, it is possible for the FileSystem
> shell to reference paths as URIs using the "webhdfs" scheme.  For example:
>
> > hadoop fs -cp webhdfs://localhost:9870/hello1
> webhdfs://localhost:9870/hello2
>
> > hadoop fs -cat webhdfs://localhost:9870/hello2
> hello
>
> --Chris Nauroth
>
> From: Jérôme BAROTIN 
> Date: Wednesday, June 29, 2016 at 12:44 AM
> To: Rohan Rajeevan 
> Cc: "user@hadoop.apache.org" 
> Subject: Re: cp command in webhdfs (and Filesystem Java Object)
>
> I don't think that is the same:
> - CREATE is for uploading a local file; in my case, I just want to copy one
> hdfs path to another on the same cluster
> - DistCp is for copying files between two different clusters.
>
> I'm using the HTTPFS/webhdfs REST API to access my cluster, and I need to
> execute a "cp" command. How can I do that?
>
> Do I need to develop this service myself?
>
> Jérôme
>
> 2016-06-29 8:17 GMT+02:00 Rohan Rajeevan :
>
>>
>> Maybe look at this?
>> https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#CREATE
>> If you are interested in intra-cluster copy, maybe look at DistCp?
>>
>> On Tue, Jun 28, 2016 at 9:36 AM, Jérôme BAROTIN wrote:
>>
>>> Hello,
>>>
>>> I'm writing this email because I spent an hour looking for a cp
>>> command in the webhdfs API (in fact, I'm using HTTPFS, but I think it's the
>>> same).
>>>
>>> This command is implemented in the "hdfs dfs" command line client (and
>>> I'm using this command), but I can't find it in the webhdfs REST API. I
>>> thought that webhdfs is an implementation of the FileSystem class (
>>> https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/fs/FileSystem.html).
>>> I checked the Java API and I haven't found any cp method. The only Java
>>> cp method is on the FileUtil class (
>>> https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileUtil.html)
>>> and I'm not sure that it works identically to the "hdfs dfs -cp" command.
>>>
>>> I also checked the Hadoop JIRA and found nothing:
>>> https://issues.apache.org/jira/browse/HADOOP-9417?jql=project%20%3D%20HADOOP%20AND%20(text%20~%20%22webhdfs%20copy%22%20OR%20text%20~%20%22webhdfs%20cp%22)
>>>
>>> Is there a way to execute a cp command through a REST API?
>>>
>>> All my best,
>>>
>>>
>>> Jérôme
>>>
>>
>>
>