Re: datanode is unable to connect to namenode

2016-07-06 Thread Gurmukh Singh

Have you configured JCE (the unlimited-strength Java Cryptography Extension policy files)?


On 01/07/16 6:36 AM, Aneela Saleem wrote:

Thanks Vinaykumar and Gurmukh,

I have got it working successfully through the auth_to_local configs, but 
I faced quite a few issues along the way.
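For anyone searching later, the mapping I mean lives in core-site.xml; a sketch only (the rules and realm below are illustrative — adapt them to your own principals):

```xml
<!-- core-site.xml: map Kerberos principals to local OS users.
     Illustrative rules for the platalyticsrealm realm; adjust as needed. -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1@$0](nn/.*@platalyticsrealm)s/.*/hdfs/
    RULE:[2:$1@$0](dn/.*@platalyticsrealm)s/.*/hdfs/
    DEFAULT
  </value>
</property>
```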


Actually I have a two-node cluster: one node acts as both namenode and 
datanode, and the second is a datanode only. I faced some keytab 
authentication issues, for example:


I added nn/hadoop-master to both nn.keytab and dn.keytab, and did the 
same with dn/hadoop-slave (following your GitHub dn.keytab file). But 
when I started the cluster I got the following error:


*Login failure for nn/hadoop-master@platalyticsrealm from keytab 
/etc/hadoop/conf/hdfs.keytab: 
javax.security.auth.login.LoginException: Checksum failed*

I tried to verify nn/hadoop-master against the keytab with kinit, but 
couldn't, because of an error like *could not verify credentials*.
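For the record, this is roughly how I checked the keytab entries (paths and principals as on this cluster; commands shown for illustration). A kvno mismatch between the keytab and the KDC — for example after re-running ktadd, which bumps the key version — is a common cause of the *Checksum failed* error:

```shell
# List the entries (and key version numbers) stored in the keytab
klist -kte /etc/hadoop/conf/hdfs.keytab

# Try to obtain a TGT using only the keytab entry
kinit -kt /etc/hadoop/conf/hdfs.keytab nn/hadoop-master@platalyticsrealm

# Ask the KDC for the current key version of the principal;
# it must match the kvno shown by klist above
kvno nn/hadoop-master@platalyticsrealm
```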


I then removed nn/hadoop-master from dn.keytab and it authenticated 
successfully. I went on to remove all the hadoop-master principals from 
dn.keytab and the hadoop-slave principals from nn.keytab. So does this 
mean that a principal can't belong to more than one keytab? Also, please 
find some time to review the attached hdfs-site.xml (for both namenode 
and datanode) and the keytab files, and point out anything wrong.


Thanks


On Thu, Jun 30, 2016 at 1:21 PM, Vinayakumar B wrote:


Please note, there are two different configs.

“dfs.datanode.kerberos.principal” and
“dfs.namenode.kerberos.principal”

Following configs can be set, as required.

dfs.datanode.kerberos.principal → dn/_HOST

dfs.namenode.kerberos.principal → nn/_HOST

“nn/_HOST” will be used only on the namenode side.
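In hdfs-site.xml that looks like this (a sketch; _HOST is expanded at runtime to the fully-qualified hostname of the node reading the config):

```xml
<!-- hdfs-site.xml: per-service principals. _HOST is replaced at runtime
     with the local node's fully-qualified hostname. -->
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>nn/_HOST@platalyticsrealm</value>
</property>
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>dn/_HOST@platalyticsrealm</value>
</property>
```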

-Vinay

*From:* Aneela Saleem [mailto:ane...@platalytics.com]
*Sent:* 30 June 2016 13:24
*To:* Vinayakumar B
*Cc:* user@hadoop.apache.org

*Subject:* Re: datanode is unable to connect to namenode

Thanks Vinayakumar

Yes, you got it right: I was using different principal names, i.e.
*nn/_HOST* for the namenode and *dn/_HOST* for the datanode. Setting the
same principal name for both datanode and namenode, i.e.
hdfs/_HOST@platalyticsrealm, solved the issue. Now the datanode
can connect to the namenode successfully.

So my question is: is it mandatory to have the same principal name on
all hosts, i.e. hdfs/_HOST@platalyticsrealm? I found in many
tutorials that the convention is to have different principals for
the services, like:

dn/_HOST for the datanode

nn/_HOST for the namenode

sn/_HOST for the secondarynamenode, etc.

Secondly, for MapReduce and YARN, would mapred-site.xml and
yarn-site.xml be the same on all cluster nodes, just like
hdfs-site.xml?

Thanks

On Thu, Jun 30, 2016 at 10:51 AM, Vinayakumar B wrote:

Hi Aneela,

1. Looks like you have attached the hdfs-site.xml from the
'hadoop-master' node. For this node the datanode connection is
successful, as seen in the logs below:

 2016-06-29 10:01:35,700 INFO
SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful
for nn/hadoop-master@platalyticsrealm (auth:KERBEROS)

2016-06-29 10:01:35,744 INFO

SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager:
Authorization successful for nn/hadoop-master@platalyticsrealm
(auth:KERBEROS) for protocol=interface
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol

 2016-06-29 10:01:36,845 INFO
org.apache.hadoop.net.NetworkTopology: Adding a new node:
/default-rack/192.168.23.206:1004 

2. For the other node, 'hadoop-slave', Kerberos authentication
is successful, but the ServiceAuthorizationManager check failed:

2016-06-29 10:01:37,474 INFO
SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful
for dn/hadoop-slave@platalyticsrealm (auth:KERBEROS)

2016-06-29 10:01:37,512 WARN

SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager:
Authorization failed for dn/hadoop-slave@platalyticsrealm
(auth:KERBEROS) for protocol=interface
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol,
expected client Kerberos principal is
nn/hadoop-slave@platalyticsrealm

2016-06-29 10:01:37,514 INFO org.apache.hadoop.ipc.Server:
Connection from 192.168.23.207:32807
for protocol
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol is
unauthorized for user dn/hadoop-slave@platalyticsrealm
(auth:KERBEROS)

The reason could mostly be that “dfs.datanode.kerberos.principal” is
configured as “nn/_HOST” on the namenode side, so it expects
nn/hadoop-slave from the datanode.

Re: Question regarding WebHDFS security

2016-07-06 Thread Larry McCay
Hi Ben -

It doesn’t really work exactly that way, but it will likely be able to handle your 
use case.
I suggest that you bring the conversation over to the dev@ list for Knox.

We can delve into the details of your use case and your options there.

thanks,

—larry

On Jul 5, 2016, at 10:58 PM, Benjamin Ross wrote:

Thanks Larry.  I'll need to look into the details quite a bit further, but I 
take it that I can define some mapping such that requests for particular file 
paths will trigger particular credentials to be used (until everything's 
upgraded)?  Currently all requests come in using permissive auth with the 
username yarn.  Once we enable Kerberos, I'd ideally like that to translate to 
using one set of Kerberos credentials if the path is /foo and another set of 
credentials if the path is /bar.  This will only be temporary until things are 
fully upgraded.

Appreciate the help.
Ben



From: Larry McCay [lmc...@hortonworks.com]
Sent: Tuesday, July 05, 2016 4:23 PM
To: Benjamin Ross
Cc: David Morel; user@hadoop.apache.org
Subject: Re: Question regarding WebHDFS security

For consuming REST APIs like WebHDFS, where Kerberos is inconvenient or 
impossible, you may want to consider using a trusted proxy like Apache Knox.
It will authenticate as knox to the backend services and act on behalf of your 
custom services.
It will also allow your services to authenticate to Knox using a number 
of different mechanisms.

http://knox.apache.org

On Jul 5, 2016, at 2:43 PM, Benjamin Ross wrote:

Hey David,
Thanks.  Yep - that's the easy part.  Let me clarify.

Consider that we have:
1. A Hadoop cluster running without Kerberos
2. A number of services contacting that Hadoop cluster and retrieving data from 
it using WebHDFS.

Clearly the services don't need to log in to WebHDFS with credentials, because 
the cluster isn't Kerberized just yet.

Now what happens when we enable Kerberos on the cluster?  We still need to 
allow those services to contact the cluster without credentials until we can 
upgrade them.  Otherwise we'll have downtime.  So what can we do?

As a possible solution, is there any way to allow unprotected access from just 
those machines until we can upgrade them?

Thanks,
Ben






From: David Morel [dmo...@amakuru.net]
Sent: Tuesday, July 05, 2016 2:33 PM
To: Benjamin Ross
Cc: user@hadoop.apache.org
Subject: Re: Question regarding WebHDFS security


On Jul 5, 2016 at 7:42 PM, "Benjamin Ross" wrote:
>
> All,
> We're planning the rollout of kerberizing our hadoop cluster.  The issue is 
> that we have several single tenant services that rely on contacting the HDFS 
> cluster over WebHDFS without credentials.  So, the concern is that once we 
> kerberize the cluster, we will no longer be able to access it without 
> credentials from these single-tenant systems, which results in a painful 
> upgrade dependency.
>
> Any suggestions for dealing with this problem in a simple way?
>
> If not, any suggestion for a better forum to ask this question?
>
> Thanks in advance,
> Ben

It's usually not super hard to wrap your HTTP calls with a module that handles 
Kerberos, depending on what language you use. For instance, 
https://metacpan.org/pod/Net::Hadoop::WebHDFS::LWP does this.
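With curl, for instance, the before/after difference looks roughly like this (host, port, and path are placeholders; 50070 is the usual Hadoop 2.x namenode HTTP port):

```shell
# Without Kerberos: identity is passed via the user.name query parameter
curl "http://namenode.example.com:50070/webhdfs/v1/tmp/file.txt?op=OPEN&user.name=yarn"

# With Kerberos enabled: obtain a ticket first, then use SPNEGO
kinit user@EXAMPLE.COM
curl --negotiate -u : "http://namenode.example.com:50070/webhdfs/v1/tmp/file.txt?op=OPEN"
```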

David


