@kevin, yes we're not using Kerberos or any AD

So you're saying that whatever user I authenticate against knox is the one that 
will be passed to webhdfs?

If I were to pass ?user.name=hdfs in the query string targeted for HDFS, or 
?doas=hdfs in that request, would Knox honor those and pass them along? Or 
will they get overwritten by Knox?
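
To make the question concrete, here is a minimal sketch of the two request shapes being compared. The host names, ports, topology name ("default") and password are placeholders I made up for illustration; only the user.name and doas parameters and the root/hdfs user names come from this thread. It only builds the URLs and the Basic auth header and sends nothing, so it does not show whether Knox forwards or overwrites the parameters.

```python
import base64
from urllib.parse import urlencode

# Hypothetical endpoints: substitute your own hosts, ports and Knox topology.
NAMENODE = "http://namenode.example.com:9870"
KNOX = "https://knox.example.com:8443/gateway/default"


def webhdfs_url(base, path, op, params=None):
    """Build a WebHDFS REST URL of the form <base>/webhdfs/v1<path>?op=...&k=v."""
    query = {"op": op}
    query.update(params or {})
    return f"{base}/webhdfs/v1{path}?{urlencode(query)}"


def basic_auth_header(user, password):
    """HTTP Basic credentials: the identity the gateway authenticates."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


# Direct to the NameNode with simple auth: the caller just names a user.
direct = webhdfs_url(NAMENODE, "/tmp", "LISTSTATUS", {"user.name": "hdfs"})

# Through Knox: authenticate as root (placeholder password) and also ask
# for doas=hdfs. What Knox does with that parameter is the open question.
via_knox = webhdfs_url(KNOX, "/tmp", "LISTSTATUS", {"doas": "hdfs"})
headers = basic_auth_header("root", "admin-password")

print(direct)
print(via_knox)
```

Comparing the URL that actually reaches WebHDFS (for example in the NameNode audit log) with via_knox would show which user Knox ends up asserting.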
________________________________
From: larry mccay <lmc...@apache.org>
Sent: Monday, November 18, 2019 1:09 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Switching user going from KNOX to WebHDFS

Hi Jeff -

Thanks for reaching out!

Rather than try to unpack all of that, I'd like to step back to a 
description of what you are trying to accomplish with your deployment and the 
addition of Knox within it.

As you have described it, it seems like a very unsecured environment.
Whether you are running your process as a root user or not, executing your 
queries and operations as the HDFS user is also very insecure.
HDFS is a superuser in a Hadoop deployment.

Authenticating to Knox as root and asserting the effective user as hdfs is 
certainly something we can do, but I don't see what the value of doing that is.

So, let's step back and get a clear picture of what you would like to 
accomplish and we can direct you to appropriate authentication/federation 
providers and possibly identity-assertion providers to meet your needs.

thanks,

--larry

On Mon, Nov 18, 2019 at 2:47 PM Kevin Risden <kris...@apache.org> wrote:
> If i am to do an hdfs query, all i need to do is to set HADOOP_USER_NAME to 
> 'hdfs' then everything works nicely.

This means that you aren't using Kerberos, just regular simple auth, for your 
cluster.

> This is true until we get to knox. We still communicate with Knox using a root 
> and an admin password. I believe by default, this user's identity is used to 
> call webhdfs?

The user identity is asserted by Knox against the backend service. So if Knox 
is configured for authentication, that username is asserted to the backend. 
However you are doing authentication in Knox, it needs to be configured. This 
is usually LDAP out of the box, but Knox can be configured with different 
authentication providers such as PAM.

Kevin Risden


On Mon, Nov 18, 2019 at 2:37 PM jeff saremi <jeffsar...@hotmail.com> wrote:
I'm not sure how to phrase this question, and I don't have any experience 
with these two technologies.

Here's the deal: We are switching from running Hadoop and related technologies 
under root to running them under a non-root user.

So far we have managed to successfully change our namenodes and datanodes such 
that the process is running under a user named 'hdfs'.

If i am to do an hdfs query, all i need to do is to set HADOOP_USER_NAME to 
'hdfs' then everything works nicely.
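
As a rough illustration of that step, here is a small sketch of running the hdfs CLI with HADOOP_USER_NAME set. It assumes simple (non-Kerberos) auth, where the client simply reports this name and nothing verifies it; the helper name, the command and the PATH value are hypothetical.

```python
import os


def hdfs_command_env(user, base_env=None):
    """Return a copy of the environment with HADOOP_USER_NAME set to `user`."""
    env = dict(os.environ if base_env is None else base_env)
    env["HADOOP_USER_NAME"] = user
    return env


# Hypothetical command; assumes the `hdfs` CLI is installed and on PATH.
cmd = ["hdfs", "dfs", "-ls", "/"]
env = hdfs_command_env("hdfs", base_env={"PATH": "/usr/bin"})

# To actually run it against a cluster:
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
print(env["HADOOP_USER_NAME"])
```

The point is that the identity here is whatever the client claims, which is why the same trick does not carry over once a gateway does its own authentication.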

This is true until we get to knox. We still communicate with Knox using a root 
and an admin password. I believe by default, this user's identity is used to 
call webhdfs?

We need to change this behavior. Looking for some pointers on what the changes 
would be.

thanks
Jeff
