Hi Jeff -

Thanks for reaching out!

Rather than try to unpack all of that, I'd like to step back to a
description of what you are trying to accomplish with your deployment and
the addition of Knox within it.

As you have described it, it seems like a very insecure environment.
Whether or not you are running your processes as root, executing your
queries and operations as the hdfs user is also very insecure.
The hdfs user is the superuser in a Hadoop deployment, so HDFS permission
checks never fail for it.

Authenticating to Knox as root and asserting the effective user as hdfs is
certainly something we can do, but I don't see the value in doing that.

So, let's step back and get a clear picture of what you would like to
accomplish and we can direct you to appropriate authentication/federation
providers and possibly identity-assertion providers to meet your needs.
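
For reference, here is a minimal sketch of what a WebHDFS call through Knox
looks like from the client side. It only builds the request and does not send
it. The gateway host, the "sandbox" topology name, the user "jeff" and the
password are placeholders, and it assumes the topology uses HTTP Basic
authentication (as with the default Shiro/LDAP setup). The point is that the
username in the credentials is the identity Knox authenticates and then
asserts to WebHDFS, unless an identity-assertion provider maps it to
something else.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_knox_webhdfs_request(gateway, topology, hdfs_path, user, password,
                               op="LISTSTATUS"):
    """Build (but do not send) a WebHDFS request routed through Knox.

    Assumes the Knox URL layout {gateway}/gateway/{topology}/webhdfs/v1{path}
    and HTTP Basic authentication on the topology.
    """
    url = "{}/gateway/{}/webhdfs/v1{}?{}".format(
        gateway.rstrip("/"), topology, quote(hdfs_path), urlencode({"op": op})
    )
    # The user named here is the one Knox authenticates; by default that same
    # name is asserted to the backend service.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})


# Placeholder values, not taken from any real deployment.
req = build_knox_webhdfs_request(
    "https://knox.example.com:8443", "sandbox", "/tmp", "jeff", "secret"
)
print(req.full_url)
```

Passing the request to urllib.request.urlopen would actually send it, and with
a self-signed gateway certificate that would also need an SSL context that
trusts it.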

thanks,

--larry

On Mon, Nov 18, 2019 at 2:47 PM Kevin Risden <[email protected]> wrote:

>> If i am to do an hdfs query, all i need to do is to set HADOOP_USER_NAME
>> to 'hdfs' then everything works nicely.
>
>
> This means that you aren't using Kerberos, just regular simple auth, for
> your cluster.
>
>> This is true until we get to knox. We still communicate with Knox using a
>> root and an admin password. I believe by default, this user's identity is
>> used to call webhdfs?
>>
>
> Knox asserts the user identity to the backend service. Whatever username
> Knox authenticates is the one that gets asserted to the backend, so you
> need to configure however you are doing authentication in Knox.
> This is usually LDAP out of the box but can be configured with different
> authentication providers like PAM.
>
> Kevin Risden
>
>
> On Mon, Nov 18, 2019 at 2:37 PM jeff saremi <[email protected]>
> wrote:
>
>> I'm not sure how to phrase this question and also I don't have any
>> experience in these two technologies
>>
>> Here's the deal: We are switching from running hadoop and related
>> technologies from under root to a non-root user
>>
>> So far we have managed to successfully change our namenodes and datanodes
>> such that the process is running under a user named 'hdfs'.
>>
>> If i am to do an hdfs query, all i need to do is to set HADOOP_USER_NAME
>> to 'hdfs' then everything works nicely.
>>
>> This is true until we get to knox. We still communicate with Knox using a
>> root and an admin password. I believe by default, this user's identity is
>> used to call webhdfs?
>>
>> We need to change this behavior. Looking for some pointers on what the
>> changes would be.
>>
>> thanks
>> Jeff
>>
>
