Knox scrubs any incoming query params that could be used to spoof other
users.
Both user.name and doas/doAs are removed from the incoming request, and Knox
itself sets the appropriate query param on the dispatch to the backend
service.
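
To make that concrete, here is a rough Python sketch of the effect on a
request URL. It only illustrates the behavior described above and is not
Knox's actual code; the function name, the example host and the choice of
user.name as the asserted param are my own assumptions.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Params a caller could use to claim another identity (matched case-insensitively).
IMPERSONATION_PARAMS = {"user.name", "doas"}

def rewrite_query(url, authenticated_user, assert_param="user.name"):
    """Drop caller-supplied identity params, then assert the authenticated user.

    assert_param is an assumption for illustration; which param the gateway
    really sets depends on how the cluster is secured.
    """
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in IMPERSONATION_PARAMS
    ]
    kept.append((assert_param, authenticated_user))
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Hypothetical request: the caller authenticated as root but asks to be hdfs.
incoming = "http://host:50070/webhdfs/v1/tmp?op=LISTSTATUS&user.name=hdfs&doAs=hdfs"
print(rewrite_query(incoming, "root"))
```

In other words, whatever you put in user.name or doas on the way in is
discarded, and the backend only sees the identity that Knox asserts.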

Take a look at the principal mapping capabilities of the Default
identity-assertion provider [1], which can map an authenticated
principal/username to another user known within the cluster. That lets you
authenticate as one user while the effective user within the cluster is the
mapped user.

[1] http://knox.apache.org/books/knox-1-3-0/user-guide.html#Default+Identity+Assertion+Provider
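
For the setup discussed in this thread, a topology fragment along these
lines might be a starting point. Treat it as a sketch only: it assumes you
authenticate to Knox as root and want hdfs to be the effective user, and the
principal.mapping param and its value format should be checked against the
user guide section linked above for your Knox version.

```xml
<provider>
    <role>identity-assertion</role>
    <name>Default</name>
    <enabled>true</enabled>
    <param>
        <!-- assumed mapping: authenticated user root is asserted as hdfs -->
        <name>principal.mapping</name>
        <value>root=hdfs;</value>
    </param>
</provider>
```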

On Mon, Nov 18, 2019 at 4:15 PM jeff saremi <[email protected]> wrote:

> @Larry, yes, your description is pretty accurate.
> We have different "levels" of security. This is the lowest and least
> secure one; however, since this is an on-prem product, some customers
> wouldn't mind accepting the risk.
>
> We also have Kerberos and AD on top of these for those who want a more
> secure environment.
>
> Currently, we need Knox to allow us to pass the superuser identity if
> possible.
> thanks
> ------------------------------
> *From:* jeff saremi <[email protected]>
> *Sent:* Monday, November 18, 2019 1:12 PM
> *To:* larry mccay <[email protected]>; [email protected] <
> [email protected]>
> *Subject:* Re: Switching user going from KNOX to WebHDFS
>
> @kevin, yes we're not using Kerberos or any AD
>
> So you're saying that whatever user I authenticate as against knox is the
> one that will be passed to webhdfs?
>
> If i were to pass ?user.name=hdfs in the query string targeted for hdfs
> or ?doas=hdfs in that request, would knox honor those and pass them
> along? Or will they get overwritten by Knox?
> ------------------------------
> *From:* larry mccay <[email protected]>
> *Sent:* Monday, November 18, 2019 1:09 PM
> *To:* [email protected] <[email protected]>
> *Subject:* Re: Switching user going from KNOX to WebHDFS
>
> Hi Jeff -
>
> Thanks for reaching out!
>
> Rather than try to unpack all of that, I'd like to step back to a
> description of what you are trying to accomplish with your deployment and
> the addition of Knox within it.
>
> As you have described it, it seems like a very unsecured environment.
> Whether you are running your process as a root user or not, executing your
> queries and operations as the HDFS user is also very insecure.
> HDFS is a superuser in a Hadoop deployment.
>
> Authenticating to Knox as root and asserting the effective user as hdfs is
> certainly something we can do, but I don't see what the value of doing
> that is.
>
> So, let's step back and get a clear picture of what you would like to
> accomplish and we can direct you to appropriate authentication/federation
> providers and possibly identity-assertion providers to meet your needs.
>
> thanks,
>
> --larry
>
> On Mon, Nov 18, 2019 at 2:47 PM Kevin Risden <[email protected]> wrote:
>
> > If i am to do an hdfs query, all i need to do is to set HADOOP_USER_NAME
> > to 'hdfs' then everything works nicely.
>
>
> This means that you aren't using Kerberos, just regular simple auth, for
> your cluster.
>
> > This is true until we get to knox. We still communicate with Knox using a
> > root and an admin password. I believe by default, this user's identity is
> > used to call webhdfs?
>
>
> The user identity is asserted by Knox against the backend service. So once
> Knox is configured for authentication, the authenticated username is
> asserted to the backend. However you are doing authentication in Knox, it
> needs to be configured.
> This is usually LDAP out of the box but can be configured with different
> authentication providers like PAM.
>
> Kevin Risden
>
>
> On Mon, Nov 18, 2019 at 2:37 PM jeff saremi <[email protected]>
> wrote:
>
> I'm not sure how to phrase this question, and I also don't have any
> experience with these two technologies.
>
> Here's the deal: We are switching from running hadoop and related
> technologies as root to running them as a non-root user.
>
> So far we have managed to successfully change our namenodes and datanodes
> such that the process is running under a user named 'hdfs'.
>
> If i am to do an hdfs query, all i need to do is to set HADOOP_USER_NAME
> to 'hdfs' then everything works nicely.
>
> This is true until we get to knox. We still communicate with Knox using a
> root and an admin password. I believe by default, this user's identity is
> used to call webhdfs?
>
> We need to change this behavior. Looking for some pointers on what the
> changes would be.
>
> thanks
> Jeff
>
>
