@larry Thanks a lot. That was exactly what we were looking for.

________________________________
From: jeff saremi <jeffsar...@hotmail.com>
Sent: Monday, November 18, 2019 2:05 PM
To: larry mccay <lmc...@apache.org>; user@knox.apache.org <user@knox.apache.org>
Subject: Re: Switching user going from KNOX to WebHDFS

@Larry thanks for the explanation. It explains the behavior I was seeing. I'll look into the link you sent.

________________________________
From: larry mccay <lmc...@apache.org>
Sent: Monday, November 18, 2019 1:52 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Switching user going from KNOX to WebHDFS

Knox will scrub any incoming query params that could be used to spoof other users. Both user.name and doas/doAs will be scrubbed from the incoming request, and the appropriate query param will be used for the dispatch to the backend service.

You can look at the principal mapping capabilities of the default identity-assertion provider [1] to map an authenticated principal/username to another user known within the cluster. This will allow you to authenticate as one user while the effective user within the cluster is the mapped user.

1. http://knox.apache.org/books/knox-1-3-0/user-guide.html#Default+Identity+Assertion+Provider

On Mon, Nov 18, 2019 at 4:15 PM jeff saremi <jeffsar...@hotmail.com> wrote:

@Larry, yes, your description is pretty accurate. We have different "levels" of security. This is the lowest and least secure one; however, since this is an on-prem product, some customers wouldn't mind accepting the risk. We also have Kerberos and AD on top of these for those who want a more secure environment.
Currently, we need Knox to allow us to pass the superuser identity if possible.
thanks

________________________________
From: jeff saremi <jeffsar...@hotmail.com>
Sent: Monday, November 18, 2019 1:12 PM
To: larry mccay <lmc...@apache.org>; user@knox.apache.org <user@knox.apache.org>
Subject: Re: Switching user going from KNOX to WebHDFS

@kevin, yes, we're not using Kerberos or any AD. So you're saying that whatever user I authenticate as against Knox is the one that will be passed to WebHDFS?
If I were to pass ?user.name=hdfs in the query string targeted for HDFS, or ?doas=hdfs in that request, would Knox honor those and pass them along, or would they get overwritten by Knox?

________________________________
From: larry mccay <lmc...@apache.org>
Sent: Monday, November 18, 2019 1:09 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Switching user going from KNOX to WebHDFS

Hi Jeff -

Thanks for reaching out!
Rather than try to unpack all of that, I'd like to step back to a description of what you are trying to accomplish with your deployment and the addition of Knox within it.

As you have described it, it seems like a very unsecured environment. Whether you are running your process as a root user or not, executing your queries and operations as the HDFS user is also very insecure. HDFS is a superuser in a Hadoop deployment. Authenticating to Knox as root and asserting the effective user as hdfs is certainly something we can do, but I don't see what the value of doing that is.

So, let's step back and get a clear picture of what you would like to accomplish, and we can direct you to appropriate authentication/federation providers and possibly identity-assertion providers to meet your needs.

thanks,

--larry

On Mon, Nov 18, 2019 at 2:47 PM Kevin Risden <kris...@apache.org> wrote:

> If i am to do an hdfs query, all i need to do is to set HADOOP_USER_NAME to 'hdfs' then everything works nicely.

This means that you aren't using Kerberos, just regular simple auth for your cluster.

> This is true until we get to knox. We still communicate with Knox using a root and an admin password. I believe by default, this user's identity is used to call webhdfs?

The user identity is asserted by Knox against the backend service. So when Knox is configured for authentication, the authenticated username is asserted to the backend.
So however you are doing authentication in Knox, that is what needs to be configured. This is usually LDAP out of the box, but it can be configured with different authentication providers such as PAM.

Kevin Risden

On Mon, Nov 18, 2019 at 2:37 PM jeff saremi <jeffsar...@hotmail.com> wrote:

I'm not sure how to phrase this question, and I don't have any experience with these two technologies.

Here's the deal: we are switching from running Hadoop and related technologies as root to running them as a non-root user. So far we have managed to change our namenodes and datanodes so that the process runs under a user named 'hdfs'.
If I am to do an HDFS query, all I need to do is set HADOOP_USER_NAME to 'hdfs' and everything works nicely.
This is true until we get to Knox. We still communicate with Knox using root and an admin password. I believe that, by default, this user's identity is used to call WebHDFS?
We need to change this behavior. Looking for some pointers on what the changes would be.

thanks
Jeff
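
For readers following the thread, the behavior Larry describes (scrubbing user.name and doas/doAs from the incoming request, then asserting the authenticated or mapped user on dispatch) can be sketched roughly as below. This is only an illustrative model in Python, not Knox code. The function names are hypothetical, the mapping string format ("user[,user]=effective;") is assumed from the user guide's principal mapping description, and re-adding the identity as user.name is an assumption that fits a simple-auth (non-Kerberos) cluster like Jeff's.

```python
from urllib.parse import parse_qsl, urlencode

# Query parameters a caller could use to claim another identity.
# Compared case-insensitively so that both "doas" and "doAs" are caught.
SPOOFABLE_PARAMS = {"user.name", "doas"}


def parse_principal_mapping(mapping):
    """Parse a string such as 'alice,bob=hdfs;carol=hive;' into a dict.

    The format is an assumption modeled on the principal mapping
    described in the Knox user guide; this helper is hypothetical.
    """
    result = {}
    for rule in mapping.split(";"):
        rule = rule.strip()
        if not rule:
            continue
        principals, _, effective = rule.partition("=")
        for principal in principals.split(","):
            result[principal.strip()] = effective.strip()
    return result


def assert_identity(query, authenticated_user, mapping):
    """Drop caller-supplied identity params, then append the asserted user."""
    kept = [
        (key, value)
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key.lower() not in SPOOFABLE_PARAMS
    ]
    # Fall back to the authenticated user when no mapping rule applies.
    effective_user = mapping.get(authenticated_user, authenticated_user)
    # Assumption: simple-auth backend, so the identity travels as user.name.
    kept.append(("user.name", effective_user))
    return urlencode(kept)


if __name__ == "__main__":
    mapping = parse_principal_mapping("root=hdfs;")
    # A spoofed user.name/doAs is discarded; the authenticated user wins.
    print(assert_identity("op=LISTSTATUS&user.name=hdfs&doAs=hdfs", "guest", mapping))
    # An authenticated 'root' is mapped to the effective user 'hdfs'.
    print(assert_identity("op=LISTSTATUS", "root", mapping))
```

In this model, the only way to reach the backend as hdfs is through a mapping rule on the gateway side, which matches Larry's point that the caller cannot pick the effective user through the query string.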