Tony,

If you are writing a server app that interacts with the cluster on behalf of different users (like Oozie, as you mentioned in your email), then you should use the proxyuser capabilities of Hadoop.

* Configure user MYSERVERUSER as a proxyuser in Hadoop's core-site.xml (this requires two property settings, HOSTS and GROUPS).
* Run your server app as MYSERVERUSER and have a Kerberos principal MYSERVERUSER/MYSERVERHOST.
* Initialize your server app by loading the MYSERVERUSER/MYSERVERHOST keytab.
* Use UGI.doAs() to create JobClient/FileSystem instances as the user on whose behalf you want to act.
* Keep in mind that all the users on whose behalf you need to act should be valid Unix users in the cluster.
* If those users need direct access to the cluster, they will also have to be defined in the KDC user database.

Hope this helps.

Thx

On Mon, Jul 2, 2012 at 6:22 AM, Tony Dean <tony.d...@sas.com> wrote:
> Yes, but this will not work in a multi-tenant environment. I need to be able
> to create a Kerberos TGT per execution thread.
>
> I was hoping through JAAS that I could inject the name of the current
> principal and authenticate against it. I'm sure there is a best practice for
> hadoop/hbase client API authentication, just not sure what it is.
>
> Thank you for your comment. The solution may well be associated with the
> UserGroupInformation class. Hopefully, other ideas will come from this
> thread.
>
> Thanks.
>
> -Tony
>
> -----Original Message-----
> From: Ivan Frain [mailto:ivan.fr...@gmail.com]
> Sent: Monday, July 02, 2012 8:14 AM
> To: common-user@hadoop.apache.org
> Subject: Re: hadoop security API (repost)
>
> Hi Tony,
>
> I am currently working on this to access HDFS securely and programmatically.
> What I have found so far may help even if I am not 100% sure this is the
> right way to proceed.
>
> If you have already obtained a TGT from the kinit command, the hadoop library
> will locate it "automatically" if the name of the ticket cache corresponds to
> the default location. On Linux it is located at /tmp/krb5cc_uid-number.
>
> For example, with my linux user hdfs, I get a TGT for hadoop user 'ivan',
> meaning you can impersonate ivan from the hdfs linux user:
> ------------------------------------------
> hdfs@mitkdc:~$ klist
> Ticket cache: FILE:/tmp/krb5cc_10003
> Default principal: i...@hadoop.lan
>
> Valid starting     Expires            Service principal
> 02/07/2012 13:59   02/07/2012 23:59   krbtgt/hadoop....@hadoop.lan
>         renew until 03/07/2012 13:59
> -------------------------------------------
>
> Then, you just have to set the right security options in your hadoop client
> in java and the identity will be i...@hadoop.lan for our example. In my
> tests, I only use HDFS, and here is a snippet of code to access a secure
> hdfs cluster assuming the previous TGT (ivan's impersonation):
>
> --------------------------------------------
> val conf: HdfsConfiguration = new HdfsConfiguration()
> conf.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION, "kerberos")
> conf.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION, "true")
> conf.set(DFSConfigKeys.DFS_NAMENODE_USER_NAME_KEY, serverPrincipal)
>
> UserGroupInformation.setConfiguration(conf)
>
> val fs = FileSystem.get(new URI(hdfsUri), conf)
> --------------------------------------------
>
> This 'fs' is a handle for accessing hdfs securely as user 'ivan', even though
> ivan does not appear in the hadoop client code.
>
> Anyway, I also see two other options:
> * Setting the KRB5CCNAME environment variable to point to the right
>   ticketCache file
> * Specifying the keytab file you want to use from the UserGroupInformation
>   singleton API:
>   UserGroupInformation.loginUserFromKeytab(user, keytabFile)
>
> If you want to understand the auth process and the different options to
> login, I guess you need to have a look at the UserGroupInformation.java
> source code (release 0.23.1 link: http://bit.ly/NVzBKL). The private class
> HadoopConfiguration (line 347) is of major interest in our case.
>
> Another point is that I did not find any easy way to prompt the user for a
> password at runtime using the current hadoop API. It appears to be somehow
> hardcoded in the UserGroupInformation singleton. I guess it would be nice to
> have a new function to give the UserGroupInformation an authenticated
> 'Subject' which could override all default configurations. If someone has
> better ideas, it would be nice to discuss them as well.
>
>
> BR,
> Ivan
>
> 2012/7/1 Tony Dean <tony.d...@sas.com>
>
>> Hi,
>>
>> The security documentation specifies how to test a secure cluster by
>> using kinit and thus adding the Kerberos principal TGT to the ticket
>> cache, which the hadoop client code uses to acquire service tickets
>> for use in the cluster.
>> What if I created an application that used the hadoop API to
>> communicate with hdfs and/or mapred protocols? Is there a programmatic
>> way to inform hadoop to use a particular Kerberos principal name with
>> a keytab that contains its password key? I didn't see a way to
>> integrate with JAAS KrbLoginModule.
>> I was thinking that if I could inject a callbackHandler, I could pass
>> the principal name and the KrbLoginModule already has options to
>> specify keytab.
>> Is this something that is possible? Or is this just not the right way
>> to do things?
>>
>> I read about impersonation where authentication is performed with a
>> system user such as "oozie" and then it just impersonates other users
>> so that permissions are based on the impersonated user instead of the
>> system user.
>>
>> Please help me understand my options for executing hadoop tasks in a
>> multi-tenant application.
>>
>> Thank you!
>>
>>
>>
>
>
> --
> Ivan Frain
> 11, route de Grenade
> 31530 Saint-Paul-sur-Save
> mobile: +33 (0)6 52 52 47 07
>

--
Alejandro
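
For reference, the two proxyuser properties (HOSTS and GROUPS) mentioned in the first step of the reply at the top of this thread might look roughly like this in core-site.xml. This is only a sketch: MYSERVERUSER and MYSERVERHOST are the placeholders used in that reply, and the group name "hadoopusers" is a made-up example that does not come from the thread.

```xml
<!-- core-site.xml fragment (sketch); these entries go inside <configuration> -->
<property>
  <!-- Hosts from which MYSERVERUSER is allowed to impersonate other users -->
  <name>hadoop.proxyuser.MYSERVERUSER.hosts</name>
  <value>MYSERVERHOST</value>
</property>
<property>
  <!-- Groups whose members MYSERVERUSER may impersonate;
       "hadoopusers" is a hypothetical group name -->
  <name>hadoop.proxyuser.MYSERVERUSER.groups</name>
  <value>hadoopusers</value>
</property>
```

Both values are comma-separated lists, so more hosts or groups can be added as the set of machines running the server app or the set of tenants grows.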