Tony,

If you are writing a server app that interacts with the cluster on
behalf of different users (like Oozie, as you mentioned in your
email), then you should use the proxyuser capabilities of Hadoop.

* Configure user MYSERVERUSER as a proxyuser in Hadoop core-site.xml
(this requires two property settings, hadoop.proxyuser.MYSERVERUSER.hosts
and hadoop.proxyuser.MYSERVERUSER.groups).
* Run your server app as MYSERVERUSER and give it a Kerberos principal
MYSERVERUSER/MYSERVERHOST
* Initialize your server app by logging in from the MYSERVERUSER/MYSERVERHOST keytab
* Use UGI.doAs() to create JobClient/FileSystem instances as
the user you are acting on behalf of
* Keep in mind that all the users you act on behalf of
should be valid Unix users in the cluster
* If those users need direct access to the cluster, they will also have to be
defined in the KDC user database.
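
As a rough sketch of the login and doAs steps above (not a definitive
implementation), something like the following Scala could be a starting
point. The realm, the keytab path, and the end-user name "tony" are made-up
placeholders. It also assumes that the hadoop.proxyuser.MYSERVERUSER.*
properties are already set in core-site.xml on the cluster, and that the
Hadoop client jars and the cluster configuration are on the classpath.

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

object ProxyUserSketch {
  // Hypothetical values: substitute your own principal, realm and keytab path.
  val serverPrincipal = "MYSERVERUSER/MYSERVERHOST@EXAMPLE.COM"
  val keytabPath = "/etc/security/keytabs/myserveruser.keytab"

  def main(args: Array[String]): Unit = {
    // Picks up core-site.xml / hdfs-site.xml from the classpath.
    val conf = new Configuration()
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)

    // Log the server in once, from its keytab (no kinit or ticket cache needed).
    UserGroupInformation.loginUserFromKeytab(serverPrincipal, keytabPath)
    val serverUgi = UserGroupInformation.getLoginUser()

    // Build a proxy identity for the end user. No Kerberos credentials are
    // needed for that user; the cluster checks the proxyuser settings instead.
    val endUser = "tony" // hypothetical end-user name
    val proxyUgi = UserGroupInformation.createProxyUser(endUser, serverUgi)

    // Everything created inside doAs() runs with the end user's identity.
    val homeExists = proxyUgi.doAs(new PrivilegedExceptionAction[Boolean] {
      override def run(): Boolean = {
        val fs = FileSystem.get(conf)
        fs.exists(new Path(s"/user/$endUser"))
      }
    })
    println(s"/user/$endUser exists: $homeExists")
  }
}
```

Each request thread can build its own proxy UGI this way, so the server
holds a single Kerberos login while HDFS permissions are still checked
against the end user.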

Hope this helps.

Thx

On Mon, Jul 2, 2012 at 6:22 AM, Tony Dean <tony.d...@sas.com> wrote:
> Yes, but this will not work in a multi-tenant environment.  I need to be able 
> to create a Kerberos TGT per execution thread.
>
> I was hoping through JAAS that I could inject the name of the current 
> principal and authenticate against it.  I'm sure there is a best practice for 
> hadoop/hbase client API authentication, just not sure what it is.
>
> Thank you for your comment.  The solution may well be associated with the 
> UserGroupInformation class.  Hopefully, other ideas will come from this 
> thread.
>
> Thanks.
>
> -Tony
>
> -----Original Message-----
> From: Ivan Frain [mailto:ivan.fr...@gmail.com]
> Sent: Monday, July 02, 2012 8:14 AM
> To: common-user@hadoop.apache.org
> Subject: Re: hadoop security API (repost)
>
> Hi Tony,
>
> I am currently working on this to access HDFS securely and programmatically.
> What I have found so far may help, even if I am not 100% sure this is the 
> right way to proceed.
>
> If you have already obtained a TGT with the kinit command, the hadoop library 
> will locate it "automatically" if the ticket cache is in the default 
> location. On Linux that is /tmp/krb5cc_uid-number.
>
> For example, as my linux user hdfs, I get a TGT for hadoop user 'ivan',
> meaning you can act as ivan from the hdfs linux user:
> ------------------------------------------
> hdfs@mitkdc:~$ klist
> Ticket cache: FILE:/tmp/krb5cc_10003
> Default principal: i...@hadoop.lan
>
> Valid starting    Expires           Service principal
> 02/07/2012 13:59  02/07/2012 23:59  krbtgt/hadoop....@hadoop.lan
>         renew until 03/07/2012 13:59
> -------------------------------------------
>
> Then, you just have to set the right security options in your hadoop client 
> in java and the identity will be i...@hadoop.lan for our example. In my 
> tests, I only use HDFS, and here is a snippet of code that accesses a secure 
> hdfs cluster, assuming the previous TGT (ivan's impersonation):
>
> --------------------------------------------
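>      import java.net.URI
>      import org.apache.hadoop.fs.{CommonConfigurationKeysPublic, FileSystem}
>      import org.apache.hadoop.hdfs.{DFSConfigKeys, HdfsConfiguration}
>      import org.apache.hadoop.security.UserGroupInformation
>
>      // Enable Kerberos authentication and service-level authorization.
>      val conf: HdfsConfiguration = new HdfsConfiguration()
>      conf.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION,
>        "kerberos")
>      conf.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION,
>        "true")
>      // serverPrincipal (defined elsewhere): the NameNode's Kerberos principal.
>      conf.set(DFSConfigKeys.DFS_NAMENODE_USER_NAME_KEY, serverPrincipal)
>
>      // Hand the security settings to UGI before the first login happens.
>      UserGroupInformation.setConfiguration(conf)
>
>      // hdfsUri (defined elsewhere): the URI of the secure cluster.
>      val fs = FileSystem.get(new URI(hdfsUri), conf)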
> --------------------------------------------
>
> This 'fs' is a handle for accessing hdfs securely as user 'ivan', even though 
> ivan does not appear anywhere in the hadoop client code.
>
> Anyway, I also see two other options:
>   * Setting the KRB5CCNAME environment variable to point to the right 
> ticketCache file
>   * Specifying the keytab file you want to use from the UserGroupInformation 
> singleton API:
> UserGroupInformation.loginUserFromKeytab(user, keytabFile)
>
> If you want to understand the auth process and the different login 
> options, I guess you need to have a look at the UserGroupInformation.java 
> source code (release 0.23.1 link: http://bit.ly/NVzBKL). The private class 
> HadoopConfiguration at line 347 is of major interest in our case.
>
> Another point is that I did not find any easy way to prompt the user for a 
> password at runtime using the current hadoop API. The login appears to be 
> hardcoded in the UserGroupInformation singleton. I guess it would be nice to 
> have a new function that gives UserGroupInformation an authenticated 
> 'Subject' which could override all default configurations. If someone has 
> better ideas, it would be nice to discuss them as well.
>
>
> BR,
> Ivan
>
> 2012/7/1 Tony Dean <tony.d...@sas.com>
>
>> Hi,
>>
>> The security documentation specifies how to test a secure cluster by
>> using kinit, thus adding the Kerberos principal's TGT to the ticket
>> cache, which the hadoop client code uses to acquire service tickets
>> for use in the cluster.
>> What if I created an application that used the hadoop API to
>> communicate with hdfs and/or mapred protocols, is there a programmatic
>> way to inform hadoop to use a particular Kerberos principal name with
>> a keytab that contains its password key?  I didn't see a way to
>> integrate with JAAS KrbLoginModule.
>> I was thinking that if I could inject a callbackHandler, I could pass
>> the principal name and the KrbLoginModule already has options to
>> specify keytab.
>> Is this something that is possible?  Or is this just not the right way
>> to do things?
>>
>> I read about impersonation where authentication is performed with a
>> system user such as "oozie" and then it just impersonates other users
>> so that permissions are based on the impersonated user instead of the
>> system user.
>>
>> Please help me understand my options for executing hadoop tasks in a
>> multi-tenant application.
>>
>> Thank you!
>>
>>
>>
>
>
> --
> Ivan Frain
> 11, route de Grenade
> 31530 Saint-Paul-sur-Save
> mobile: +33 (0)6 52 52 47 07
>



-- 
Alejandro
