Sorry for the many typos; I was typing from my cell phone. I hope you can still get the idea.
On Sat, Feb 7, 2015 at 1:55 PM, Chester @work <ches...@alpinenow.com> wrote:
>
> I just implemented this in our application. The impersonation is done
> before the job is submitted. Spark on YARN (we are using yarn-cluster
> mode) just takes the current user from UserGroupInformation and submits
> the job to the YARN ResourceManager as that user.
>
> If one uses kinit from the command line, the whole JVM needs to have the
> same principal, and you have to handle ticket expiration with a cron job.
>
> If this is an individual ad hoc CLI job, this might be OK. But if you
> intend to use an application to run Spark jobs, with end users
> interacting with Spark through it, then you need to set up a service
> superuser, use that user to log in to the Kerberos KDC programmatically
> (the kinit equivalent), and then create a proxy user to impersonate the
> end user. You can handle ticket expiration in code as well, so there is
> no need for a cron job.
>
> Certainly one can move all this logic into Spark; one needs to create a
> Spark service user principal and keytab. As part of the Spark job
> submission, one can pass the principal and keytab location to Spark, and
> Spark can create a proxy user if the authentication is Kerberos, as well
> as add the job delegation tokens.
>
> I would love to contribute this if we need it in Spark, as I just
> completed the Hadoop Kerberos authentication feature. It covers Pig,
> MapReduce, Spark, and Sqoop, as well as standard HDFS access.
>
> I will take a look at Sandy's JIRA.
>
> Chester
>
> On Feb 2, 2015, at 2:37 PM, Jim Green <openkbi...@gmail.com> wrote:
>
> > Hi Team,
> >
> > Does Spark support impersonation?
> > For example, when running Spark on YARN/Hive/HBase/etc., which user is
> > used by default?
> > The user who starts the Spark job?
> > Any suggestions related to impersonation?
> >
> > --
> > Thanks,
> > www.openkb.info
> > (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)
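
To make the service-user approach above a bit more concrete, here is a rough sketch of the three steps (keytab login, ticket renewal, proxy user) using Hadoop's UserGroupInformation API. It is not the code from our application. The principal, keytab path, and end user name are hypothetical placeholders, and the HDFS check is only a stand-in for whatever you actually want to run as the end user, such as the job submission.

```java
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class ImpersonationSketch {

    public static void main(String[] args) throws Exception {
        // Hypothetical values: substitute your own service principal,
        // keytab location, and end user.
        String servicePrincipal = "svc-app@EXAMPLE.COM";
        String keytabPath = "/etc/security/keytabs/svc-app.keytab";
        String endUser = "alice";

        // Tell the Hadoop security layer that the cluster uses Kerberos.
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Programmatic equivalent of kinit: log the service superuser in
        // from its keytab.
        UserGroupInformation.loginUserFromKeytab(servicePrincipal, keytabPath);
        UserGroupInformation serviceUser = UserGroupInformation.getLoginUser();

        // Call this before each unit of work in a long-running service.
        // It re-logs in from the keytab when the TGT is close to expiring,
        // which replaces the cron job.
        serviceUser.checkTGTAndReloginFromKeytab();

        // Create a proxy user that impersonates the end user, backed by
        // the service user's Kerberos credentials.
        UserGroupInformation proxyUser =
            UserGroupInformation.createProxyUser(endUser, serviceUser);

        // Everything inside doAs runs as the end user. The cast is needed
        // because doAs is overloaded for PrivilegedAction as well.
        boolean homeExists = proxyUser.doAs(
            (PrivilegedExceptionAction<Boolean>) () -> {
                FileSystem fs = FileSystem.get(conf);
                return fs.exists(new Path("/user/" + endUser));
            });

        System.out.println("Home directory exists for " + endUser + ": " + homeExists);
    }
}
```

For the proxy user to be accepted, the cluster's core-site.xml has to allow the service user to impersonate others through the hadoop.proxyuser.<service user>.hosts and hadoop.proxyuser.<service user>.groups properties. Because Spark on YARN takes the current user from UserGroupInformation, a job submission placed inside the doAs block would go out as the end user.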