Sorry for the many typos; I was typing from my cell phone. I hope you can
still get the idea.
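
To make the idea in the quoted message below more concrete, here is a minimal sketch of the service-user approach: log in from a keytab programmatically (the kinit equivalent), then create a proxy user to impersonate the end user. The `UserGroupInformation` calls are Hadoop's own API; the class name `ImpersonationSketch` and the methods `loginServiceUser` and `runAs` are hypothetical names chosen for illustration, and this is not necessarily how our application or Spark itself does it.

```java
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

/** Hypothetical helper; class and method names are illustrative only. */
public class ImpersonationSketch {

    /** Log the service super user in from its keytab (the programmatic kinit). */
    public static UserGroupInformation loginServiceUser(String principal, String keytabPath)
            throws IOException {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab(principal, keytabPath);
        return UserGroupInformation.getLoginUser();
    }

    /** Run an action as endUser, with the service user as the real (authenticating) user. */
    public static <T> T runAs(UserGroupInformation serviceUgi, String endUser,
                              PrivilegedExceptionAction<T> action)
            throws IOException, InterruptedException {
        // Re-login from the keytab when the ticket is close to expiring.
        // This takes the place of the cron job; it does nothing for non-keytab logins.
        serviceUgi.checkTGTAndReloginFromKeytab();
        // The proxy UGI carries the end user's name but authenticates as the service user.
        UserGroupInformation proxy = UserGroupInformation.createProxyUser(endUser, serviceUgi);
        return proxy.doAs(action);
    }
}
```

A caller would log in once with `loginServiceUser(principal, keytabPath)` (both values are placeholders for your own service principal and keytab) and then wrap each job submission or HDFS call in `runAs(serviceUgi, endUser, action)`. For the cluster to accept the impersonation, the service user also has to be allowed as a proxy user through the `hadoop.proxyuser.<service user>.hosts` and `hadoop.proxyuser.<service user>.groups` settings in core-site.xml.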

On Sat, Feb 7, 2015 at 1:55 PM, Chester @work <ches...@alpinenow.com> wrote:

>
> I just implemented this in our application. The impersonation is done
> before the job is submitted. Spark on YARN (we are using yarn-cluster mode)
> just takes the current user from UserGroupInformation and submits the job
> to the YARN ResourceManager as that user.
>
> If you use kinit from the command line, the whole JVM has to run under the
> same principal, and you have to handle ticket expiration with a cron job.
>
> If this is an individual ad hoc CLI job, this might be OK. But if you
> intend to use an application to run Spark jobs, with end users interacting
> with Spark through it, then you need to set up a service super user, use
> that user to log in to the Kerberos KDC programmatically (the kinit
> equivalent), and then create a proxy user to impersonate the end user. You
> can handle ticket expiration in code as well, so there is no need for a
> cron job.
>
> Certainly one could move all of this logic into Spark. One would need to
> create a Spark service user principal and keytab. As part of the Spark job
> submission, one can pass the principal and keytab location to Spark, and
> Spark can create a proxy user if the authentication is Kerberos, as well as
> add the job delegation tokens.
>
> I would love to contribute this if we need it in Spark, as I just
> completed the Hadoop Kerberos authentication feature. It covers Pig,
> MapReduce, Spark, and Sqoop, as well as standard HDFS access.
>
> I will take a look at Sandy's JIRA.
>
> Chester
>
> On Feb 2, 2015, at 2:37 PM, Jim Green <openkbi...@gmail.com> wrote:
>
> Hi Team,
>
> Does Spark support impersonation?
> For example, when running Spark on YARN/Hive/HBase/etc., which user is used
> by default?
> The user who starts the Spark job?
> Any suggestions related to impersonation?
>
> --
> Thanks,
> www.openkb.info
> (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)
>
>
