Re: Spark impersonation

2015-02-07 Thread Sandy Ryza
https://issues.apache.org/jira/browse/SPARK-5493 currently tracks this.

-Sandy

On Mon, Feb 2, 2015 at 9:37 PM, Zhan Zhang zzh...@hortonworks.com wrote:

  I think you can configure Hadoop/Hive to do impersonation. With kinit, there is no
 difference between a secure and an insecure Hadoop cluster.

  Thanks.

  Zhan Zhang

  On Feb 2, 2015, at 9:32 PM, Koert Kuipers ko...@tresata.com wrote:

  Yes, jobs run as the user that launched them.
 If you want to run jobs on a secure cluster, use YARN. Spark
 standalone does not support secure Hadoop.

 On Mon, Feb 2, 2015 at 5:37 PM, Jim Green openkbi...@gmail.com wrote:

 Hi Team,

  Does Spark support impersonation?
 For example, when running Spark on YARN/Hive/HBase/etc., which user is used by
 default?
 The user who starts the Spark job?
 Any suggestions related to impersonation?

  --
  Thanks,
 www.openkb.info
 (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)






Re: Spark impersonation

2015-02-07 Thread Chester Chen
Sorry for the many typos; I was typing from my cell phone. I hope you can
still get the idea.

On Sat, Feb 7, 2015 at 1:55 PM, Chester @work ches...@alpinenow.com wrote:


  I just implemented this in our application. The impersonation is done
 before the job is submitted. Spark on YARN (we are using yarn-cluster mode)
 just takes the current user from UserGroupInformation and submits the job
 to the YARN ResourceManager as that user.

 If one uses kinit from the command line, the whole JVM has to run as the same
 principal, and you have to handle ticket expiration with a cron job.

 For an individual ad hoc CLI job, this might be OK. But if you intend to
 use an application to run Spark jobs, with end users interacting with Spark
 through it, then you need to set up a service superuser, use that user to
 log in to the Kerberos KDC programmatically (the kinit equivalent), and then
 create a proxy user to impersonate the end user. You can handle ticket
 expiration in code as well, so there is no need for a cron job.
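
Handling ticket expiration in code rather than with cron can be sketched as a
background task that re-runs a relogin callback on a fixed schedule. This is a
minimal sketch, not the implementation described above: TicketRenewer and the
relogin callback are hypothetical names, and the block uses only the JDK. In a
Hadoop application the callback would typically call
UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab(), wrapping
its checked IOException.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: periodically re-runs a Kerberos relogin callback inside the JVM. */
final class TicketRenewer implements AutoCloseable {
    private final ScheduledExecutorService scheduler;

    TicketRenewer(Runnable relogin, long period, TimeUnit unit) {
        // Daemon thread, so the renewer never keeps the application alive on its own.
        ThreadFactory daemon = r -> {
            Thread t = new Thread(r, "kerberos-relogin");
            t.setDaemon(true);
            return t;
        };
        this.scheduler = Executors.newSingleThreadScheduledExecutor(daemon);

        // Swallow and log failures: an exception escaping a scheduled task
        // would silently cancel every later run.
        Runnable safe = () -> {
            try {
                relogin.run();
            } catch (RuntimeException e) {
                System.err.println("relogin failed, will retry: " + e);
            }
        };
        scheduler.scheduleAtFixedRate(safe, period, period, unit);
    }

    @Override
    public void close() {
        // Stop scheduling further relogin attempts.
        scheduler.shutdownNow();
    }
}
```

The period should be comfortably shorter than the ticket lifetime configured
on the KDC, which is site-specific and not assumed here.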

 Certainly one could move all this logic into Spark. One would need to create
 a Spark service user principal and keytab. As part of the Spark job
 submission, one could pass the principal and keytab location to Spark, and
 Spark could create a proxy user if the authentication is Kerberos, as well
 as add the job delegation tokens.

 I would love to contribute this if we need it in Spark, as I just
 completed the Hadoop Kerberos authentication feature. It covers Pig,
 MapReduce, Spark, and Sqoop, as well as standard HDFS access.

 I will take a look at Sandy's JIRA.

 Chester

 On Feb 2, 2015, at 2:37 PM, Jim Green openkbi...@gmail.com wrote:

 Hi Team,

 Does Spark support impersonation?
 For example, when running Spark on YARN/Hive/HBase/etc., which user is used by
 default?
 The user who starts the Spark job?
 Any suggestions related to impersonation?

 --
 Thanks,
 www.openkb.info
 (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)




Spark impersonation

2015-02-02 Thread Jim Green
Hi Team,

Does Spark support impersonation?
For example, when running Spark on YARN/Hive/HBase/etc., which user is used by
default?
The user who starts the Spark job?
Any suggestions related to impersonation?

-- 
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)


Re: Spark impersonation

2015-02-02 Thread Koert Kuipers
Yes, jobs run as the user that launched them.
If you want to run jobs on a secure cluster, use YARN. Spark
standalone does not support secure Hadoop.

On Mon, Feb 2, 2015 at 5:37 PM, Jim Green openkbi...@gmail.com wrote:

 Hi Team,

 Does Spark support impersonation?
 For example, when running Spark on YARN/Hive/HBase/etc., which user is used by
 default?
 The user who starts the Spark job?
 Any suggestions related to impersonation?

 --
 Thanks,
 www.openkb.info
 (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)



Re: Spark impersonation

2015-02-02 Thread Zhan Zhang
I think you can configure Hadoop/Hive to do impersonation. With kinit, there is
no difference between a secure and an insecure Hadoop cluster.
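
On the Hadoop side, impersonation is usually enabled through the proxy-user
properties in core-site.xml, and HiveServer2 has a doAs switch in
hive-site.xml. A minimal sketch follows; the service account name svc_spark,
the host gateway1.example.com, and the group analysts are hypothetical
placeholders, not values from this thread.

```xml
<!-- core-site.xml: allow the (hypothetical) service account svc_spark
     to impersonate members of the listed group from the listed host -->
<property>
  <name>hadoop.proxyuser.svc_spark.hosts</name>
  <value>gateway1.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.svc_spark.groups</name>
  <value>analysts</value>
</property>

<!-- hive-site.xml: have HiveServer2 run queries as the connecting user -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>
```

Keeping the hosts and groups lists narrow limits which users the service
account can act for.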

Thanks.

Zhan Zhang

On Feb 2, 2015, at 9:32 PM, Koert Kuipers ko...@tresata.com wrote:

Yes, jobs run as the user that launched them.
If you want to run jobs on a secure cluster, use YARN. Spark standalone
does not support secure Hadoop.

On Mon, Feb 2, 2015 at 5:37 PM, Jim Green openkbi...@gmail.com wrote:
Hi Team,

Does Spark support impersonation?
For example, when running Spark on YARN/Hive/HBase/etc., which user is used by
default?
The user who starts the Spark job?
Any suggestions related to impersonation?

--
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)