This would be fantastic to take advantage of once it's available, and I agree 
that YARN's implementation would be ideal to base it on. I'm wondering whether 
anyone can think of an interim workaround in the meantime, though. Would there 
be any way to have the task instances on the slaves call the UGI login with a 
principal/keytab provided to the driver?
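
For readers following the thread, the workaround asked about above might look 
roughly like the sketch below. It is only an illustration under stated 
assumptions: it uses Hadoop's UserGroupInformation keytab login, it assumes the 
keytab file has already been copied to a local path on every slave (the thread 
does not say how), and the names KeytabLoginSketch and runAs are hypothetical. 
As noted later in the thread, shipping a keytab to nodes where tasks of 
different users are not isolated lets those users read each other's 
credentials.

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

// Hypothetical helper, not part of Spark or Hadoop.
object KeytabLoginSketch {

  // Logs in from a local keytab and evaluates `body` as that Kerberos identity.
  def runAs[T](principal: String, keytabPath: String)(body: => T): T = {
    // Assumption: the Hadoop config on the classpath enables Kerberos
    // (hadoop.security.authentication=kerberos).
    UserGroupInformation.setConfiguration(new Configuration())
    val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytabPath)
    ugi.doAs(new PrivilegedExceptionAction[T] {
      override def run(): T = body
    })
  }

  def main(args: Array[String]): Unit = {
    // Expects: <principal> <local keytab path> <HDFS path to check>
    val Array(principal, keytab, path) = args
    val exists = runAs(principal, keytab) {
      FileSystem.get(new Configuration()).exists(new Path(path))
    }
    println(s"$path exists: $exists")
  }
}
```

Inside a Spark job, the same runAs call would have to wrap the HDFS access in 
each task, since every executor JVM needs its own login.
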
From: Marcelo Vanzin
Sent: Friday, June 26, 2015 5:28 PM
To: Tim Chen
Cc: Olivier Girardot; Dave Ariens; user@spark.apache.org
Subject: Re: Accessing Kerberos Secured HDFS Resources from Spark on Mesos


On Fri, Jun 26, 2015 at 2:08 PM, Tim Chen <t...@mesosphere.io> wrote:
Mesos does support running containers as specific users passed to it.
Thanks for chiming in. What else does YARN do with Kerberos besides the keytab 
file and user?

The basic things I'd expect from a system to properly support Kerberos would be:

- The cluster manager should authenticate users (like the YARN RM does) before 
users can start applications.
- The cluster manager should use Kerberos to authenticate within itself (e.g. a 
YARN NM connecting to the RM).
- Started applications are properly isolated (e.g. application runs as 
requesting user, or in a separate container that cannot be accessed by other 
applications in any way).

On top of that, for HDFS and other Hadoop services, the applications themselves 
need to be aware that Kerberos is enabled and that they need to do certain 
things. For example, they need to get delegation tokens for each service they 
need (Spark on YARN supports this for HDFS and Hive); you can look for uses of 
"obtainTokensForNamenodes" as an example. And those tokens need to be 
distributed to all executors securely (which you get when you enable encrypted 
RPCs on YARN).
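
As a rough illustration of the token handling described above, the sketch below 
fetches HDFS delegation tokens on the driver, serializes them, and installs 
them on an executor. It is not the code Spark on YARN uses. The object and 
method names are hypothetical, the renewer principal is an assumption supplied 
by the caller, and how the bytes travel securely from driver to executor is 
left out, which is exactly the part the cluster manager would need to provide.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Hypothetical helper, not part of Spark or Hadoop.
object TokenShippingSketch {

  // Driver side: the caller must already be logged in via Kerberos.
  // Fetches delegation tokens for the default filesystem and serializes them.
  def fetchTokens(conf: Configuration, renewer: String): Array[Byte] = {
    val creds = new Credentials()
    FileSystem.get(conf).addDelegationTokens(renewer, creds)
    val bytes = new ByteArrayOutputStream()
    creds.writeTokenStorageToStream(new DataOutputStream(bytes))
    bytes.toByteArray
  }

  // Executor side: adds the received tokens to the current user's credentials
  // so that later HDFS calls can authenticate without a keytab or ticket.
  def installTokens(serialized: Array[Byte]): Unit = {
    val creds = new Credentials()
    creds.readTokenStorageStream(new DataInputStream(new ByteArrayInputStream(serialized)))
    UserGroupInformation.getCurrentUser.addCredentials(creds)
  }
}
```

Delegation tokens expire, so a long-running application would also need a way 
to renew or re-fetch them; that is not shown here.
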

So if Mesos handles the above cases, you could probably adapt the code in the 
YARN integration to work with Mesos too. The YARN code uses Hadoop library 
features like UserGroupInformation to propagate tokens, and that propagation is 
integrated into the YARN API itself, so there might be some extra work to make 
it all work with Mesos.

On Fri, Jun 26, 2015 at 1:20 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
On Fri, Jun 26, 2015 at 1:13 PM, Tim Chen <t...@mesosphere.io> wrote:
So correct me if I'm wrong, but it sounds like all you need is a principal user 
name and a downloaded keytab file, right?

I'm not familiar with Mesos, so I don't know what kinds of features it has, but 
at the very least it would need to start containers as the requesting users 
(like YARN does when running with Kerberos enabled), to avoid users being able 
to read each other's credentials.

--
Marcelo




