There are a few security-related issues that I am postponing for now.   Once 
I get this working I'll look at the security side.   Likely I'll encourage 
users to submit their jobs via Docker containers.   Regardless, getting the 
user's keytab and principal name into the working environment of the executor 
isn't hard; the hard part is being able to call the login method before the 
HDFS resources are accessed.
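
For reference, the driver-side login being described looks roughly like the 
sketch below. The principal and keytab path are hypothetical placeholders, 
and this assumes the Hadoop client jars are on the classpath; it is not a 
tested implementation:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

// Sketch: log in from a keytab before touching HDFS.
// Principal and keytab path below are illustrative only.
val conf = new Configuration()
conf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(conf)
UserGroupInformation.loginUserFromKeytab(
  "user@EXAMPLE.COM",        // Kerberos principal (hypothetical)
  "/path/to/user.keytab")    // keytab shipped into the working dir
```

The problem is that this only authenticates the JVM it runs in, which is the 
crux of the issue described below.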

See the gist below.   That login completes successfully, but only on the 
driver.   Once that HDFS resource is read with the Avro input format and key, 
and the tasks are created on the slaves, they read from that HDFS resource 
within their own running environment (JVM?), and any file system 
instantiations performed by Spark aren't made by a UserGroupInformation 
instance associated with the principal.
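
One way to tie a FileSystem instantiation to a specific principal, if code on 
the executor side could be made to run it, is the UGI doAs pattern. This is a 
hedged sketch with illustrative names, assuming the keytab is present locally; 
it does not address how Spark's own internal FileSystem calls get wrapped:

```scala
import java.security.PrivilegedExceptionAction
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

// Sketch: obtain a UGI for the principal, then perform HDFS access
// inside doAs so the FileSystem is created under that identity.
val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
  "user@EXAMPLE.COM", "/path/to/user.keytab")  // hypothetical values

val listing: Array[FileStatus] =
  ugi.doAs(new PrivilegedExceptionAction[Array[FileStatus]] {
    override def run(): Array[FileStatus] = {
      val fs = FileSystem.get(new Configuration())
      fs.listStatus(new Path("/user/example"))  // illustrative path
    }
  })
```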

From: Marcelo Vanzin
Sent: Friday, June 26, 2015 4:20 PM
To: Tim Chen
Cc: Olivier Girardot; Dave Ariens; user@spark.apache.org
Subject: Re: Accessing Kerberos Secured HDFS Resources from Spark on Mesos


On Fri, Jun 26, 2015 at 1:13 PM, Tim Chen 
<t...@mesosphere.io<mailto:t...@mesosphere.io>> wrote:
So correct me if I'm wrong: it sounds like all you need is a principal user 
name and a keytab file downloaded, right?

I'm not familiar with Mesos so don't know what kinds of features it has, but at 
the very least it would need to start containers as the requesting users (like 
YARN does when running with Kerberos enabled), to avoid users being able to 
read each other's credentials.

--
Marcelo
