Correct me if I'm wrong, but it sounds like all you need is a principal user
name and a keytab file downloaded, right?

I'm adding support to the Spark framework for downloading additional files
alongside your executor and driver, so one workaround is to specify a user
principal and keytab file that can be downloaded and then used in your
driver, since you can expect them to be in the current working directory.
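To make that concrete, here is a rough driver-side sketch under that workaround. It only verifies that the keytab landed in the working directory before the login; the file name `user.keytab` and the `KeytabLocator` class are illustrative, and the actual `UserGroupInformation` call is commented out since it needs Hadoop on the classpath and a live KDC:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sanity-check that the downloaded keytab is present in the driver's
// current working directory before attempting the Kerberos login.
public class KeytabLocator {
    public static String keytabInCwd(String fileName) {
        // Relative names resolve against the current working directory,
        // which is where the downloaded files should land.
        Path path = Paths.get(fileName).toAbsolutePath();
        if (!Files.exists(path)) {
            throw new IllegalStateException("keytab not found: " + path);
        }
        return path.toString();
    }

    public static void main(String[] args) {
        // Hypothetical usage in the driver (needs Hadoop on the classpath):
        // String keytab = keytabInCwd("user.keytab");
        // UserGroupInformation.loginUserFromKeytab(principal, keytab);
    }
}
```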

I suspect there might be other setup needed, but if you guys are available
we can work together to get something working.
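For the executor side (the part asked about below), one possible sketch: perform the login lazily, at most once per executor JVM, from inside the task code before any HDFS access. `KerberosLogin` is a hypothetical helper, and the real `UserGroupInformation.loginUserFromKeytab` call is commented out so this compiles without Hadoop:

```java
// Sketch: ensure the Kerberos login happens at most once per executor JVM.
// Each executor is a separate JVM, so this static flag is re-initialized
// on every executor, which is exactly the behavior we want.
public class KerberosLogin {
    private static volatile boolean loggedIn = false;

    // Returns true only for the call that actually performed the login.
    public static synchronized boolean ensureLoggedIn(String principal,
                                                      String keytabPath) {
        if (!loggedIn) {
            // UserGroupInformation.loginUserFromKeytab(principal, keytabPath);
            loggedIn = true;
            return true;
        }
        return false;
    }
}
```

In the job itself this would be invoked at the top of each partition function, e.g. from `mapPartitions`, before the first HDFS read, so every executor logs in exactly once regardless of how many tasks it runs.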


Tim

On Fri, Jun 26, 2015 at 12:23 PM, Olivier Girardot <ssab...@gmail.com>
wrote:

> I would pretty much need exactly this kind of feature too
>
> On Fri., June 26, 2015 at 21:17, Dave Ariens <dari...@blackberry.com>
> wrote:
>
>>  Hi Timothy,
>>
>>
>>
>> Because I'm running Spark on Mesos alongside a secured Hadoop cluster, I
>> need to ensure that my tasks running on the slaves perform a Kerberos login
>> before accessing any HDFS resources.  To log in, they just need the name of
>> the principal (username) and a keytab file.  Then they just need to invoke
>> the following Java:
>>
>>
>>
>> import org.apache.hadoop.security.UserGroupInformation
>>
>> UserGroupInformation.loginUserFromKeytab(adminPrincipal, adminKeytab)
>>
>>
>>
>> This is done in the driver in my Gist below, but I don't know how to run
>> it within each executor on the slaves as tasks are run.
>>
>>
>>
>> Any help would be appreciated!
>>
>>
>>
>>
>>
>> *From:* Timothy Chen [mailto:t...@mesosphere.io]
>> *Sent:* Friday, June 26, 2015 12:50 PM
>> *To:* Dave Ariens
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: Accessing Kerberos Secured HDFS Resources from Spark on
>> Mesos
>>
>>
>>
>> Hi Dave,
>>
>>
>>
>> I don't understand Kerberos much, but if you know the exact steps that
>> need to happen, I can see how we can make that happen with the Spark
>> framework.
>>
>>
>>
>> Tim
>>
>>
>> On Jun 26, 2015, at 8:49 AM, Dave Ariens <dari...@blackberry.com> wrote:
>>
>>  I understand that Kerberos support for accessing Hadoop resources in Spark
>> only works when running Spark on YARN.  However, I'd really like to hack
>> something together for Spark on Mesos running alongside a secured Hadoop
>> cluster.  My simplified application (gist:
>> https://gist.github.com/ariens/2c44c30e064b1790146a) receives a Kerberos
>> principal and keytab when submitted.  The static main method called
>> currently then performs a
>> UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab) and
>> authenticates to Hadoop.  This works on YARN (curiously, without even
>> having to kinit first), but not on Mesos.  Is there a way to have the
>> slaves running the tasks perform the same Kerberos login before they
>> attempt to access HDFS?
>>
>>
>>
>> Putting aside the security of Spark/Mesos and how that keytab would get 
>> distributed, I'm just looking for a working POC.
>>
>>
>>
>> Is there a way to leverage the Broadcast capability to send a function that 
>> performs this?
>>
>>
>>
>> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.broadcast.Broadcast
>>
>>
>>
>> Ideally, I'd love for this to not incur much overhead and to simply allow
>> me to work around the absent Kerberos support...
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Dave
>>
>>