So correct me if I'm wrong: it sounds like all you need is a principal user name and a keytab file downloaded, right?
I'm adding support to the Spark framework to download additional files alongside your executor and driver, and one workaround is to specify a user principal and a keytab file that can be downloaded and then used in your driver, since you can expect it to be in the current working directory. I suspect there might be other setup needed, but if you guys are available we can work together to get something working.

Tim

On Fri, Jun 26, 2015 at 12:23 PM, Olivier Girardot <ssab...@gmail.com> wrote:
> I would pretty much need exactly this kind of feature too
>
> On Fri, Jun 26, 2015 at 21:17, Dave Ariens <dari...@blackberry.com> wrote:
>
>> Hi Timothy,
>>
>> Because I'm running Spark on Mesos alongside a secured Hadoop cluster, I
>> need to ensure that my tasks running on the slaves perform a Kerberos login
>> before accessing any HDFS resources. To log in, they just need the name of
>> the principal (username) and a keytab file. Then they just need to invoke
>> the following Java:
>>
>> import org.apache.hadoop.security.UserGroupInformation
>> UserGroupInformation.loginUserFromKeytab(adminPrincipal, adminKeytab)
>>
>> This is done in the driver in my Gist below, but I don't know how to run
>> it within each executor on the slaves as tasks are run.
>>
>> Any help would be appreciated!
>>
>> *From:* Timothy Chen [mailto:t...@mesosphere.io]
>> *Sent:* Friday, June 26, 2015 12:50 PM
>> *To:* Dave Ariens
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: Accessing Kerberos Secured HDFS Resources from Spark on Mesos
>>
>> Hi Dave,
>>
>> I don't understand Kerberos much, but if you know the exact steps that
>> need to happen I can see how we can make that happen with the Spark
>> framework.
>>
>> Tim
>>
>> On Jun 26, 2015, at 8:49 AM, Dave Ariens <dari...@blackberry.com> wrote:
>>
>> I understand that Kerberos support for accessing Hadoop resources in Spark
>> only works when running Spark on YARN.
>> However, I'd really like to hack
>> something together for Spark on Mesos running alongside a secured Hadoop
>> cluster. My simplified application (gist:
>> https://gist.github.com/ariens/2c44c30e064b1790146a) receives a Kerberos
>> principal and keytab when submitted. The static main method then performs a
>> UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab) and
>> authenticates to the Hadoop cluster. This works on YARN (curiously, without
>> even having to kinit first), but not on Mesos. Is there a way to have the
>> slaves running the tasks perform the same Kerberos login before they
>> attempt to access HDFS?
>>
>> Putting aside the security of Spark/Mesos and how that keytab would get
>> distributed, I'm just looking for a working POC.
>>
>> Is there a way to leverage the Broadcast capability to send a function
>> that performs this?
>>
>> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.broadcast.Broadcast
>>
>> Ideally, I'd love for this to not incur much overhead and just simply
>> allow me to work around the absent Kerberos support...
>>
>> Thanks,
>>
>> Dave
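Putting Dave's two questions together, here is a minimal sketch of what such a POC might look like, assuming the keytab file is readable on the driver. Note the broadcast-the-keytab-bytes approach, the object name, and the argument layout are illustrative, not from the thread, and since Spark gives no guarantee that every executor JVM computes a partition of the dummy RDD, this is strictly best-effort:

```scala
import java.nio.file.{Files, Paths}
import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical POC: ship the keytab bytes to the executors via a broadcast
// variable, write them to a local temp file on each slave, and perform the
// Kerberos login there before any HDFS access.
object KerberosOnMesosPoc {
  def main(args: Array[String]): Unit = {
    val Array(principal, keytabPath, hdfsPath) = args
    val sc = new SparkContext(new SparkConf().setAppName("kerberos-poc"))

    // Driver-side login (this part already works, per the gist).
    UserGroupInformation.loginUserFromKeytab(principal, keytabPath)

    // Broadcast the raw keytab bytes so the slaves don't need the file
    // distributed out of band.
    val keytabBytes = sc.broadcast(Files.readAllBytes(Paths.get(keytabPath)))

    // Best-effort: run a no-op job whose tasks log in on the slaves.
    // loginUserFromKeytab sets JVM-global login state, so repeating it in
    // partitions that land on the same executor is redundant but harmless.
    sc.parallelize(1 to 100, 100).foreachPartition { _ =>
      val localKeytab = Files.createTempFile("poc", ".keytab")
      Files.write(localKeytab, keytabBytes.value)
      UserGroupInformation.loginUserFromKeytab(principal, localKeytab.toString)
    }

    // Only now touch HDFS from the executors.
    println(sc.textFile(hdfsPath).count())
    sc.stop()
  }
}
```

The important caveat is the ordering: the login has to happen in an earlier job than the one that opens HDFS files, because a Hadoop RDD opens its record reader before any user closure runs, so wrapping the read itself in a login closure would be too late.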