Hi Dave, I don't understand Kerberos much, but if you know the exact steps that need to happen, I can see how we can make that happen with the Spark framework.
Tim

> On Jun 26, 2015, at 8:49 AM, Dave Ariens <dari...@blackberry.com> wrote:
>
> I understand that Kerberos support for accessing Hadoop resources in Spark
> only works when running Spark on YARN. However, I'd really like to hack
> something together for Spark on Mesos running alongside a secured Hadoop
> cluster. My simplified application (gist:
> https://gist.github.com/ariens/2c44c30e064b1790146a) receives a Kerberos
> principal and keytab when submitted. The static main method then calls
> UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab) and
> authenticates to Hadoop. This works on YARN (curiously, without even
> having to kinit first), but not on Mesos. Is there a way to have the
> slaves running the tasks perform the same Kerberos login before they
> attempt to access HDFS?
>
> Putting aside the security of Spark/Mesos and how that keytab would get
> distributed, I'm just looking for a working POC.
>
> Is there a way to leverage the Broadcast capability to send a function that
> performs this?
>
> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.broadcast.Broadcast
>
> Ideally, I'd love for this to not incur much overhead and just simply allow
> me to work around the absent Kerberos support...
>
> Thanks,
>
> Dave
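
For the proof of concept described in the quoted message, one possible approach is to run the keytab login inside the tasks themselves rather than through Broadcast, since a broadcast variable distributes a read-only value and does not by itself execute code on the slaves. Below is a minimal sketch of a guard that runs a login action once and skips it on later calls. `LoginGuard` and `ensureLoggedIn` are hypothetical names, not part of Spark or Hadoop, and the sketch deliberately uses only the JDK so it runs standalone. It assumes the real login action would be the `UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab)` call from the gist, and that the keytab is already readable at the same path on every slave.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not part of Spark or Hadoop): runs a login action
 * once and skips it on every later call.
 */
final class LoginGuard {
    private boolean loggedIn = false;

    /**
     * Runs loginAction if no earlier call has succeeded.
     * Returns true if the action ran in this call, false if it was skipped.
     * If the action throws, the guard stays "not logged in", so a later
     * call will try again.
     */
    public synchronized boolean ensureLoggedIn(Runnable loginAction) {
        if (loggedIn) {
            return false;
        }
        loginAction.run();
        loggedIn = true;
        return true;
    }

    public static void main(String[] args) {
        AtomicInteger attempts = new AtomicInteger();

        // Stand-in for the real login. In a Spark job this lambda would
        // (by assumption) wrap the call from the gist:
        //   UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab)
        Runnable login = () -> attempts.incrementAndGet();

        LoginGuard guard = new LoginGuard();

        // Simulate three tasks starting on the same executor JVM.
        guard.ensureLoggedIn(login);
        guard.ensureLoggedIn(login);
        guard.ensureLoggedIn(login);

        System.out.println("login attempts: " + attempts.get()); // prints "login attempts: 1"
    }
}
```

In a Spark job, one instance of such a guard could be held in a static field, so that it exists once per executor JVM, and `ensureLoggedIn` could be called at the top of the function passed to `mapPartitions` or `foreachPartition`, before any HDFS access. Because the guard skips the login after the first success, the per-task overhead is one synchronized boolean check. Whether a login made this way is sufficient for Spark on Mesos has not been verified here; it is only a starting point for the POC.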