I understand that Kerberos support for accessing Hadoop resources in Spark only works when running Spark on YARN. However, I'd really like to hack something together for Spark on Mesos running alongside a secured Hadoop cluster. My simplified application (gist: https://gist.github.com/ariens/2c44c30e064b1790146a) receives a Kerberos principal and keytab when submitted. The static main method then calls UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab) and authenticates to Hadoop. This works on YARN (curiously, even without having to kinit first), but not on Mesos. Is there a way to have the slaves running the tasks perform the same Kerberos login before they attempt to access HDFS?
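To make the question concrete, here is a rough sketch of the kind of "log in once per JVM" guard I imagine each slave would need to run before touching HDFS. KerberosLoginGuard and KeytabLogin are hypothetical names I made up for illustration (they are not part of Spark or Hadoop), and the real UserGroupInformation.loginUserFromKeytab call only appears in a comment so that the snippet compiles without Hadoop on the classpath. I have not verified that anything like this is sufficient on Mesos.

```java
/**
 * Hypothetical helper: runs a keytab login at most once per guard instance.
 * Held in a static field, that would mean at most once per executor JVM.
 */
final class KerberosLoginGuard {

    /** Hypothetical hook standing in for the actual Hadoop login call. */
    @FunctionalInterface
    interface KeytabLogin {
        void login(String principal, String keytabPath);
    }

    private final KeytabLogin login;
    private boolean loggedIn = false;

    KerberosLoginGuard(KeytabLogin login) {
        this.login = login;
    }

    /**
     * Performs the login if it has not succeeded yet.
     * Returns true if this call did the login, false if it was already done.
     * If the login throws, the flag stays false so a later call can retry.
     */
    synchronized boolean ensureLoggedIn(String principal, String keytabPath) {
        if (loggedIn) {
            return false;
        }
        login.login(principal, keytabPath);
        loggedIn = true;
        return true;
    }
}

// Assumed wiring with Hadoop (not compiled here); loginUserFromKeytab throws
// IOException, so it is wrapped in an unchecked exception:
//
//   new KerberosLoginGuard((principal, keytab) -> {
//       try {
//           UserGroupInformation.loginUserFromKeytab(principal, keytab);
//       } catch (IOException e) {
//           throw new UncheckedIOException(e);
//       }
//   });
```

The idea would be to call ensureLoggedIn(userPrincipal, userKeytab) at the start of whatever code runs on the slaves, assuming the keytab file is somehow readable there.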
Putting aside the security of Spark/Mesos and how that keytab would get distributed, I'm just looking for a working POC. Is there a way to leverage the Broadcast capability (https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.broadcast.Broadcast) to send a function that performs this login? Ideally, this would not incur much overhead and would simply let me work around the absent Kerberos support.

Thanks,
Dave