I understand that Kerberos support for accessing Hadoop resources in Spark only 
works when running Spark on YARN.  However, I'd really like to hack something 
together for Spark on Mesos running alongside a secured Hadoop cluster.  My 
simplified application (gist: 
https://gist.github.com/ariens/2c44c30e064b1790146a) receives a Kerberos 
principal and keytab when submitted.  The static main method then calls 
UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab) and 
authenticates to Hadoop.  This works on YARN (curiously, without even having 
to kinit first), but not on Mesos.  Is there a way to have the slaves running 
the tasks perform the same Kerberos login before they attempt to access HDFS?
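
To make the question concrete, here is a minimal sketch of what "perform the login once per executor JVM" could look like. It is only an illustration, not something tested on Mesos. The names ExecutorLoginGuard, LoginAction and ensureLoggedIn are hypothetical and not part of Spark or Hadoop. The sketch assumes the keytab is already readable on every slave, and that the real login call would be UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab), shown here only in a comment so the block has no Hadoop dependency.

```java
/**
 * Hypothetical helper (not a Spark or Hadoop API): runs a login action at
 * most once per JVM, so every task in the same executor can call it cheaply.
 */
public final class ExecutorLoginGuard {

    /** Hypothetical callback type; allows the checked IOException thrown by a real login. */
    public interface LoginAction {
        void login() throws Exception;
    }

    // Per-JVM flag, guarded by the class lock taken in ensureLoggedIn.
    private static boolean loggedIn = false;

    private ExecutorLoginGuard() {}

    /**
     * Runs the action if no earlier call in this JVM has succeeded.
     * Returns true if this call performed the login, false if it was skipped.
     * If the action throws, the flag stays unset and a later call retries.
     */
    public static synchronized boolean ensureLoggedIn(LoginAction action) throws Exception {
        if (loggedIn) {
            return false;
        }
        // Assumption: in a real job this would be
        // UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab)
        action.login();
        loggedIn = true;
        return true;
    }
}
```

The idea would be to call ExecutorLoginGuard.ensureLoggedIn(...) at the top of the function passed to something like mapPartitions, before any HDFS access, so the login cost is paid once per executor rather than once per task. Whether the resulting credentials are actually picked up by the HDFS client on Mesos is exactly the part I am unsure about.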



Putting aside the security of Spark/Mesos and how that keytab would get 
distributed, I'm just looking for a working POC.



Is there a way to leverage the Broadcast capability to send a function that 
performs this?



https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.broadcast.Broadcast



Ideally, this would not incur much overhead and would simply let me work 
around the absent Kerberos support...



Thanks,



Dave
