I need pretty much exactly this kind of feature too.

On Fri, Jun 26, 2015 at 21:17, Dave Ariens <dari...@blackberry.com> wrote:
> Hi Timothy,
>
> Because I'm running Spark on Mesos alongside a secured Hadoop cluster, I
> need to ensure that my tasks running on the slaves perform a Kerberos
> login before accessing any HDFS resources. To log in, they just need the
> name of the principal (username) and a keytab file. Then they just need to
> invoke the following Java:
>
>     import org.apache.hadoop.security.UserGroupInformation
>     UserGroupInformation.loginUserFromKeytab(adminPrincipal, adminKeytab)
>
> This is done in the driver in my Gist below, but I don't know how to run
> it within each executor on the slaves as tasks are run.
>
> Any help would be appreciated!
>
> *From:* Timothy Chen [mailto:t...@mesosphere.io]
> *Sent:* Friday, June 26, 2015 12:50 PM
> *To:* Dave Ariens
> *Cc:* user@spark.apache.org
> *Subject:* Re: Accessing Kerberos Secured HDFS Resources from Spark on Mesos
>
> Hi Dave,
>
> I don't understand Kerberos much, but if you know the exact steps that
> need to happen, I can see how we can make that happen with the Spark
> framework.
>
> Tim
>
> On Jun 26, 2015, at 8:49 AM, Dave Ariens <dari...@blackberry.com> wrote:
>
> I understand that Kerberos support for accessing Hadoop resources in Spark
> only works when running Spark on YARN. However, I'd really like to hack
> something together for Spark on Mesos running alongside a secured Hadoop
> cluster. My simplified application (gist:
> https://gist.github.com/ariens/2c44c30e064b1790146a) receives a Kerberos
> principal and keytab when submitted. The static main method currently calls
> UserGroupInformation.loginUserFromKeytab(userPrincipal, userKeytab) and
> authenticates to Hadoop. This works on YARN (curiously, even without
> having to kinit first), but not on Mesos. Is there a way to have the
> slaves running the tasks perform the same Kerberos login before they
> attempt to access HDFS?
>
> Putting aside the security of Spark/Mesos and how that keytab would get
> distributed, I'm just looking for a working POC.
>
> Is there a way to leverage the Broadcast capability to send a function that
> performs this?
>
> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.broadcast.Broadcast
>
> Ideally, I'd love for this to not incur much overhead and just simply allow
> me to work around the absent Kerberos support...
>
> Thanks,
>
> Dave
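
For the POC, one possible shape is a guard that each task calls before touching HDFS, so that the keytab login happens at most once per executor JVM instead of being shipped around as a broadcast function. The sketch below is only that, a sketch: `ExecutorLogin`, `LoginAction` and `ensureLoggedIn` are hypothetical names I made up for illustration, the login call itself is passed in rather than hard-coded so the snippet has no Hadoop dependency, and I have not verified that this is sufficient on Mesos. It also assumes the keytab is already readable at the same path on every slave.

```java
// Sketch only: a once-per-JVM login guard. All names here are hypothetical.
final class ExecutorLogin {

    /**
     * Hypothetical hook standing in for the real login call. In an actual
     * job the body would be
     * UserGroupInformation.loginUserFromKeytab(principal, keytabPath).
     */
    interface LoginAction {
        void login(String principal, String keytabPath) throws Exception;
    }

    // True once a login has succeeded in this JVM (i.e. in this executor).
    private static boolean loggedIn = false;

    private ExecutorLogin() {}

    /**
     * Runs the login action the first time it is called in this JVM and is a
     * no-op afterwards. Synchronized so that concurrent tasks in the same
     * executor do not log in twice.
     */
    static synchronized void ensureLoggedIn(String principal,
                                            String keytabPath,
                                            LoginAction action) {
        if (loggedIn) {
            return;
        }
        try {
            action.login(principal, keytabPath);
        } catch (Exception e) {
            // A failed attempt leaves loggedIn false, so a later task retries.
            throw new IllegalStateException(
                "Kerberos login failed for " + principal, e);
        }
        loggedIn = true;
    }
}
```

In a job, the idea would be to call `ExecutorLogin.ensureLoggedIn(principal, keytabPath, (p, k) -> UserGroupInformation.loginUserFromKeytab(p, k))` as the first statement of the function passed to something like `mapPartitions`, before any HDFS path is opened. After the first task on an executor, the cost is one synchronized boolean check, which should keep the overhead low.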