[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963446#comment-15963446
 ] 

Michael Gummelt edited comment on SPARK-16742 at 4/10/17 8:18 PM:
------------------------------------------------------------------

[~vanzin]

bq. The most basic feature needed for any kerberos-related work is user 
isolation (different users cannot mess with each others' processes). I was 
under the impression that Mesos supported that.

Mesos of course supports configuring the Linux user that a process runs as.  But 
in Spark, this isn't currently derived from the Kerberos principal.  It's 
configured by the submitting user, and the *Mesos* principal of the scheduler, 
along with ACLs configured in Mesos, is what determines which Linux users are 
allowed.  That's why I was asking about {{hadoop.security.auth_to_local}}, to 
understand how YARN determines what Linux user to run executors as.  It would 
be a vulnerability, for example, if the Linux user for the executors were simply 
derived from that of the driver, because two human users running as the same 
Linux user, but logged in via different Kerberos principals, would be able to 
see each other's tokens.
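
To make the concern concrete, here is a rough sketch in Python.  It is not 
Spark, Mesos, or Hadoop code: {{local_user}} and {{shared_linux_users}} are 
hypothetical helpers, the realm and principals are made up, and the mapping 
only approximates what I understand the DEFAULT auth_to_local rule to do 
(take the first component of the principal when the realm matches).

```python
from collections import defaultdict


def local_user(principal, default_realm):
    """Hypothetical, simplified stand-in for a DEFAULT-style mapping:
    use the first component of the principal if the realm matches."""
    name, _, realm = principal.partition("@")
    if realm != default_realm:
        raise ValueError(f"no mapping rule matches {principal}")
    return name.split("/", 1)[0]


def shared_linux_users(jobs):
    """jobs: iterable of (kerberos_principal, linux_user) pairs.
    Returns the Linux users that more than one principal runs as."""
    principals_by_user = defaultdict(set)
    for principal, user in jobs:
        principals_by_user[user].add(principal)
    return {
        user: sorted(principals)
        for user, principals in principals_by_user.items()
        if len(principals) > 1
    }


principals = ["alice@EXAMPLE.COM", "bob@EXAMPLE.COM"]

# Linux user chosen independently of the Kerberos principal (assumed setup):
# both jobs end up as the same Linux user and can read each other's tokens.
configured = [(p, "spark") for p in principals]
print(shared_linux_users(configured))

# Linux user derived from the principal: no two principals share a user.
derived = [(p, local_user(p, "EXAMPLE.COM")) for p in principals]
print(shared_linux_users(derived))
```

In the first case both principals collide on one Linux user, which is the 
situation described above; in the second case the mapping keeps them apart.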

bq. I don't know where this notion that cluster mode requires you to distribute 
keytabs comes from

As you said, it's mostly the renewal use case that requires distributing the 
keytab, but that's not all.  In many Mesos setups, and certainly in DC/OS, the 
submitting user might not already be kinit'd.  They may be running from outside 
the datacenter entirely, without network access to the KDC.


> Kerberos support for Spark on Mesos
> -----------------------------------
>
>                 Key: SPARK-16742
>                 URL: https://issues.apache.org/jira/browse/SPARK-16742
>             Project: Spark
>          Issue Type: New Feature
>          Components: Mesos
>            Reporter: Michael Gummelt
>
> We at Mesosphere have written Kerberos support for Spark on Mesos.  We'll be 
> contributing it to Apache Spark soon.
> Mesosphere design doc: 
> https://docs.google.com/document/d/1xyzICg7SIaugCEcB4w1vBWp24UDkyJ1Pyt2jtnREFqc/edit#heading=h.tdnq7wilqrj6
> Mesosphere code: 
> https://github.com/mesosphere/spark/commit/73ba2ab8d97510d5475ef9a48c673ce34f7173fa


