[jira] [Comment Edited] (SPARK-16742) Kerberos support for Spark on Mesos
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969341#comment-15969341 ]

Michael Gummelt edited comment on SPARK-16742 at 4/14/17 6:01 PM:
-----------------------------------------------------------------

[~jerryshao] No, but you can look at our solution here:
https://github.com/mesosphere/spark/commit/0a2cc4248039ca989e177e96e92a594a025661fe#diff-79391110e9f26657e415aa169a004998R129

The code we upstream will be quite different, but the delegation token handling will be similar.

> Kerberos support for Spark on Mesos
> -----------------------------------
>
>          Key: SPARK-16742
>          URL: https://issues.apache.org/jira/browse/SPARK-16742
>      Project: Spark
>   Issue Type: New Feature
>   Components: Mesos
>     Reporter: Michael Gummelt
>
> We at Mesosphere have written Kerberos support for Spark on Mesos. We'll be
> contributing it to Apache Spark soon.
> Mesosphere design doc:
> https://docs.google.com/document/d/1xyzICg7SIaugCEcB4w1vBWp24UDkyJ1Pyt2jtnREFqc/edit#heading=h.tdnq7wilqrj6
> Mesosphere code:
> https://github.com/mesosphere/spark/commit/73ba2ab8d97510d5475ef9a48c673ce34f7173fa

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-16742) Kerberos support for Spark on Mesos
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963446#comment-15963446 ]

Michael Gummelt edited comment on SPARK-16742 at 4/10/17 8:35 PM:
-----------------------------------------------------------------

[~vanzin]

bq. The most basic feature needed for any kerberos-related work is user isolation (different users cannot mess with each others' processes). I was under the impression that Mesos supported that.

Mesos of course supports configuring the Linux user that a process runs as. But in Spark, this isn't currently derived from the Kerberos principal; it's configured by the user. The scheduler's *Mesos* principal, along with ACLs configured in Mesos, is what determines which Linux users are allowed. That's why I was asking about {{hadoop.security.auth_to_local}}: to understand how YARN determines what Linux user to run executors as. It would be a vulnerability, for example, if the Linux user for the executors were simply derived from that of the driver, because two human users running as the same Linux user, but logged in via different Kerberos principals, would be able to see each other's tokens.

bq. I don't know where this notion that cluster mode requires you to distribute keytabs comes from

As you said, it's mostly the renewal use case that requires distributing the keytab, but that's not all. In many Mesos setups, and certainly in DC/OS, the submitting user might not already be kinit'd. They may be running from outside the datacenter entirely, without network access to the KDC.

You're right that we could implement cluster mode in some form, but I'd rather keep the initial PR small. I hope that's acceptable.
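For context on the {{hadoop.security.auth_to_local}} question: that property holds a list of rules mapping Kerberos principals to local Unix user names. The sketch below is a heavily simplified Python model of the default behavior (drop the realm for principals in the cluster's own realm), not Hadoop's actual rule parser; the realm name and the regex are illustrative assumptions.

```python
import re

# Assumed local realm, for illustration only.
LOCAL_REALM = "EXAMPLE.COM"

def auth_to_local(principal: str) -> str:
    """Simplified model of hadoop.security.auth_to_local's DEFAULT rule:
    map 'user@REALM' or 'service/host@REALM' to a short Unix user name
    when REALM is the cluster's own realm."""
    m = re.fullmatch(r"([^/@]+)(?:/[^@]+)?@(.+)", principal)
    if not m:
        raise ValueError(f"not a Kerberos principal: {principal}")
    short_name, realm = m.groups()
    if realm == LOCAL_REALM:
        return short_name  # DEFAULT rule: strip the realm
    raise ValueError(f"no auth_to_local rule matched realm {realm}")
```

Under this mapping, the Unix user owning the ApplicationMaster's token file is derived from the submitting Kerberos principal, which is exactly the property the comment above is asking about.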
[jira] [Comment Edited] (SPARK-16742) Kerberos support for Spark on Mesos
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962440#comment-15962440 ]

Michael Gummelt edited comment on SPARK-16742 at 4/10/17 5:28 AM:
-----------------------------------------------------------------

Hi [~vanzin],

[~ganger85] and Strat.io are pulling back their Mesos Kerberos implementation for now, and we at Mesosphere are about to submit a PR to upstream our implementation. I have a few questions I'd like to run by you to make sure that PR goes smoothly.

1) I've been following your comments on this Spark Standalone Kerberos PR: https://github.com/apache/spark/pull/17530. It looks like your concern is that in *cluster mode*, the keytab is written to a file on the host running the driver, and is owned by the user of the Spark Worker, which will be the same for each job. So jobs submitted by multiple users will be able to read each other's keytabs. In *client mode*, it looks like the delegation tokens are written to a file (HADOOP_TOKEN_FILE_LOCATION) on the host running the executor, which suffers from the same problem as the keytab in cluster mode. The problem, then, is that a kerberos-authenticated user submitting their job would be unaware that their credentials are being leaked to other users. Is this an accurate description of the issue?

2) I understand that YARN writes delegation tokens via {{amContainer.setTokens()}}, which ultimately results in the delegation token being written to a file owned by the submitting user. However, since the "submitting user" is a Kerberos user, not a Unix user, I'm assuming that {{hadoop.security.auth_to_local}} is what maps the Kerberos user to the Unix user who runs the ApplicationMaster and owns that file. Is that correct?

To avoid the shared-file problem for delegation tokens, our Mesos implementation currently has the executor issue an RPC call to fetch the delegation token from the driver. There is therefore no need for at-rest access control, and if in-motion interception is in the user's threat model, they can be sure to run Spark with SSL.

We avoid the shared-file problem for keytabs entirely, because there's no need to distribute the keytab, at least in client mode. Unlike YARN, the driver and the equivalent of the "ApplicationMaster" in Mesos are one and the same: they both exist in the same process, the {{spark-submit}} process. We're probably going to punt on cluster mode for now, just for simplicity, but we should be able to solve this in cluster mode as well, because unlike standalone, and much like YARN, Mesos controls what user the driver runs as.

What do you think of the above approach? If you see any blockers, I would very much appreciate teasing those out now rather than during the PR. Thanks!
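The token handoff described above (executor fetches tokens from the driver over RPC, so no token file ever lands on shared disk) can be sketched as follows. This is an illustrative in-process Python model, not the actual Spark/Mesos code; the class and method names are invented for the sketch.

```python
import base64
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Driver:
    """Stands in for the Spark driver, which holds the delegation
    tokens obtained by the (kinit'd) submitting user."""
    tokens: bytes  # opaque serialized Hadoop credentials

    def handle_fetch_tokens(self) -> str:
        # RPC handler: hand the tokens to a requesting executor,
        # base64-encoded for transport. Nothing is written to disk.
        return base64.b64encode(self.tokens).decode("ascii")

class Executor:
    """Stands in for an executor that fetches tokens at startup instead
    of reading a shared on-disk token file (the HADOOP_TOKEN_FILE_LOCATION
    problem described above)."""
    def __init__(self, driver_rpc: Callable[[], str]):
        self.driver_rpc = driver_rpc
        self.tokens: Optional[bytes] = None

    def start(self) -> None:
        # Only in-flight protection (SSL/TLS on the RPC channel) is
        # needed; there is no at-rest copy to protect.
        self.tokens = base64.b64decode(self.driver_rpc())

driver = Driver(tokens=b"serialized-delegation-tokens")
executor = Executor(driver_rpc=driver.handle_fetch_tokens)
executor.start()
```

The design choice being argued for is visible here: because the tokens live only in driver and executor memory, the shared-file ownership problem never arises.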
[jira] [Comment Edited] (SPARK-16742) Kerberos support for Spark on Mesos
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863371#comment-15863371 ]

Saisai Shao edited comment on SPARK-16742 at 2/13/17 9:34 AM:
-------------------------------------------------------------

The proposed solution is quite different from what exists in Spark on YARN. IIUC, this solution doesn't honor delegation tokens, and it wraps every HDFS operation with {{executeSecure}}. I suspect this approach requires other components, like SQL and streaming, to also know about the existence of such APIs and wrap their operations; if newly added code ignores this wrapper, it will lead to errors. From my understanding it is quite intrusive.

Also, how do you handle the principal and keytab for the driver/executors? Do you need to ship the keytab to every node, and who is responsible for that?

It also looks like your PR mainly focuses on user impersonation, which is slightly different from what this JIRA is about; your main requirement is dynamic proxy-user change. I would suggest using another JIRA to track that, since it is a little different from supporting Kerberos in Mesos.
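The intrusiveness being criticized above can be illustrated with a small Python sketch. This is a hypothetical model of the wrapping pattern, not real Spark or Hadoop API: {{execute_secure}} stands in for the proposal's {{executeSecure}}, and the point is that every call site must remember to use the wrapper, or it fails.

```python
from functools import wraps

# Stands in for a live Kerberos login / proxy-user context.
authenticated = False

def execute_secure(fn):
    """Run fn inside an 'authenticated' context (a stand-in for the
    proposed executeSecure wrapper around each HDFS operation)."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        global authenticated
        authenticated = True          # e.g. log in from keytab
        try:
            return fn(*args, **kwargs)
        finally:
            authenticated = False     # drop credentials afterwards
    return wrapper

def read_hdfs_file(path: str) -> str:
    # Any call site that skips the wrapper raises, which is exactly
    # the burden described: every component (SQL, streaming, new code)
    # must know about and apply the wrapper.
    if not authenticated:
        raise PermissionError("not authenticated")
    return f"contents of {path}"

safe_read = execute_secure(read_hdfs_file)
```

A delegation-token design avoids this: credentials are attached once to the job's ambient context, so individual filesystem calls need no per-call wrapping.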