Re: sanboxing spark executors
This is not an easy problem to solve.

Can you impersonate the user who provided the code? That is, if Joe provides the lambda function, it runs as Joe and so has Joe's permissions.

Steve is right: you'd have to get down to your cluster's security and authenticate the user before accepting the lambda code.

You may also want to run with a restricted subset of permissions (e.g. Joe is an admin, but he wants the code to run as if it came from an untrusted user; this gets a bit more interesting).

And this raises the question: how are you sharing your RDDs across multiple users? That, too, opens up a security question or two.

> On Nov 4, 2016, at 6:13 PM, blazespinnaker wrote:
>
> In particular, we need to make sure the RDDs execute the lambda functions
> securely as they are provided by user code.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/sanboxing-spark-executors-tp28014p28024.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
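The "restricted subset of permissions" idea above can be pictured as a set intersection: untrusted code submitted by a user gets only the permissions that are both held by that user and allowed by the sandbox. The following is a minimal sketch of that idea only. The permission names and the `effective_permissions` helper are hypothetical and are not part of any Spark or Hadoop API.

```python
# Hypothetical permission names, for illustration only.
# Nothing here is a Spark, Hadoop, or Alluxio API.
SANDBOX_PERMISSIONS = frozenset({"read:own-data"})


def effective_permissions(user_permissions, sandbox=SANDBOX_PERMISSIONS):
    """Return the permissions that untrusted code may use on behalf of a user.

    The result is the intersection of what the user holds and what the
    sandbox allows, so an admin's code still runs with the restricted set.
    """
    return frozenset(user_permissions) & sandbox


# Joe is an admin, but his lambda should run as if it were untrusted.
joe = {"read:own-data", "write:own-data", "admin:cluster"}
print(sorted(effective_permissions(joe)))
```

In a real cluster the intersection would have to be enforced by the cluster's own security layer (for example, by running the job as a separate low-privilege identity), not by the submitted code itself.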
Re: sanboxing spark executors
Mesos will let you run in Docker containers, so you get filesystem isolation, and we're about to merge CNI support (https://github.com/apache/spark/pull/15740), which would allow you to set up network policies. Depending on your requirements, though, you might be able to achieve whatever network isolation you need without CNI.

As for unauthenticated HDFS clusters, I would recommend against running untrusted code on the same network as your secure HDFS cluster.

On Fri, Nov 4, 2016 at 4:13 PM, blazespinnaker wrote:
> In particular, we need to make sure the RDDs execute the lambda functions
> securely as they are provided by user code.

--
Michael Gummelt
Software Engineer
Mesosphere
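As a rough illustration of the Docker route described above, the sketch below builds a `spark-submit` command line that asks Spark on Mesos to run executors inside a Docker image through the `spark.mesos.executor.docker.image` property. The helper name, the master URL, the image name, and the application path are all placeholder assumptions, and any network policy would still have to be configured separately.

```python
import shlex


def mesos_docker_submit_args(master_url, image, app_path):
    """Build a spark-submit argv that runs executors in a Docker image on Mesos."""
    return [
        "spark-submit",
        "--master", master_url,
        # Tells Spark on Mesos which Docker image the executors run in.
        "--conf", f"spark.mesos.executor.docker.image={image}",
        app_path,
    ]


# All values below are placeholders, not taken from the thread.
args = mesos_docker_submit_args(
    "mesos://mesos-master.example.com:5050",
    "example/spark-sandbox:latest",
    "untrusted_job.py",
)
print(shlex.join(args))
```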
Re: sanboxing spark executors
In particular, we need to make sure the RDDs execute the lambda functions securely, since they are provided by user code.
Re: sanboxing spark executors
Hi,

If you are using the latest Alluxio release (1.3.0), authorization is enabled by default, preventing users from accessing data they do not have permission for. For older versions, you will need to enable the security flag. The documentation on security <http://www.alluxio.org/docs/master/en/Security.html> has more details.

Hope this helps,
Calvin

On Fri, Nov 4, 2016 at 6:31 AM, Andrew Holway <andrew.hol...@otternetworks.de> wrote:
> I think running it on a Mesos cluster could give you better control over
> this kind of stuff.
>
> On Fri, Nov 4, 2016 at 7:41 AM, blazespinnaker wrote:
>> Is there a good method / discussion / documentation on how to sandbox a
>> Spark executor? Assume the code is untrusted and you don't want it to be
>> able to make unvalidated network connections or do unvalidated
>> alluxio/hdfs/file io.
Re: sanboxing spark executors
I think running it on a Mesos cluster could give you better control over this kind of stuff.

On Fri, Nov 4, 2016 at 7:41 AM, blazespinnaker wrote:
> Is there a good method / discussion / documentation on how to sandbox a
> Spark executor? Assume the code is untrusted and you don't want it to be
> able to make unvalidated network connections or do unvalidated
> alluxio/hdfs/file io.

--
Otter Networks UG
http://otternetworks.de
Gotenstraße 17
10829 Berlin
Re: sanboxing spark executors
> On 4 Nov 2016, at 06:41, blazespinnaker wrote:
>
> Is there a good method / discussion / documentation on how to sandbox a
> Spark executor? Assume the code is untrusted and you don't want it to be
> able to make unvalidated network connections or do unvalidated
> alluxio/hdfs/file io.

Use Kerberos to authenticate connections to HDFS, HBase, and Hive. When Kerberos is enabled, Spark processes (indeed, all YARN processes) run as different users in the cluster, which gives you isolation there too.

There is no easy way to monitor or block general outbound network connections, though.
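For the Kerberos suggestion above, one way a job might be submitted to a Kerberized YARN cluster is with `spark-submit`'s `--principal` and `--keytab` options. The sketch below only assembles such a command line. The helper name, principal, keytab path, and application path are placeholder assumptions, and the cluster itself must already be configured for Kerberos.

```python
import shlex


def kerberos_yarn_submit_args(principal, keytab_path, app_path):
    """Build a spark-submit argv for a YARN cluster secured with Kerberos."""
    return [
        "spark-submit",
        "--master", "yarn",
        # Kerberos identity the application logs in as, and its keytab file.
        "--principal", principal,
        "--keytab", keytab_path,
        app_path,
    ]


# Placeholder identity and paths, not taken from the thread.
cmd = kerberos_yarn_submit_args(
    "joe@EXAMPLE.COM",
    "/etc/security/keytabs/joe.keytab",
    "untrusted_job.py",
)
print(shlex.join(cmd))
```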
sanboxing spark executors
Is there a good method, discussion, or documentation on how to sandbox a Spark executor? Assume the code is untrusted and you don't want it to be able to make unvalidated network connections or do unvalidated Alluxio/HDFS/file I/O.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/sanboxing-spark-executors-tp28014.html