Re: sanboxing spark executors

2016-11-08 Thread Michael Segel
Not that easy of a problem to solve… 

Can you impersonate the user who provided the code? 

I mean if Joe provides the lambda function, then it runs as Joe so it has joe’s 
permissions. 

Steve is right, you’d have to get down to your cluster’s security and 
authenticate the user before accepting the lambda code. You may also want to 
run with a restricted subset of permissions. 
(e.g. Joe is an admin, but he wants it to run as if its an untrusted user… this 
gets a bit more interesting.) 

And this beg’s the question… 

How are you sharing your RDDs across multiple users?  This too opens up a 
security question or two… 



> On Nov 4, 2016, at 6:13 PM, blazespinnaker  wrote:
> 
> In particular, we need to make sure the RDDs execute the lambda functions
> securely as they are provided by user code.
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/sanboxing-spark-executors-tp28014p28024.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> 



Re: sanboxing spark executors

2016-11-04 Thread Michael Gummelt
Mesos will let you run in docker containers, so you get filesystem
isolation, and we're about to merge CNI support:
https://github.com/apache/spark/pull/15740, which would allow you to set up
network policies.  Though you might be able to achieve whatever network
isolation you need without CNI, depending on your requirements.

As far as unauthenticated HDFS clusters, I would recommend against running
untrusted code on the same network as your secure HDFS cluster.

On Fri, Nov 4, 2016 at 4:13 PM, blazespinnaker 
wrote:

> In particular, we need to make sure the RDDs execute the lambda functions
> securely as they are provided by user code.
>
>
>
> --
> View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/sanboxing-spark-executors-tp28014p28024.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: sanboxing spark executors

2016-11-04 Thread blazespinnaker
In particular, we need to make sure the RDDs execute the lambda functions
securely as they are provided by user code.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/sanboxing-spark-executors-tp28014p28024.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: sanboxing spark executors

2016-11-04 Thread Calvin Jia
Hi,

If you are using the latest Alluxio release (1.3.0), authorization is
enabled, preventing users from accessing data they do not have permissions
to. For older versions, you will need to enable the security flag. The
documentation
on security  has more
details.

Hope this helps,
Calvin

On Fri, Nov 4, 2016 at 6:31 AM, Andrew Holway <
andrew.hol...@otternetworks.de> wrote:

> I think running it on a Mesos cluster could give you better control over
> this kinda stuff.
>
>
> On Fri, Nov 4, 2016 at 7:41 AM, blazespinnaker 
> wrote:
>
>> Is there a good method / discussion / documentation on how to sandbox a
>> spark
>> executor?   Assume the code is untrusted and you don't want it to be able
>> to
>> make un validated network connections or do unvalidated alluxio/hdfs/file
>> io.
>>
>>
>>
>>
>> --
>> View this message in context: http://apache-spark-user-list.
>> 1001560.n3.nabble.com/sanboxing-spark-executors-tp28014.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>
>
>
> --
> Otter Networks UG
> http://otternetworks.de
> Gotenstraße 17
> 10829 Berlin
>


Re: sanboxing spark executors

2016-11-04 Thread Andrew Holway
I think running it on a Mesos cluster could give you better control over
this kinda stuff.


On Fri, Nov 4, 2016 at 7:41 AM, blazespinnaker 
wrote:

> Is there a good method / discussion / documentation on how to sandbox a
> spark
> executor?   Assume the code is untrusted and you don't want it to be able
> to
> make un validated network connections or do unvalidated alluxio/hdfs/file
> io.
>
>
>
>
> --
> View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/sanboxing-spark-executors-tp28014.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Otter Networks UG
http://otternetworks.de
Gotenstraße 17
10829 Berlin


Re: sanboxing spark executors

2016-11-04 Thread Steve Loughran

> On 4 Nov 2016, at 06:41, blazespinnaker  wrote:
> 
> Is there a good method / discussion / documentation on how to sandbox a spark
> executor?   Assume the code is untrusted and you don't want it to be able to
> make un validated network connections or do unvalidated alluxio/hdfs/file


use Kerberos to auth HDFS connections, HBase, Hive. When enabled spark 
processes (all yarn processes) will run as different users in the cluster for 
isolation there too.

no easy way to monitor/block general outbound network connections though.  

> io.
> 
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/sanboxing-spark-executors-tp28014.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> 
> 


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org