Hi Jacob,

This post might give you a brief idea of the ports being used:

https://groups.google.com/forum/#!topic/spark-users/PN0WoJiB0TA
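
In case it helps in the meantime: some of the listening ports can be pinned through Spark configuration instead of being left ephemeral. The sketch below is only an illustration, not an exhaustive list; spark.driver.port is available in the 0.9.x line, but the file server, broadcast, and block manager port properties shown were only added in later releases, so please check the configuration page for the version you are running.

    import org.apache.spark.{SparkConf, SparkContext}

    // Pin the driver-side listening port so it is no longer ephemeral.
    // spark.driver.port exists in 0.9.x; the remaining properties are
    // assumptions based on later releases and should be verified against
    // your version's configuration docs.
    val conf = new SparkConf()
      .setMaster("spark://master:7077")
      .setAppName("port-pinning-sketch")
      .set("spark.driver.port", "51000")        // driver actor system (Akka)
      .set("spark.fileserver.port", "51100")    // HTTP file server (later releases)
      .set("spark.broadcast.port", "51200")     // HTTP broadcast server (later releases)
      .set("spark.blockManager.port", "51300")  // block manager (later releases)
    val sc = new SparkContext(conf)

With those pinned you only need to open the chosen ports between hosts, plus the standalone master port (7077 by default) and the worker port if you set one via SPARK_WORKER_PORT in spark-env.sh.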

On Fri, Apr 25, 2014 at 8:53 PM, Jacob Eisinger <jeis...@us.ibm.com> wrote:

> Howdy,
>
> We tried running Spark 0.9.1 stand-alone inside docker containers
> distributed over multiple hosts. This is complicated due to Spark opening
> up ephemeral / dynamic ports for the workers and the CLI.  To ensure our
> docker solution doesn't break Spark in unexpected ways and maintains a
> secure cluster, I am interested in understanding more about Spark's network
> architecture. I'd appreciate it if you could point us to any
> documentation!
>
> A couple of specific questions:
>
>    1. What are these ports being used for?
>    Checking out the code and running experiments, it looks like they are used
>    for asynchronous communication and for shuffling results around. Anything else?
>    2. How do you secure the network?
>    Network administrators tend to secure and monitor the network at the
>    port level. If these ports are dynamic and open randomly, firewalls are not
>    easily configured and security alarms are raised. Is there a way to limit
>    the range easily? (We did investigate setting the kernel parameter
>    ip_local_reserved_ports, but this is broken [1] on some versions of Linux's
>    cgroups.)
>
>
> Thanks,
> Jacob
>
> [1] https://github.com/lxc/lxc/issues/97
>
> Jacob D. Eisinger
> IBM Emerging Technologies
> jeis...@us.ibm.com - (512) 286-6075
