Hi,

You can use the Spark Kernel project (https://github.com/ibm-et/spark-kernel) as a workaround of sorts. The Spark Kernel provides a generic way to interact dynamically with an Apache Spark cluster (think of it as a remote Spark Shell): it acts as the driver application, and you send it Scala code to run against the cluster. You would still need to expose the Spark Kernel outside the firewall (similar to Kostas' suggestion about the jobserver), of course.
Signed,
Chip Senkbeil

On Thu Feb 05 2015 at 11:07:28 PM Kostas Sakellis <kos...@cloudera.com> wrote:

> Yes, the driver has to be able to accept incoming connections. All the
> executors connect back to the driver, sending heartbeats, map status, and
> metrics. It is critical, and I don't know of a way around it. You could
> look into using something like
> https://github.com/spark-jobserver/spark-jobserver, which could run
> outside the firewall. Then, from inside the firewall, you can make REST
> calls to the server.
>
> On Thu, Feb 5, 2015 at 5:03 PM, Kane Kim <kane.ist...@gmail.com> wrote:
>
>> I submit Spark jobs from a machine behind a firewall and can't open any
>> incoming connections to that box. Does the driver absolutely need to
>> accept incoming connections? Is there any workaround for that case?
>>
>> Thanks.
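To illustrate the jobserver suggestion above, here is a minimal sketch of the kind of REST calls you would make from inside the firewall. It assumes a spark-jobserver instance reachable at a hypothetical host/port, and the standard `/jars` (upload an application jar) and `/jobs` (submit a job by `appName` and `classPath`) routes from spark-jobserver's README; the app and class names below are placeholders, not anything from this thread.

```python
# Sketch: building the spark-jobserver REST endpoints you would hit with an
# HTTP client (e.g. curl or urllib) from behind the firewall. The base URL,
# app name, and job class are hypothetical placeholders.

def jar_upload_url(base: str, app_name: str) -> str:
    """URL to POST the application jar bytes to, registering it under app_name."""
    return f"{base}/jars/{app_name}"

def job_submit_url(base: str, app_name: str, class_path: str, sync: bool = False) -> str:
    """URL to POST to in order to run a job from a previously uploaded jar.

    sync=True appends sync=true, which makes the server block and return
    the job result in the response instead of a job id.
    """
    url = f"{base}/jobs?appName={app_name}&classPath={class_path}"
    return url + "&sync=true" if sync else url

if __name__ == "__main__":
    base = "http://jobserver.example.com:8090"  # hypothetical jobserver host
    print(jar_upload_url(base, "wordcount"))
    print(job_submit_url(base, "wordcount", "com.example.WordCountJob", sync=True))
```

Since only the jobserver needs to accept inbound connections, the machine behind the firewall only ever makes outbound HTTP requests, which is the point of Kostas' suggestion.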