Hi,
You can use the Spark Kernel project (https://github.com/ibm-et/spark-kernel)
as a workaround of sorts. The Spark Kernel provides a generic solution to
dynamically interact with an Apache Spark cluster (think of a remote Spark
Shell). It serves as the driver application with which you can interact
remotely. Since the kernel, not your local machine, acts as the driver, the
executors' incoming connections terminate at the kernel inside the cluster
network, and your box only needs an outgoing connection to the kernel.
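
To make the direction of connectivity concrete, here is a rough sketch of
that pattern. The host, port, and line-based protocol below are invented
purely for illustration; they are not the Spark Kernel's actual interface,
which speaks the IPython message protocol over ZeroMQ:

    // Hypothetical sketch of the remote-driver pattern: the firewalled
    // machine only opens an OUTGOING connection, while the kernel process
    // inside the cluster network acts as the real Spark driver.
    import java.io.PrintWriter
    import java.net.Socket
    import scala.io.Source

    object RemoteDriverClient {
      def main(args: Array[String]): Unit = {
        // Outgoing connection from the firewalled box (hypothetical endpoint).
        val socket = new Socket("kernel.cluster.internal", 8000)
        val out = new PrintWriter(socket.getOutputStream, true)
        val in = Source.fromInputStream(socket.getInputStream)

        // Ship a code snippet for the remote driver to execute.
        out.println("sc.parallelize(1 to 100).sum()")

        // Read back the result; executors only ever talked to the kernel.
        println(in.getLines().next())
        socket.close()
      }
    }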
Yes, the driver has to be able to accept incoming connections. All the
executors connect back to the driver to send heartbeats, map output status,
and metrics. This is fundamental to how Spark works, and I don't know of a
way around it on the driver itself. You could look into something like
spark-jobserver (https://github.com/spark-jobserver/spark-jobserver), which
hosts the driver in its own process and is controlled through a REST API, so
the submitting machine only makes outgoing HTTP requests.
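
For reference, a job submitted to spark-jobserver implements its SparkJob
trait, and the jobserver owns the SparkContext (and therefore the driver).
Here is a minimal sketch modeled on the project's word-count example; the
package and trait names follow the jobserver docs of that era, so verify
them against the version you deploy:

    import com.typesafe.config.Config
    import org.apache.spark.SparkContext
    import spark.jobserver.{SparkJob, SparkJobInvalid, SparkJobValid, SparkJobValidation}

    // Minimal spark-jobserver job. The jobserver process hosts the driver,
    // so the machine submitting this job only makes outgoing REST calls.
    object WordCountJob extends SparkJob {

      // Reject the job up front if the expected config key is missing.
      override def validate(sc: SparkContext, config: Config): SparkJobValidation =
        if (config.hasPath("input.string")) SparkJobValid
        else SparkJobInvalid("config key input.string is missing")

      // The actual work, run on the jobserver-managed SparkContext.
      override def runJob(sc: SparkContext, config: Config): Any =
        sc.parallelize(config.getString("input.string").split(" ").toSeq)
          .countByValue()
    }

You would package this as a jar, upload it via the jobserver's REST
endpoints, and trigger runs the same way, so only outgoing HTTP ever leaves
the firewalled machine.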
I submit Spark jobs from a machine behind a firewall, and I can't open any
incoming connections to that box. Does the driver absolutely need to accept
incoming connections? Is there any workaround for that case?
Thanks.