Hi,

You can use the Spark Kernel project (https://github.com/ibm-et/spark-kernel)
as a workaround of sorts. The Spark Kernel provides a generic way to interact
dynamically with an Apache Spark cluster (think of it as a remote Spark
shell). It acts as the driver application, and you send it Scala code to run
against the cluster. You would still need to expose the Spark Kernel outside
the firewall (similar to Kostas' suggestion about the jobserver), of course.
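
For illustration, the code you ship to the kernel is just ordinary Spark
Scala, the same as you would type into a Spark shell. A minimal sketch,
assuming the kernel exposes a predefined SparkContext named `sc` the way the
shell does:

    // Ordinary Spark code sent to the kernel for execution; `sc` is the
    // SparkContext the kernel side already holds (as in the Spark shell).
    val rdd = sc.parallelize(1 to 1000)
    val evens = rdd.filter(_ % 2 == 0).count()
    println(s"even numbers: $evens")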
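
For completeness, the jobserver route Kostas mentioned boils down to plain
HTTP from inside the firewall. A rough sketch follows; the host, port, app
name, and job class are placeholders, and the /jars and /jobs endpoints are
as described in the jobserver README, so double-check them against the
version you deploy:

    import java.net.{HttpURLConnection, URL}
    import java.nio.file.{Files, Paths}
    import scala.io.Source

    object JobServerClient {
      // Placeholder address of a jobserver running outside the firewall.
      val base = "http://jobserver.example.com:8090"

      // Small helper: POST a raw body to the jobserver and return its reply.
      def post(path: String, body: Array[Byte]): String = {
        val conn = new URL(base + path).openConnection().asInstanceOf[HttpURLConnection]
        conn.setRequestMethod("POST")
        conn.setDoOutput(true)
        conn.getOutputStream.write(body)
        conn.getOutputStream.close()
        Source.fromInputStream(conn.getInputStream).mkString
      }

      def main(args: Array[String]): Unit = {
        // Upload the application jar under the name "myapp".
        post("/jars/myapp", Files.readAllBytes(Paths.get("target/myapp.jar")))
        // Run a job synchronously; the job's config goes in the request body.
        val result = post(
          "/jobs?appName=myapp&classPath=com.example.MyJob&sync=true",
          "input.string = a b c".getBytes("UTF-8"))
        println(result)
      }
    }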

Signed,
Chip Senkbeil

On Thu Feb 05 2015 at 11:07:28 PM Kostas Sakellis <kos...@cloudera.com>
wrote:

> Yes, the driver has to be able to accept incoming connections. All the
> executors connect back to the driver, sending heartbeats, map statuses,
> and metrics. This is essential, and I don't know of a way around it. You
> could look into something like
> https://github.com/spark-jobserver/spark-jobserver, which could run outside
> the firewall. Then, from inside the firewall, you can make REST calls to the
> server.
>
> On Thu, Feb 5, 2015 at 5:03 PM, Kane Kim <kane.ist...@gmail.com> wrote:
>
>> I submit a Spark job from a machine behind a firewall and can't open any
>> incoming connections to that box. Does the driver absolutely need to accept
>> incoming connections? Is there any workaround for this case?
>>
>> Thanks.
>>
>
>
