All,
I was wondering if any of you have solved this problem:

I have pyspark (IPython mode) running in a Docker container, talking to
a YARN cluster (the AM/executors are NOT running in Docker).

When I start pyspark in the Docker container, the driver binds to port 49460.

Once the app is submitted to YARN, the ApplicationMaster (AM) on the
cluster side fails with the following error message:

ERROR yarn.ApplicationMaster: Failed to connect to driver at :49460

This makes sense: the AM is trying to talk to the container directly,
which it cannot do; it should be talking to the Docker host instead.

Question:
How do we make the Spark AM talk to host1:port1 on the Docker host (not
the container), which would then route the traffic to the container
running pyspark on host2:port2?

One solution I could think of: after starting the driver (say on
hostA:portA), and before submitting the app to YARN, we could reset the
driver's advertised host/port to the host machine's IP/port. The AM
could then talk to the host machine's IP/port, which would be mapped to
the container. A rough sketch of this idea is below.
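Something along these lines might work, assuming a Spark version that
separates the bind address from the advertised address via
spark.driver.bindAddress (2.1+, if I'm not mistaken). Untested sketch;
the hostname and ports below are placeholders:

    from pyspark import SparkConf
    from pyspark.sql import SparkSession

    conf = (
        SparkConf()
        .setMaster("yarn")
        # Bind inside the container on all interfaces.
        .set("spark.driver.bindAddress", "0.0.0.0")
        # Advertise the Docker host's address to the AM/executors
        # (placeholder hostname).
        .set("spark.driver.host", "docker-host.example.com")
        # Fix the driver port so Docker can publish it.
        .set("spark.driver.port", "49460")
        # The block manager also needs a fixed, published port.
        .set("spark.driver.blockManager.port", "49461")
    )

    spark = SparkSession.builder.config(conf=conf).getOrCreate()

The container ports would then have to be published one-to-one, e.g.
docker run -p 49460:49460 -p 49461:49461, since the port the AM dials
on the host must match the port the driver advertises.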

Thoughts?
-- 
Thanks,
Ashwin
