Dear Spark Users,

We are trying to launch Spark on Mesos from within a Docker container.
Because the Spark executors need to connect back to the Spark driver, a lot
of port mapping is needed to make that work. We have mapped the ports that
we could find on the Spark configuration documentation page.

spark-2.1.0-bin-spark-2.1/bin/spark-submit \
  --conf 'spark.driver.host'=<host server ip> \
  --conf 'spark.blockManager.port'='40285' \
  --conf 'spark.driver.bindAddress'='0.0.0.0' \
  --conf 'spark.driver.port'='40284' \
  --conf 'spark.mesos.executor.docker.volumes'='spark-2.1.0-bin-spark-2.1:/spark-2.1.0-bin-spark-2.1' \
  --conf 'spark.mesos.gpus.max'='2' \
  --conf 'spark.mesos.containerizer'='docker' \
  --conf 'spark.mesos.executor.docker.image'='docker.drive.ai/spark_gpu_experiment:latest' \
  --master 'mesos://mesos_master_dev:5050' \
  -v eval.py
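
For illustration, the port mapping described above might look roughly like the sketch below. It only builds and prints a `docker run` command that publishes the two fixed ports from the spark-submit call; the image name `spark-driver-image:latest` is a hypothetical placeholder, not something from our actual setup.

```shell
# Fixed ports taken from the spark-submit command above
DRIVER_PORT=40284          # spark.driver.port
BLOCK_MANAGER_PORT=40285   # spark.blockManager.port

# Hypothetical name for the image that runs the Spark driver
DRIVER_IMAGE="spark-driver-image:latest"

# Build one "-p host:container" flag per fixed port
publish_args=""
for port in "$DRIVER_PORT" "$BLOCK_MANAGER_PORT"; do
  publish_args="${publish_args:+$publish_args }-p $port:$port"
done

# Print the command instead of running it, so it can be reviewed first
echo "docker run $publish_args $DRIVER_IMAGE"
```

Any port that stays dynamic cannot be published this way, which is why every port the driver listens on has to be fixed first.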


When we launched Spark this way, the Mesos master log (below) suggests that
the master is trying to send the offer back to the framework on port 33978,
which turns out to be a dynamically assigned port. The job fails at this
point, apparently because the offer cannot reach the container. To expose
that port from the container, we first need to make it fixed. Does anyone
know how to fix that port in the Spark configuration? Any other advice on
launching Spark on Mesos from within a Docker container would be greatly
appreciated.

I1230 12:53:54.758297  9571 master.cpp:2424] Received SUBSCRIBE call
for framework 'eval.py' at
scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:54.758608  9571 master.cpp:2500] Subscribing framework
eval.py with checkpointing disabled and capabilities [ GPU_RESOURCES ]
I1230 12:53:54.760036  9569 hierarchical.cpp:271] Added framework
993198d1-7393-4656-9f75-4f22702609d0-0233
I1230 12:53:54.761533  9549 master.cpp:5709] Sending 1 offers to framework
993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at
scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@<some ip>:33978
E1230 12:53:57.757814  9573 process.cpp:2105] Failed to shutdown
socket with fd 22: Transport endpoint is not connected
I1230 12:53:57.758314  9543 master.cpp:1284] Framework
993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at
scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
disconnected
I1230 12:53:57.758378  9543 master.cpp:2725] Disconnecting framework
993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at
scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.758411  9543 master.cpp:2749] Deactivating framework
993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at
scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.758582  9548 hierarchical.cpp:382] Deactivated
framework 993198d1-7393-4656-9f75-4f22702609d0-0233
W1230 12:53:57.758915  9543 master.hpp:2113] Master attempted to send
message to disconnected framework
993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at
scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.759140  9543 master.cpp:1297] Giving framework
993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at
scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978 0ns
to failover
I1230 12:53:57.760573  9561 master.cpp:5561] Framework failover
timeout, removing framework 993198d1-7393-4656-9f75-4f22702609d0-0233
(eval.py) at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.760648  9561 master.cpp:6296] Removing framework
993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at
scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.761493  9571 hierarchical.cpp:333] Removed framework
993198d1-7393-4656-9f75-4f22702609d0-0233
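
One avenue that may be worth checking: the `scheduler-...@172.17.0.12:33978` endpoint in the log belongs to libprocess, the messaging layer underneath the Mesos scheduler driver, and libprocess reads its listening port and advertised address from environment variables rather than from Spark configuration. The sketch below is untested against this setup; port 40286 and the `HOST_SERVER_IP` variable are hypothetical values chosen for illustration.

```shell
# Hypothetical fixed port for the Mesos scheduler endpoint (not from the post)
SCHEDULER_PORT=40286

# libprocess picks up its listening port from this variable
export LIBPROCESS_PORT="$SCHEDULER_PORT"

# Address the Mesos master should use to reach the scheduler;
# HOST_SERVER_IP is a placeholder that must be set to the real host IP
export LIBPROCESS_ADVERTISE_IP="${HOST_SERVER_IP:-<host server ip>}"

echo "scheduler would advertise $LIBPROCESS_ADVERTISE_IP:$LIBPROCESS_PORT"
```

These variables would need to be set in the environment of the process that runs spark-submit, and the chosen port would also have to be published from the container alongside 40284 and 40285.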

