Dear Spark Users,

We are trying to launch Spark on Mesos from within a Docker container. Since the Spark executors need to talk back to the Spark driver, we have found that a lot of port mapping is needed to make that happen. We have mapped the ports we could find on the Spark configuration documentation page.
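For reference, here is a rough sketch of the kind of port publishing we mean on the Docker side. It only prints the `docker run` command line rather than running it. The two port numbers are the ones we pin in our spark-submit command (40284 and 40285); the image name is a hypothetical placeholder, not our real driver image.

```shell
#!/bin/sh
# Sketch only: print a `docker run` command that publishes the fixed Spark ports.
# DRIVER_PORT and BLOCK_MANAGER_PORT mirror spark.driver.port and
# spark.blockManager.port from our spark-submit command.
DRIVER_PORT=40284
BLOCK_MANAGER_PORT=40285
IMAGE="my-spark-driver-image"   # hypothetical image name (assumption)

build_docker_cmd() {
    # -p HOST:CONTAINER publishes a container port on the same host port
    printf 'docker run -p %s:%s -p %s:%s %s\n' \
        "$DRIVER_PORT" "$DRIVER_PORT" \
        "$BLOCK_MANAGER_PORT" "$BLOCK_MANAGER_PORT" \
        "$IMAGE"
}

build_docker_cmd
```

Any port that is chosen dynamically cannot be listed this way, which is why we want every port the driver listens on to be fixed.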
    spark-2.1.0-bin-spark-2.1/bin/spark-submit \
      --conf 'spark.driver.host'=<host server ip> \
      --conf 'spark.blockManager.port'='40285' \
      --conf 'spark.driver.bindAddress'='0.0.0.0' \
      --conf 'spark.driver.port'='40284' \
      --conf 'spark.mesos.executor.docker.volumes'='spark-2.1.0-bin-spark-2.1:/spark-2.1.0-bin-spark-2.1' \
      --conf 'spark.mesos.gpus.max'='2' \
      --conf 'spark.mesos.containerizer'='docker' \
      --conf 'spark.mesos.executor.docker.image'='docker.drive.ai/spark_gpu_experiment:latest' \
      --master 'mesos://mesos_master_dev:5050' \
      -v eval.py

When we launch Spark this way, the Mesos master log suggests that the master is trying to send the offer back to the framework at port 33978, which turns out to be a dynamic port. The job fails at this point, apparently because the offer cannot reach the container. To expose that port in the container, we first need to make it fixed. Does anyone know how to fix that port in the Spark configuration?
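To check which port the master is using for the scheduler on each run, something like the following can be used. It is a minimal sketch that assumes the master log lines contain an endpoint of the form scheduler-<uuid>@<address>:<port>, as in the log excerpt in this message; the function name `scheduler_ports` is our own.

```shell
#!/bin/sh
# Sketch only: list the distinct scheduler ports that appear in a Mesos
# master log read from standard input.
scheduler_ports() {
    # Match "scheduler-<uuid>@<address>:<port>", keep the part after the
    # last colon (the port), and remove duplicates.
    grep -o 'scheduler-[0-9a-f-]*@[^ ]*:[0-9]*' | sed 's/.*://' | sort -u
}
```

Usage would be along the lines of `scheduler_ports < mesos-master.INFO`, where the log file name is an assumption and depends on how the master is set up. If the port differs between runs, it is being assigned dynamically.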
Any other advice on how to launch Spark on Mesos from within a Docker container is greatly appreciated.

Mesos master log:

I1230 12:53:54.758297 9571 master.cpp:2424] Received SUBSCRIBE call for framework 'eval.py' at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:54.758608 9571 master.cpp:2500] Subscribing framework eval.py with checkpointing disabled and capabilities [ GPU_RESOURCES ]
I1230 12:53:54.760036 9569 hierarchical.cpp:271] Added framework 993198d1-7393-4656-9f75-4f22702609d0-0233
I1230 12:53:54.761533 9549 master.cpp:5709] Sending 1 offers to framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@<some ip>:33978
E1230 12:53:57.757814 9573 process.cpp:2105] Failed to shutdown socket with fd 22: Transport endpoint is not connected
I1230 12:53:57.758314 9543 master.cpp:1284] Framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978 disconnected
I1230 12:53:57.758378 9543 master.cpp:2725] Disconnecting framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.758411 9543 master.cpp:2749] Deactivating framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.758582 9548 hierarchical.cpp:382] Deactivated framework 993198d1-7393-4656-9f75-4f22702609d0-0233
W1230 12:53:57.758915 9543 master.hpp:2113] Master attempted to send message to disconnected framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.759140 9543 master.cpp:1297] Giving framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978 0ns to failover
I1230 12:53:57.760573 9561 master.cpp:5561] Framework failover timeout, removing framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.760648 9561 master.cpp:6296] Removing framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
I1230 12:53:57.761493 9571 hierarchical.cpp:333] Removed framework 993198d1-7393-4656-9f75-4f22702609d0-0233