Hi Ji,

One way to fix the port is to set the LIBPROCESS_PORT environment variable on 
the executor when it is launched.
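
A minimal sketch of what that could look like, not a tested setup. The port 40286 is a hypothetical value chosen only for illustration, and using spark.executorEnv.LIBPROCESS_PORT to pass the variable to the executors is an assumption about how you would wire it in. Since the master log shows the scheduler address, the variable may also need to be exported in the environment that runs spark-submit, which the sketch does as well. The command is printed rather than executed.

```shell
# Hypothetical fixed port: any free port that the container also publishes.
LIBPROCESS_PORT=40286
# Exported so that a process started from this shell (for example the driver) inherits it.
export LIBPROCESS_PORT

# Assumption: spark.executorEnv.<NAME> passes an environment variable to the executors.
EXECUTOR_ENV_CONF="spark.executorEnv.LIBPROCESS_PORT=${LIBPROCESS_PORT}"

# Print the command instead of running it, so the sketch is safe to try.
echo spark-submit --conf "${EXECUTOR_ENV_CONF}" \
  --master mesos://mesos_master_dev:5050 eval.py
```

The same port would then have to be published by the container (for example with docker run -p), alongside the driver and block manager ports already listed in the command below.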

Tim


> On Dec 30, 2016, at 1:23 PM, Ji Yan <ji...@drive.ai> wrote:
> 
> Dear Spark Users,
> 
> We are trying to launch Spark on Mesos from within a Docker container. 
> Because the Spark executors need to talk back to the Spark driver, a lot of 
> port mapping is needed to make that happen. As far as we can tell, we have 
> mapped all the ports listed on the Spark configuration documentation page.
> 
>> spark-2.1.0-bin-spark-2.1/bin/spark-submit \
>>   --conf 'spark.driver.host'=<host server ip> \
>>   --conf 'spark.blockManager.port'='40285' \
>>   --conf 'spark.driver.bindAddress'='0.0.0.0' \
>>   --conf 'spark.driver.port'='40284' \
>>   --conf 'spark.mesos.executor.docker.volumes'='spark-2.1.0-bin-spark-2.1:/spark-2.1.0-bin-spark-2.1' \
>>   --conf 'spark.mesos.gpus.max'='2' \
>>   --conf 'spark.mesos.containerizer'='docker' \
>>   --conf 'spark.mesos.executor.docker.image'='docker.drive.ai/spark_gpu_experiment:latest' \
>>   --master 'mesos://mesos_master_dev:5050' \
>>   -v eval.py
> 
> When we launched Spark this way, the Mesos master log suggests that the 
> master is trying to send the offer back to the framework at port 33978, 
> which turns out to be a dynamic port. The job failed at this point, 
> apparently because the offer cannot reach the container. To expose that port 
> in the container, we first need to make it fixed. Does anyone know how to fix 
> that port in the Spark configuration? Any other advice on how to launch Spark 
> on Mesos from within a Docker container is greatly appreciated.
> 
> I1230 12:53:54.758297  9571 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'eval.py' at 
> scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
> I1230 12:53:54.758608  9571 master.cpp:2500] Subscribing framework eval.py 
> with checkpointing disabled and capabilities [ GPU_RESOURCES ]
> I1230 12:53:54.760036  9569 hierarchical.cpp:271] Added framework 
> 993198d1-7393-4656-9f75-4f22702609d0-0233
> I1230 12:53:54.761533  9549 master.cpp:5709] Sending 1 offers to framework 
> 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at 
> scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@<some ip>:33978
> E1230 12:53:57.757814  9573 process.cpp:2105] Failed to shutdown socket with 
> fd 22: Transport endpoint is not connected
> I1230 12:53:57.758314  9543 master.cpp:1284] Framework 
> 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at 
> scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978 disconnected
> I1230 12:53:57.758378  9543 master.cpp:2725] Disconnecting framework 
> 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at 
> scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
> I1230 12:53:57.758411  9543 master.cpp:2749] Deactivating framework 
> 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at 
> scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
> I1230 12:53:57.758582  9548 hierarchical.cpp:382] Deactivated framework 
> 993198d1-7393-4656-9f75-4f22702609d0-0233
> W1230 12:53:57.758915  9543 master.hpp:2113] Master attempted to send message 
> to disconnected framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) 
> at scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
> I1230 12:53:57.759140  9543 master.cpp:1297] Giving framework 
> 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at 
> scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978 0ns to 
> failover
> I1230 12:53:57.760573  9561 master.cpp:5561] Framework failover timeout, 
> removing framework 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at 
> scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
> I1230 12:53:57.760648  9561 master.cpp:6296] Removing framework 
> 993198d1-7393-4656-9f75-4f22702609d0-0233 (eval.py) at 
> scheduler-8a94bc86-c2b3-4c7d-bee7-cfddc8e9a8da@172.17.0.12:33978
> I1230 12:53:57.761493  9571 hierarchical.cpp:333] Removed framework 
> 993198d1-7393-4656-9f75-4f22702609d0-0233
> 
