Thanks, Jojy. I will apply this patch to version 0.24.1 and rebuild. I will let you know whether it works once I finish testing.

From: Jojy Varghese <j...@mesosphere.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Saturday, October 31, 2015, 12:45 AM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Thanks, Xiaodong. Based on the hypothesis that the problem is the container process being launched with SSL_ENABLED in its environment, I have created a patch: https://reviews.apache.org/r/39818/. This might be a quick and dirty way to test the hypothesis. Would it be possible for you to test again after applying the patch?

-Jojy

On Oct 30, 2015, at 8:29 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:

Thanks @Jojy

Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --credential="/etc/mesos-slave-auth" --default_role="*" --disk_watch_interval="1mins" --docker="/usr/bin/docker" --docker_kill_orphans="true" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --enforce_container_disk_quota="false" --executor_registration_timeout="1hrs" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos" --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO" --master="zk://172.31.43.77:2181,172.31.44.2:2181,172.31.36.91:2181/mesos" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --resource_monitoring_interval="1secs" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --version="false" --work_dir="/tmp/mesos"

From: Jojy Varghese <j...@mesosphere.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Friday, October 30, 2015, 11:17 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Hi Xiaodong,

This might be because the executor inherits the SSL environment variables of the slave and thus expects an SSL key password in order to launch. Could you please add the part of the slave logs that says "Flags at startup" so that we can have more information?

Thanks,
Jojy

On Oct 29, 2015, at 8:55 PM, Xiaodong Zhang <xdzh...@alauda.io> wrote:

Thanks a lot! @haosdent

From: haosdent <haosd...@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Friday, October 30, 2015, 11:45 AM
To: user <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Hi @Xiaodong, I am interested in your problem, but I have not had enough time recently to try to reproduce it. I think I can dig into it this Sunday and give you feedback.

On Fri, Oct 30, 2015 at 11:30 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:

Does anybody know about this?

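Jojy's hypothesis in this part of the thread (the executor inherits the slave's SSL environment variables) can be illustrated with a small Python sketch. This is only an assumption-laden illustration: it is not the patch at https://reviews.apache.org/r/39818/ and not Mesos code, and the helper names `scrubbed_env` and `launch_without_ssl` are made up for this example. The variable names and values are the ones reported for the slave in this thread.

```python
import subprocess

# Hypothetical helpers for illustration only; this is not the patch from
# https://reviews.apache.org/r/39818/ and not part of Mesos.
SSL_PREFIX = "SSL_"


def scrubbed_env(env):
    """Return a copy of env without variables whose names start with SSL_."""
    return {k: v for k, v in env.items() if not k.startswith(SSL_PREFIX)}


def launch_without_ssl(argv, env):
    """Start a child process that does not inherit the parent's SSL_* settings."""
    return subprocess.Popen(argv, env=scrubbed_env(env))


# Environment the slave was started with, as reported in this thread.
slave_env = {
    "SSL_ENABLED": "true",
    "SSL_KEY_FILE": "/home/ubuntu/xx.key",
    "SSL_CERT_FILE": "/home/ubuntu/xx.pem",
    "LIBPROCESS_ADVERTISE_IP": "xxx.xxx.xxx.xxx",
}

# What a child would see if the SSL_* variables were dropped before launch.
child_env = scrubbed_env(slave_env)
print(sorted(child_env))  # prints ['LIBPROCESS_ADVERTISE_IP']
```

If the hypothesis holds, a child started this way would no longer try to set up SSL on its own, which is the behaviour the patch is meant to test.
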
From: Xiaodong Zhang <xdzh...@alauda.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 7:38 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

I think this error is easy to reproduce.

Start the master with these environment variables set:
SSL_SUPPORT_DOWNGRADE
SSL_ENABLED
SSL_KEY_FILE
SSL_CERT_FILE

Start the slave with these environment variables set:
SSL_ENABLED
SSL_KEY_FILE
SSL_CERT_FILE
LIBPROCESS_ADVERTISE_IP

Then run a Docker task via Marathon.

From: Xiaodong Zhang <xdzh...@alauda.io>
Date: Thursday, October 29, 2015, 3:09 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

So now Mesos tasks work well, but Docker tasks do not.

From: Xiaodong Zhang <xdzh...@alauda.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 2:08 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

I ran a task through Marathon:

{ "id": "basic-0", "cmd": "while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done", "cpus": 0.1, "mem": 10.0, "instances": 1}

It works well.

<742629F2-78E8-43F2-9015-F3D22720826B.png>

The Docker task can pull the image but cannot run, as I mentioned. My Docker version is 1.5.0.

From: Tim Chen <t...@mesosphere.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 1:48 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Does running a task without a Docker container (that is, with the Mesos containerizer) work with SSL in your environment?

Tim

On Wed, Oct 28, 2015 at 10:19 PM, Xiaodong Zhang <xdzh...@alauda.io> wrote:

Thanks a lot. I found the log files on the slave. For one of the tasks:

Stdout:

--container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444" --docker="/home/ubuntu/luna/bin/docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444" --stop_timeout="0ns"
--container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444" --docker="/home/ubuntu/luna/bin/docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444" --stop_timeout="0ns"
Shutting down

Stderr:

I1029 05:14:06.529364 27862 fetcher.cpp:414] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/20151029-043755-3549436724-5050-5674-S0","items":[{"action":"BYPASS_CACHE","uri":{"extract":false,"value":"file:\/\/\/etc\/.dockercfg"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/20151029-043755-3549436724-5050-5674-S0\/frameworks\/20151029-043755-3549436724-5050-5674-0000\/executors\/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f\/runs\/e2c2580f-8082-4f17-b0cc-4e32e040d444"}
I1029 05:14:06.530562 27862 fetcher.cpp:369] Fetching URI 'file:///etc/.dockercfg'
I1029 05:14:06.530580 27862 fetcher.cpp:243] Fetching directly into the sandbox directory
I1029 05:14:06.530594 27862 fetcher.cpp:180] Fetching URI 'file:///etc/.dockercfg'
I1029 05:14:06.530609 27862 fetcher.cpp:160] Copying resource with command:cp '/etc/.dockercfg' '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg'
I1029 05:14:06.532165 27862 fetcher.cpp:446] Fetched 'file:///etc/.dockercfg' to '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg'
I1029 05:14:07.782054 27955 exec.cpp:133] Version: 0.24.1
I1029 05:14:07.785039 27963 exec.cpp:462] Slave exited ... shutting down
E1029 05:14:07.785158 27964 socket.hpp:174] Shutdown failed on fd=7: Transport endpoint is not connected [107]

From: haosdent <haosd...@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 1:13 PM
To: user <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

<5185_02_04.png> <5185_02_07.png>

I captured how I find task logs in my local web UI. Could you find the stderr and stdout for your tasks by following the screenshots above?

On Thu, Oct 29, 2015 at 1:07 PM, Xiaodong Zhang <xdzh...@alauda.io> wrote:

I didn't see any useful info.
In the Mesos slave log, there is this line:

I1029 03:29:53.160143 9292 slave.cpp:3399] Executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' of framework 20151029-031549-1294671788-5050-4937-0000 terminated with signal Killed

I checked a log from a normal run, and it shows:

I1014 15:22:21.276007 23163 slave.cpp:3326] Executor 'ffc08dce-997f-41f7-9b03-57c1b4bc1f85.47ed02aa-7285-11e5-80d7-000d3a8033de' of framework 20150814-115157-1677721866-5050-6185-0000 exited with status 0

Is this helpful?

From: Xiaodong Zhang <xdzh...@alauda.io>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 12:59 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

<9D46724C-457C-4BE1-B0E4-F57B147F6DC8.png>

The web UI has a LOG link. Clicking it shows this:

I1029 04:44:32.293445 5697 http.cpp:321] HTTP GET for /master/state.json from 114.113.20.135:55682 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36'
I1029 04:44:34.533504 5704 master.cpp:4613] Sending 1 offers to framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:34.539579 5702 master.cpp:2739] Processing ACCEPT call for offers: [ 20151029-043755-3549436724-5050-5674-O2 ] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at
scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:34.539710 5702 hierarchical.hpp:814] Recovered cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: ) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:37.360901 5703 master.cpp:4294] Performing implicit task state reconciliation for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:40.539989 5704 master.cpp:4613] Sending 1 offers to framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:40.610321 5702 master.cpp:2739] Processing ACCEPT call for offers: [ 20151029-043755-3549436724-5050-5674-O3 ] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:40.610846 5702 master.hpp:170] Adding task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on slave 20151029-043755-3549436724-5050-5674-S0 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
I1029 04:44:40.610911 5702 master.cpp:3069] Launching task
e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373 with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
I1029 04:44:40.611095 5702 hierarchical.hpp:814] Recovered cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:43.324970 5698 http.cpp:321] HTTP GET for /master/state.json from 114.113.20.135:55682 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36'
I1029 04:44:46.546671 5703 master.cpp:4613] Sending 1 offers to framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:46.557266 5699 master.cpp:2739] Processing ACCEPT call for offers: [ 20151029-043755-3549436724-5050-5674-O4 ] on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at
scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
I1029 04:44:46.557394 5699 hierarchical.hpp:814] Recovered cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:47.267562 5700 master.cpp:4069] Status update TASK_FAILED (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 from slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
I1029 04:44:47.267645 5700 master.cpp:4108] Forwarding status update TASK_FAILED (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:47.267774 5700 master.cpp:5576] Updating the latest state of task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 to TASK_FAILED
I1029 04:44:47.267907 5700 hierarchical.hpp:814] Recovered cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] (total: cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: ) on slave 20151029-043755-3549436724-5050-5674-S0 from framework 20151029-043755-3549436724-5050-5674-0000
I1029 04:44:47.289356 5698 master.cpp:5644] Removing task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] of framework
20151029-043755-3549436724-5050-5674-0000 on slave 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 (ec2-50-112-136-148.us-west-2.compute.amazonaws.com)
I1029 04:44:47.289459 5698 master.cpp:3398] Processing ACKNOWLEDGE call 0ea607fc-bf24-4bda-b107-55a54aba31cf for task e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373 on slave 20151029-043755-3549436724-5050-5674-S0

From: haosdent <haosd...@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 12:02 PM
To: user <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Oh, I mean your task logs. You can get them from the Mesos web UI.

On Thu, Oct 29, 2015 at 11:52 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:

Thanks for your reply. Yes, I built Mesos with `--enable-libevent --enable-ssl`. If I don't provide the key and pem when starting the slave, registration fails (that means SSL works, right?). As I said, the odd thing is that the container never runs (`docker ps -a` shows nothing), so there cannot be any stdout or stderr.

From: haosdent <haosd...@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Thursday, October 29, 2015, 11:47 AM
To: user <user@mesos.apache.org>
Subject: Re: Can't start docker container when SSL_ENABLED is on.

Did you compile Mesos with SSL support? The default build does not include SSL. And does the Docker container have stdout and stderr?

On Thu, Oct 29, 2015 at 11:41 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:

My scenario is as described in my previous email: the masters and slaves are in different IaaS environments. The slaves can now register with the masters with SSL_ENABLED on, but I have hit another problem. The slaves cannot run containers. The odd thing is that they can pull the image successfully; they just cannot run the container, and `docker ps -a` lists nothing. The logs look like this:

I1029 03:29:45.967741 9288 docker.cpp:758] Starting container 'd4f4e236-0d0a-492c-86df-eef48a414e23' for task '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' (and executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713') of framework '20151029-031549-1294671788-5050-4937-0000'
I1029 03:29:48.044148 9292 docker.cpp:382] Checkpointing pid 12062 to '/tmp/mesos/meta/slaves/20151029-031549-1294671788-5050-4937-S0/frameworks/20151029-031549-1294671788-5050-4937-0000/executors/279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713/runs/d4f4e236-0d0a-492c-86df-eef48a414e23/pids/forked.pid'
I1029 03:29:53.159361 9292 docker.cpp:1576] Executor for container 'd4f4e236-0d0a-492c-86df-eef48a414e23' has exited
I1029 03:29:53.159572 9292 docker.cpp:1374] Destroying container 'd4f4e236-0d0a-492c-86df-eef48a414e23'
I1029 03:29:53.159822 9292 docker.cpp:1478] Running docker stop on container 'd4f4e236-0d0a-492c-86df-eef48a414e23'
I1029 03:29:53.160143 9292 slave.cpp:3399] Executor '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' of framework 20151029-031549-1294671788-5050-4937-0000 terminated with signal Killed
I1029 03:29:53.160884 9292 slave.cpp:2696] Handling status update TASK_FAILED (UUID: 27a2080a-8807-449e-9077-837ec45b4c51) for task 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713 of framework 20151029-031549-1294671788-5050-4937-0000 from @0.0.0.0:0
W1029 03:29:53.161247 9288 docker.cpp:986] Ignoring updating unknown container: d4f4e236-0d0a-492c-86df-eef48a414e23
I1029 03:29:53.161548 9293 status_update_manager.cpp:322] Received status update TASK_FAILED (UUID: 27a2080a-8807-449e-9077-837ec45b4c51) for task 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713 of framework 20151029-031549-1294671788-5050-4937-0000

I run the master node with this environment:
SSL_SUPPORT_DOWNGRADE=true
SSL_ENABLED=true
SSL_KEY_FILE=/home/ubuntu/xx.key
SSL_CERT_FILE=/home/ubuntu/xx.pem

And the slave node with this environment:
SSL_ENABLED=true
SSL_KEY_FILE=/home/ubuntu/xx.key
SSL_CERT_FILE=/home/ubuntu/xx.pem
LIBPROCESS_ADVERTISE_IP=xxx.xxx.xxx.xxx

When I remove all the SSL environment variables, the slaves work well. Did I miss something?

Version:
Mesos 0.24.1
Marathon 0.9.2
OS: Ubuntu 14.04

From: Anindya Sinha <anindya.si...@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Wednesday, October 28, 2015, 2:32 PM
To: "user@mesos.apache.org" <user@mesos.apache.org>
Subject: Re: How to tell master which ip to connect.

On Tue, Oct 27, 2015 at 7:43 PM, Xiaodong Zhang <xdzh...@alauda.io> wrote:

It works! Thanks a lot.

Ok. So we should expose advertise_ip and advertise_port as command line options for mesos-slave as well (instead of using the environment variables)? Opened https://issues.apache.org/jira/browse/MESOS-3809.

Another question: do masters and slaves communicate with each other securely? Is the data encrypted? I want to make sure that deploying masters and slaves in different IaaS environments is production-ready.

From: haosdent <haosd...@gmail.com>
Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
Date: Wednesday, October 28, 2015, 10:23 AM
To: user <user@mesos.apache.org>
Subject: Re: How to tell master which ip to connect.

Did you try `export LIBPROCESS_ADVERTISE_IP=xxx` and `LIBPROCESS_ADVERTISE_PORT` when starting the slave?

On Wed, Oct 28, 2015 at 10:16 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:

Hi team,

My scenario is this: my master nodes are deployed in AWS and my slaves are in Azure, so they communicate via public IPs. I ran into trouble when the slaves try to register with the master. The slaves can get the master's public IP address and can send the register request, but they can only send their private IP to the master (they do not know their public IP, so they cannot bind to a public IP via the --ip flag). As a result, the masters cannot connect to the slaves. How can a slave tell the master which IP the master should connect to? (I can't find any flag like --advertise_ip in master.)

--
Best Regards,
Haosdent Huang
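
haosdent's suggestion at the start of this thread (set `LIBPROCESS_ADVERTISE_IP`, and optionally `LIBPROCESS_ADVERTISE_PORT`, in the slave's environment) can be sketched in Python as follows. This is a minimal sketch under assumptions: `advertised_env` is a made-up helper name, `203.0.113.10` is a documentation placeholder rather than a real slave address, and 5051 is simply the `--port` value shown in the slave flags quoted earlier.

```python
def advertised_env(base_env, advertise_ip, advertise_port=None):
    """Return a copy of base_env with the libprocess advertise settings added.

    Hypothetical helper: only the two variable names come from the thread.
    """
    env = dict(base_env)
    env["LIBPROCESS_ADVERTISE_IP"] = advertise_ip
    if advertise_port is not None:
        # Environment values must be strings.
        env["LIBPROCESS_ADVERTISE_PORT"] = str(advertise_port)
    return env


# Placeholder public address; substitute the slave's real public IP.
env = advertised_env({"PATH": "/usr/bin"}, "203.0.113.10", 5051)
for name in sorted(env):
    print(f"{name}={env[name]}")
```

The resulting dictionary could then be passed as the `env` argument of `subprocess.Popen` when starting the slave, so the master is told the public address while the slave still binds to its private one.
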