Thanks Xiaodong. Based on the hypothesis that the problem is the container process being launched with SSL_ENABLED in its environment, I have created a patch: https://reviews.apache.org/r/39818/. This might be a quick and dirty way to test the hypothesis. Would it be possible for you to test again after applying the patch?
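For anyone following along, here is a minimal Python sketch of the hypothesis. It is only an illustration and not what the patch does: it shows that a child process inherits SSL_* variables from its parent unless they are filtered out before launch. The helper names (`strip_ssl_env`, `child_sees_ssl`) and the `SSL_` prefix filter are assumptions made for this example; the variable names come from the setup described in this thread.

```python
import os
import subprocess
import sys

# Assumption for this sketch: every variable of interest starts with "SSL_"
# (the thread names SSL_ENABLED, SSL_KEY_FILE, SSL_CERT_FILE and
# SSL_SUPPORT_DOWNGRADE).
SSL_PREFIX = "SSL_"


def strip_ssl_env(env):
    """Return a copy of env without any SSL_* variables."""
    return {k: v for k, v in env.items() if not k.startswith(SSL_PREFIX)}


def child_sees_ssl(env):
    """Spawn a child Python process and report whether it sees SSL_ENABLED."""
    probe = "import os; print('SSL_ENABLED' in os.environ)"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    # Simulate a slave started with SSL variables in its environment.
    parent = dict(os.environ, SSL_ENABLED="true", SSL_KEY_FILE="/home/ubuntu/xx.key")
    print("child with inherited env sees SSL_ENABLED:", child_sees_ssl(parent))
    print("child with stripped env sees SSL_ENABLED:", child_sees_ssl(strip_ssl_env(parent)))
```

If the hypothesis holds, launching the executor with the filtered environment should behave like the run where all SSL variables were removed from the slave.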
-Jojy

> On Oct 30, 2015, at 8:29 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:
>
> Thanks @Jojy
>
> Flags at startup: --appc_store_dir="/tmp/mesos/store/appc"
> --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false"
> --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup"
> --cgroups_limit_swap="false" --cgroups_root="mesos"
> --container_disk_watch_interval="15secs" --containerizers="docker,mesos"
> --credential="/etc/mesos-slave-auth" --default_role="*"
> --disk_watch_interval="1mins" --docker="/usr/bin/docker"
> --docker_kill_orphans="true" --docker_remove_delay="6hrs"
> --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
> --enforce_container_disk_quota="false" --executor_registration_timeout="1hrs"
> --executor_shutdown_grace_period="5secs"
> --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB"
> --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1"
> --hadoop_home="" --help="false" --initialize_driver_logging="true"
> --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos"
> --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO"
> --master="zk://172.31.43.77:2181,172.31.44.2:2181,172.31.36.91:2181/mesos"
> --oversubscribed_resources_interval="15secs" --perf_duration="10secs"
> --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns"
> --quiet="false" --recover="reconnect" --recovery_timeout="15mins"
> --registration_backoff_factor="1secs" --resource_monitoring_interval="1secs"
> --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
> --strict="true" --switch_user="true" --version="false" --work_dir="/tmp/mesos"
>
> From: Jojy Varghese <j...@mesosphere.io>
> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
> Date: Friday, October 30, 2015, 11:17 PM
> To: "user@mesos.apache.org" <user@mesos.apache.org>
> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>
> Hi Xiaodong
> This might be because the executor inherits the SSL environment variables
> of the slave and thus expects the SSL key password in order to launch. Could
> you please add the part of the slave logs that says "Flags at startup" so
> that we can have more information?
>
> thanks
> Jojy
>
>> On Oct 29, 2015, at 8:55 PM, Xiaodong Zhang <xdzh...@alauda.io> wrote:
>>
>> Thanks a lot! @haosdent
>>
>> From: haosdent <haosd...@gmail.com>
>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>> Date: Friday, October 30, 2015, 11:45 AM
>> To: user <user@mesos.apache.org>
>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>
>> Hi @Xiaodong, I am interested in your problem, but I have not had enough
>> time recently to try to reproduce it. I think I can dig into it this
>> Sunday and give you feedback.
>>
>> On Fri, Oct 30, 2015 at 11:30 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:
>>> Does anybody know about this?
>>>
>>> From: Xiaodong Zhang <xdzh...@alauda.io>
>>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>>> Date: Thursday, October 29, 2015, 7:38 PM
>>> To: "user@mesos.apache.org" <user@mesos.apache.org>
>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>
>>> I think it is easy to reproduce this error.
>>>
>>> Start the master with these env variables:
>>>
>>> SSL_SUPPORT_DOWNGRADE
>>> SSL_ENABLED
>>> SSL_KEY_FILE
>>> SSL_CERT_FILE
>>>
>>> Start the slave with these env variables:
>>>
>>> SSL_ENABLED
>>> SSL_KEY_FILE
>>> SSL_CERT_FILE
>>> LIBPROCESS_ADVERTISE_IP
>>>
>>> Then run a Docker task via Marathon.
>>>
>>> From: Xiaodong Zhang <xdzh...@alauda.io>
>>> Date: Thursday, October 29, 2015, 3:09 PM
>>> To: "user@mesos.apache.org" <user@mesos.apache.org>
>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>
>>> So now, Mesos tasks work well but Docker tasks don't.
>>>
>>> From: Xiaodong Zhang <xdzh...@alauda.io>
>>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>>> Date: Thursday, October 29, 2015, 2:08 PM
>>> To: "user@mesos.apache.org" <user@mesos.apache.org>
>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>
>>> I ran a task through Marathon:
>>>
>>> {
>>>   "id": "basic-0",
>>>   "cmd": "while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done",
>>>   "cpus": 0.1,
>>>   "mem": 10.0,
>>>   "instances": 1
>>> }
>>>
>>> It works well.
>>>
>>> <742629F2-78E8-43F2-9015-F3D22720826B.png>
>>>
>>> A Docker task can pull the image but can't run, as I mentioned.
>>>
>>> My Docker version is 1.5.0.
>>>
>>> From: Tim Chen <t...@mesosphere.io>
>>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>>> Date: Thursday, October 29, 2015, 1:48 PM
>>> To: "user@mesos.apache.org" <user@mesos.apache.org>
>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>
>>> Does running a task without a Docker container (Mesos containerizer) work
>>> with SSL in your environment?
>>>
>>> Tim
>>>
>>> On Wed, Oct 28, 2015 at 10:19 PM, Xiaodong Zhang <xdzh...@alauda.io> wrote:
>>>> Thanks a lot. I found the log file on the slave.
>>>> >>>> One of the task: >>>> >>>> Stdout: >>>> >>>> --container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444" >>>> --docker="/home/ubuntu/luna/bin/docker" --help="false" >>>> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" >>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" >>>> --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444" >>>> --stop_timeout="0ns" >>>> --container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444" >>>> --docker="/home/ubuntu/luna/bin/docker" --help="false" >>>> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" >>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" >>>> --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444" >>>> --stop_timeout="0ns" >>>> Shutting down >>>> >>>> Stderr: >>>> >>>> I1029 05:14:06.529364 27862 fetcher.cpp:414] Fetcher Info: >>>> {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/20151029-043755-3549436724-5050-5674-S0","items":[{"action":"BYPASS_CACHE","uri":{"extract":false,"value":"file:\/\/\/etc\/.dockercfg"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/20151029-043755-3549436724-5050-5674-S0\/frameworks\/20151029-043755-3549436724-5050-5674-0000\/executors\/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f\/runs\/e2c2580f-8082-4f17-b0cc-4e32e040d444"} >>>> I1029 05:14:06.530562 27862 fetcher.cpp:369] Fetching URI >>>> 'file:///etc/.dockercfg <>' >>>> I1029 05:14:06.530580 27862 fetcher.cpp:243] Fetching directly into the >>>> sandbox directory 
>>>> I1029 05:14:06.530594 27862 fetcher.cpp:180] Fetching URI
>>>> 'file:///etc/.dockercfg'
>>>> I1029 05:14:06.530609 27862 fetcher.cpp:160] Copying resource with
>>>> command:cp '/etc/.dockercfg'
>>>> '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg'
>>>> I1029 05:14:06.532165 27862 fetcher.cpp:446] Fetched
>>>> 'file:///etc/.dockercfg' to
>>>> '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg'
>>>> I1029 05:14:07.782054 27955 exec.cpp:133] Version: 0.24.1
>>>> I1029 05:14:07.785039 27963 exec.cpp:462] Slave exited ... shutting down
>>>> E1029 05:14:07.785158 27964 socket.hpp:174] Shutdown failed on fd=7:
>>>> Transport endpoint is not connected [107]
>>>>
>>>> From: haosdent <haosd...@gmail.com>
>>>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>>>> Date: Thursday, October 29, 2015, 1:13 PM
>>>> To: user <user@mesos.apache.org>
>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>
>>>> <5185_02_04.png>
>>>> <5185_02_07.png>
>>>>
>>>> I took screenshots of how I find task logs in my local web UI. Could you
>>>> find the stderr and stdout for your tasks by following the screenshots
>>>> above?
>>>>
>>>> On Thu, Oct 29, 2015 at 1:07 PM, Xiaodong Zhang <xdzh...@alauda.io> wrote:
>>>>> I didn't see any useful info.
>>>>>
>>>>> In the Mesos slave log, there is this line:
>>>>> I1029 03:29:53.160143 9292 slave.cpp:3399] Executor
>>>>> '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713'
>>>>> of framework 20151029-031549-1294671788-5050-4937-0000 terminated with
>>>>> signal Killed
>>>>>
>>>>> I checked a normal log, and it shows:
>>>>>
>>>>> I1014 15:22:21.276007 23163 slave.cpp:3326] Executor
>>>>> 'ffc08dce-997f-41f7-9b03-57c1b4bc1f85.47ed02aa-7285-11e5-80d7-000d3a8033de'
>>>>> of framework 20150814-115157-1677721866-5050-6185-0000 exited with
>>>>> status 0
>>>>>
>>>>> Is this helpful?
>>>>>
>>>>> From: Xiaodong Zhang <xdzh...@alauda.io>
>>>>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>>>>> Date: Thursday, October 29, 2015, 12:59 PM
>>>>> To: "user@mesos.apache.org" <user@mesos.apache.org>
>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>> >>>>> <9D46724C-457C-4BE1-B0E4-F57B147F6DC8.png> >>>>> >>>>> The webui have a LOG link, when click it shows like this: >>>>> >>>>> I1029 04:44:32.293445 5697 http.cpp:321] HTTP GET for /master/state.json >>>>> from 114.113.20.135:55682 <http://114.113.20.135:55682/> with >>>>> User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) >>>>> AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36' >>>>> I1029 04:44:34.533504 5704 master.cpp:4613] Sending 1 offers to >>>>> framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>> scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77 >>>>> <mailto:scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77>:53373 >>>>> I1029 04:44:34.539579 5702 master.cpp:2739] Processing ACCEPT call for >>>>> offers: [ 20151029-043755-3549436724-5050-5674-O2 ] on slave >>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 >>>>> <http://50.112.136.148:5051/> >>>>> (ec2-50-112-136-148.us-west-2.compute.amazonaws.com >>>>> <http://ec2-50-112-136-148.us-west-2.compute.amazonaws.com/>) for >>>>> framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>> scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77 >>>>> <mailto:scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77>:53373 >>>>> I1029 04:44:34.539710 5702 hierarchical.hpp:814] Recovered cpus(*):1; >>>>> mem(*):999; disk(*):3962; ports(*):[31000-32000] (total: cpus(*):1; >>>>> mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: ) on slave >>>>> 20151029-043755-3549436724-5050-5674-S0 from framework >>>>> 20151029-043755-3549436724-5050-5674-0000 >>>>> I1029 04:44:37.360901 5703 master.cpp:4294] Performing implicit task >>>>> state reconciliation for framework >>>>> 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>> scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77 >>>>> <mailto:scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77>:53373 >>>>> I1029 
04:44:40.539989 5704 master.cpp:4613] Sending 1 offers to >>>>> framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>> scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77 >>>>> <mailto:scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77>:53373 >>>>> I1029 04:44:40.610321 5702 master.cpp:2739] Processing ACCEPT call for >>>>> offers: [ 20151029-043755-3549436724-5050-5674-O3 ] on slave >>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 >>>>> <http://50.112.136.148:5051/> >>>>> (ec2-50-112-136-148.us-west-2.compute.amazonaws.com >>>>> <http://ec2-50-112-136-148.us-west-2.compute.amazonaws.com/>) for >>>>> framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>> scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77 >>>>> <mailto:scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77>:53373 >>>>> I1029 04:44:40.610846 5702 master.hpp:170] Adding task >>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>> with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on >>>>> slave 20151029-043755-3549436724-5050-5674-S0 >>>>> (ec2-50-112-136-148.us-west-2.compute.amazonaws.com >>>>> <http://ec2-50-112-136-148.us-west-2.compute.amazonaws.com/>) >>>>> I1029 04:44:40.610911 5702 master.cpp:3069] Launching task >>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>> of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>> scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77 >>>>> <mailto:scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77>:53373 >>>>> with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on >>>>> slave 20151029-043755-3549436724-5050-5674-S0 at >>>>> slave(1)@50.112.136.148:5051 <http://50.112.136.148:5051/> >>>>> (ec2-50-112-136-148.us-west-2.compute.amazonaws.com >>>>> <http://ec2-50-112-136-148.us-west-2.compute.amazonaws.com/>) >>>>> I1029 04:44:40.611095 5702 
hierarchical.hpp:814] Recovered >>>>> cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, >>>>> 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; >>>>> ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; >>>>> ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 >>>>> from framework 20151029-043755-3549436724-5050-5674-0000 >>>>> I1029 04:44:43.324970 5698 http.cpp:321] HTTP GET for /master/state.json >>>>> from 114.113.20.135:55682 <http://114.113.20.135:55682/> with >>>>> User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) >>>>> AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36' >>>>> I1029 04:44:46.546671 5703 master.cpp:4613] Sending 1 offers to >>>>> framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>> scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77 >>>>> <mailto:scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77>:53373 >>>>> I1029 04:44:46.557266 5699 master.cpp:2739] Processing ACCEPT call for >>>>> offers: [ 20151029-043755-3549436724-5050-5674-O4 ] on slave >>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 >>>>> <http://50.112.136.148:5051/> >>>>> (ec2-50-112-136-148.us-west-2.compute.amazonaws.com >>>>> <http://ec2-50-112-136-148.us-west-2.compute.amazonaws.com/>) for >>>>> framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>> scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77 >>>>> <mailto:scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77>:53373 >>>>> I1029 04:44:46.557394 5699 hierarchical.hpp:814] Recovered >>>>> cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, >>>>> 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; >>>>> ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; >>>>> ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 >>>>> from framework 20151029-043755-3549436724-5050-5674-0000 >>>>> I1029 
04:44:47.267562 5700 master.cpp:4069] Status update TASK_FAILED >>>>> (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task >>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>> of framework 20151029-043755-3549436724-5050-5674-0000 from slave >>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 >>>>> <http://50.112.136.148:5051/> >>>>> (ec2-50-112-136-148.us-west-2.compute.amazonaws.com >>>>> <http://ec2-50-112-136-148.us-west-2.compute.amazonaws.com/>) >>>>> I1029 04:44:47.267645 5700 master.cpp:4108] Forwarding status update >>>>> TASK_FAILED (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task >>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>> of framework 20151029-043755-3549436724-5050-5674-0000 >>>>> I1029 04:44:47.267774 5700 master.cpp:5576] Updating the latest state of >>>>> task >>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>> of framework 20151029-043755-3549436724-5050-5674-0000 to TASK_FAILED >>>>> I1029 04:44:47.267907 5700 hierarchical.hpp:814] Recovered >>>>> cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] (total: cpus(*):1; >>>>> mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: ) on slave >>>>> 20151029-043755-3549436724-5050-5674-S0 from framework >>>>> 20151029-043755-3549436724-5050-5674-0000 >>>>> I1029 04:44:47.289356 5698 master.cpp:5644] Removing task >>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>> with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] of >>>>> framework 20151029-043755-3549436724-5050-5674-0000 on slave >>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@50.112.136.148:5051 >>>>> <http://50.112.136.148:5051/> >>>>> (ec2-50-112-136-148.us-west-2.compute.amazonaws.com >>>>> <http://ec2-50-112-136-148.us-west-2.compute.amazonaws.com/>) >>>>> I1029 04:44:47.289459 5698 master.cpp:3398] Processing ACKNOWLEDGE call >>>>> 
0ea607fc-bf24-4bda-b107-55a54aba31cf for task
>>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f
>>>>> of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at
>>>>> scheduler-b532233f-2fc5-4455-b1e6-7a66ae79a8b9@172.31.43.77:53373
>>>>> on slave 20151029-043755-3549436724-5050-5674-S0
>>>>>
>>>>> From: haosdent <haosd...@gmail.com>
>>>>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>>>>> Date: Thursday, October 29, 2015, 12:02 PM
>>>>> To: user <user@mesos.apache.org>
>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>
>>>>> Oh, I mean your task logs. You can get them from the Mesos web UI.
>>>>>
>>>>> On Thu, Oct 29, 2015 at 11:52 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:
>>>>>> Thanks for your reply.
>>>>>>
>>>>>> Yes, I built Mesos with `--enable-libevent --enable-ssl`. If I don't
>>>>>> provide the key and pem when starting the slave, it fails to register
>>>>>> (that means SSL is working, right?).
>>>>>>
>>>>>> As I said, the odd thing is that the container never runs (`docker ps -a`
>>>>>> shows nothing), so it can't have any stdout or stderr.
>>>>>>
>>>>>> From: haosdent <haosd...@gmail.com>
>>>>>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>>>>>> Date: Thursday, October 29, 2015, 11:47 AM
>>>>>> To: user <user@mesos.apache.org>
>>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>>
>>>>>> Did you compile Mesos with SSL support? The default build doesn't
>>>>>> include SSL. And does the Docker container have any stdout and stderr?
>>>>>>
>>>>>> On Thu, Oct 29, 2015 at 11:41 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:
>>>>>>> My scenario is as described in my previous email: the masters and
>>>>>>> slaves are in different IaaS providers. The slaves can now register
>>>>>>> with the masters with SSL_ENABLED on.
>>>>>>>
>>>>>>> But I have hit another problem. The slaves can't run containers (the
>>>>>>> odd thing is that they can pull the image successfully but cannot run
>>>>>>> the container; `docker ps -a` lists nothing).
>>>>>>>
>>>>>>> The logs look like this:
>>>>>>>
>>>>>>> I1029 03:29:45.967741 9288 docker.cpp:758] Starting container
>>>>>>> 'd4f4e236-0d0a-492c-86df-eef48a414e23' for task
>>>>>>> '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713'
>>>>>>> (and executor
>>>>>>> '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713')
>>>>>>> of framework '20151029-031549-1294671788-5050-4937-0000'
>>>>>>> I1029 03:29:48.044148 9292 docker.cpp:382] Checkpointing pid 12062 to
>>>>>>> '/tmp/mesos/meta/slaves/20151029-031549-1294671788-5050-4937-S0/frameworks/20151029-031549-1294671788-5050-4937-0000/executors/279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713/runs/d4f4e236-0d0a-492c-86df-eef48a414e23/pids/forked.pid'
>>>>>>> I1029 03:29:53.159361 9292 docker.cpp:1576] Executor for container
>>>>>>> 'd4f4e236-0d0a-492c-86df-eef48a414e23' has exited
>>>>>>> I1029 03:29:53.159572 9292 docker.cpp:1374] Destroying container
>>>>>>> 'd4f4e236-0d0a-492c-86df-eef48a414e23'
>>>>>>> I1029 03:29:53.159822 9292 docker.cpp:1478] Running docker stop on
>>>>>>> container 'd4f4e236-0d0a-492c-86df-eef48a414e23'
>>>>>>> I1029 03:29:53.160143 9292 slave.cpp:3399] Executor
>>>>>>> '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713'
>>>>>>> of framework 20151029-031549-1294671788-5050-4937-0000 terminated with
>>>>>>> signal Killed
>>>>>>> I1029 03:29:53.160884 9292 slave.cpp:2696] Handling status update
>>>>>>> TASK_FAILED (UUID: 27a2080a-8807-449e-9077-837ec45b4c51) for task
>>>>>>> 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713
>>>>>>> of framework 20151029-031549-1294671788-5050-4937-0000 from @0.0.0.0:0
>>>>>>> W1029 03:29:53.161247 9288 docker.cpp:986] Ignoring updating unknown
>>>>>>> container: d4f4e236-0d0a-492c-86df-eef48a414e23
>>>>>>> I1029 03:29:53.161548 9293 status_update_manager.cpp:322] Received
>>>>>>> status update TASK_FAILED (UUID: 27a2080a-8807-449e-9077-837ec45b4c51)
>>>>>>> for task
>>>>>>> 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713
>>>>>>> of framework 20151029-031549-1294671788-5050-4937-0000
>>>>>>>
>>>>>>> I run the master node with these env variables:
>>>>>>>
>>>>>>> SSL_SUPPORT_DOWNGRADE=true
>>>>>>> SSL_ENABLED=true
>>>>>>> SSL_KEY_FILE=/home/ubuntu/xx.key
>>>>>>> SSL_CERT_FILE=/home/ubuntu/xx.pem
>>>>>>>
>>>>>>> And the slave node with these env variables:
>>>>>>>
>>>>>>> SSL_ENABLED=true
>>>>>>> SSL_KEY_FILE=/home/ubuntu/xx.key
>>>>>>> SSL_CERT_FILE=/home/ubuntu/xx.pem
>>>>>>> LIBPROCESS_ADVERTISE_IP=xxx.xxx.xxx.xxx
>>>>>>>
>>>>>>> When I remove all the SSL env variables, the slaves work well.
>>>>>>>
>>>>>>> Did I miss something?
>>>>>>>
>>>>>>> Version:
>>>>>>>
>>>>>>> Mesos 0.24.1
>>>>>>> Marathon 0.9.2
>>>>>>>
>>>>>>> OS:
>>>>>>> Ubuntu 14.04
>>>>>>>
>>>>>>> From: Anindya Sinha <anindya.si...@gmail.com>
>>>>>>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>>>>>>> Date: Wednesday, October 28, 2015, 2:32 PM
>>>>>>> To: "user@mesos.apache.org" <user@mesos.apache.org>
>>>>>>> Subject: Re: How to tell master which ip to connect.
>>>>>>>
>>>>>>> On Tue, Oct 27, 2015 at 7:43 PM, Xiaodong Zhang <xdzh...@alauda.io> wrote:
>>>>>>>> It works! Thanks a lot.
>>>>>>>
>>>>>>> Ok. So we should expose advertise_ip and advertise_port as command line
>>>>>>> options for mesos-slave as well (instead of using the environment
>>>>>>> variables)? Opened https://issues.apache.org/jira/browse/MESOS-3809.
>>>>>>>
>>>>>>>>
>>>>>>>> Another question: do masters and slaves communicate with each other
>>>>>>>> in a secure way? Is the data encrypted? I want to make sure that
>>>>>>>> deploying masters and slaves in different IaaS providers is
>>>>>>>> production-ready.
>>>>>>>>
>>>>>>>> From: haosdent <haosd...@gmail.com>
>>>>>>>> Reply-To: "user@mesos.apache.org" <user@mesos.apache.org>
>>>>>>>> Date: Wednesday, October 28, 2015, 10:23 AM
>>>>>>>> To: user <user@mesos.apache.org>
>>>>>>>> Subject: Re: How to tell master which ip to connect.
>>>>>>>>
>>>>>>>> Did you try `export LIBPROCESS_ADVERTISE_IP=xxx` and
>>>>>>>> `LIBPROCESS_ADVERTISE_PORT` when starting the slave?
>>>>>>>>
>>>>>>>> On Wed, Oct 28, 2015 at 10:16 AM, Xiaodong Zhang <xdzh...@alauda.io> wrote:
>>>>>>>>> Hi team,
>>>>>>>>>
>>>>>>>>> My scenario is this:
>>>>>>>>>
>>>>>>>>> My master nodes are deployed in AWS and my slaves are in Azure, so
>>>>>>>>> they communicate via public IPs.
>>>>>>>>> I ran into trouble when the slaves try to register with the master.
>>>>>>>>> The slaves can get the master's public IP address and can send the
>>>>>>>>> register request, but they can only send their private IP to the
>>>>>>>>> master (they don't know their public IP, so they can't bind to a
>>>>>>>>> public IP via the --ip flag), and so the masters can't connect to
>>>>>>>>> the slaves. How can a slave tell the master which IP the master
>>>>>>>>> should connect to? (I can't find any flag like --advertise_ip in
>>>>>>>>> master.)
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best Regards,
>>>>>>>> Haosdent Huang