```
stderr: Could not load cert file
```
Is this because your path is wrong? Generally, executor_environment_variables should work.
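If it helps narrow this down, here is a minimal Python sketch (my own, not part of Mesos) that checks the two files before the slave is started and prints the flag with the JSON quoted as a single shell word. In the quoted command, SSL_KEY_FILE points at the `.pem` file and SSL_CERT_FILE at the `.key` file, the reverse of the slave env settings earlier in the thread, and the JSON value is not quoted for the shell. Either could be related, though I have not confirmed it. The helper names `pem_kind`, `check_ssl_env` and `flag_for_shell` are hypothetical, and the check assumes the files are PEM text with the standard header lines.

```python
import json
import shlex
from pathlib import Path

# Standard PEM header lines. Nothing here is Mesos-specific; it is an
# assumption that the cert and key are stored as PEM text.
CERT_HEADER = "-----BEGIN CERTIFICATE-----"
KEY_HEADERS = (
    "-----BEGIN PRIVATE KEY-----",
    "-----BEGIN RSA PRIVATE KEY-----",
    "-----BEGIN EC PRIVATE KEY-----",
    "-----BEGIN ENCRYPTED PRIVATE KEY-----",
)


def pem_kind(text):
    """Classify PEM text by the first recognised header line."""
    for line in text.splitlines():
        line = line.strip()
        if line == CERT_HEADER:
            return "certificate"
        if line in KEY_HEADERS:
            return "private key"
    return "unknown"


def check_ssl_env(env):
    """Return a list of problems found in the SSL_CERT_FILE / SSL_KEY_FILE settings."""
    problems = []
    expected = {"SSL_CERT_FILE": "certificate", "SSL_KEY_FILE": "private key"}
    for name, want in expected.items():
        path = env.get(name)
        if path is None:
            problems.append(f"{name} is not set")
            continue
        file_path = Path(path)
        if not file_path.is_file():
            problems.append(f"{name}: {path} does not exist or is not a file")
            continue
        got = pem_kind(file_path.read_text(errors="replace"))
        if got != want:
            problems.append(f"{name}: {path} looks like {got}, expected {want}")
    return problems


def flag_for_shell(env):
    """Render the flag so the shell passes the JSON through as one argument."""
    return "--executor_environment_variables=" + shlex.quote(json.dumps(env))


if __name__ == "__main__":
    # Paths copied from the quoted command; "xxx" is the elided name in the thread.
    env = {
        "SSL_KEY_FILE": "/home/ubuntu/cert/xxx.pem",
        "SSL_CERT_FILE": "/home/ubuntu/cert/xxx.key",
        "SSL_ENABLED": "true",
    }
    for problem in check_ssl_env(env):
        print("problem:", problem)
    print(flag_for_shell(env))
```

Run it on the slave host as the same user that starts mesos-slave, so that file permissions are exercised as well. It only inspects the header lines and does not validate the certificate or key contents.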
On Mon, Nov 2, 2015 at 5:15 PM, Xiaodong Zhang <[email protected]> wrote:

> Hi, haosdent.
>
> 1. The command-line arguments do not work well.
>
> Command:
>
> /usr/sbin/mesos-slave --master=zk://xxx/mesos --log_dir=/var/log/mesos
> --containerizers=docker,mesos --credential=/etc/mesos-slave-auth
> --docker=/usr/bin/docker --executor_environment_variables={"SSL_KEY_FILE":
> "/home/ubuntu/cert/xxx.pem", "SSL_CERT_FILE": "/home/ubuntu/cert/xxx.key",
> "SSL_ENABLED": "true"} --executor_registration_timeout=60mins
>
> env without ssl
>
> Error info:
>
> stderr:
> Could not load cert file
>
> Stdout:
> --container="mesos-20151102-085117-3565115700-5050-25211-S1.2b784e8d-0bdd-4ffa-a7db-b6dcf35f0a03"
> --docker="/usr/bin/docker" --help="false"
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO"
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
> --sandbox_directory="/tmp/mesos/slaves/20151102-085117-3565115700-5050-25211-S1/frameworks/20151102-085117-3565115700-5050-25211-0000/executors/c310fa88-af8e-4fdd-92b6-eabf372bd187.85ff0237-8140-11e5-a875-021121f8fdf7/runs/2b784e8d-0bdd-4ffa-a7db-b6dcf35f0a03"
> --stop_timeout="0ns"
>
>
> 2. The patch works well. (Thanks again.)
>
> 1 and 2 read the same cert file.
>
> The format of the cert file is like this:
>
> -----BEGIN CERTIFICATE-----
> Xxxxxx
> -----END CERTIFICATE-----
>
> From: Xiaodong Zhang <[email protected]>
> Date: Monday, November 2, 2015, 11:22 AM
>
> To: "[email protected]" <[email protected]>
> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>
> Thanks @haosdent
>
> I will test the command-line arguments and then test the patch.
>
> Have a nice day!~~
>
> From: haosdent <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Sunday, November 1, 2015, 5:40 PM
> To: user <[email protected]>
> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>
> @Xiaodong I created a ticket to track this
> https://issues.apache.org/jira/browse/MESOS-3815 and posted a patch in it.
> Feel free to review and test it together. Thank you!
>
> On Sun, Nov 1, 2015 at 4:54 PM, haosdent <[email protected]> wrote:
>
>> Hi, @Xiaodong I could reproduce your problem in my testing today. A
>> quick workaround is to add the environment variables when you launch the slave.
>>
>> ```
>> ./bin/mesos-slave.sh xxxx --containerizers=docker,mesos
>> --executor_environment_variables='{"SSL_KEY_FILE": "/tmp/server.key",
>> "SSL_CERT_FILE": "/tmp/ssl.chain.crt", "SSL_ENABLED": "true"}'
>> ```
>>
>> As you see above, this passes the SSL env to the docker executor by specifying
>> --executor_environment_variables at startup. So far it works well for
>> me. Anyway, I will submit a patch later to fix how the environment
>> variables are passed to the docker executor. After that, you can launch the slave without
>> the executor_environment_variables flag.
>>
>> On Sat, Oct 31, 2015 at 2:56 PM, Tim Chen <[email protected]> wrote:
>>
>>> Hi Xiaodong,
>>>
>>> If you follow the reviewboard you'll see that the fix is not correct. I
>>> believe Jojy will be posting a new patch.
>>>
>>> Tim
>>>
>>> On Fri, Oct 30, 2015 at 6:58 PM, Xiaodong Zhang <[email protected]>
>>> wrote:
>>>
>>>> It is still not working!
>>>>
>>>> It works well only if I remove SSL_ENABLED from the envs before I start
>>>> the slave.
>>>>
>>>> I applied the patch to version 0.24.1 and rebuilt it with
>>>> `--enable-libevent --enable-ssl`.
>>>>
>>>> From: Xiaodong Zhang <[email protected]>
>>>> Date: Saturday, October 31, 2015, 7:45 AM
>>>>
>>>> To: "[email protected]" <[email protected]>
>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>
>>>> Thanks Jojy.
>>>>
>>>> I will apply this patch to version 0.24.1 and rebuild it. I will let you
>>>> know whether it works well after I finish testing.
>>>>
>>>> From: Jojy Varghese <[email protected]>
>>>> Reply-To: "[email protected]" <[email protected]>
>>>> Date: Saturday, October 31, 2015, 12:45 AM
>>>> To: "[email protected]" <[email protected]>
>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>
>>>> Thanks Xiaodong.
>>>>
>>>> Based on the hypothesis that the container process launched with
>>>> SSL_ENABLED in its environment is the problem, I have created a patch
>>>> https://reviews.apache.org/r/39818/. This might be a quick and dirty
>>>> way to test the hypothesis. Would it be possible for you to test again
>>>> after applying the patch?
>>>>
>>>> -Jojy
>>>>
>>>> On Oct 30, 2015, at 8:29 AM, Xiaodong Zhang <[email protected]> wrote:
>>>>
>>>> Thanks @Jojy
>>>>
>>>> Flags at startup: --appc_store_dir="/tmp/mesos/store/appc"
>>>> --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false"
>>>> --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup"
>>>> --cgroups_limit_swap="false" --cgroups_root="mesos"
>>>> --container_disk_watch_interval="15secs" --containerizers="docker,mesos"
>>>> --credential="/etc/mesos-slave-auth" --default_role="*"
>>>> --disk_watch_interval="1mins" --docker="/usr/bin/docker"
>>>> --docker_kill_orphans="true" --docker_remove_delay="6hrs"
>>>> --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
>>>> --enforce_container_disk_quota="false"
>>>> --executor_registration_timeout="1hrs"
>>>> --executor_shutdown_grace_period="5secs"
>>>> --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB"
>>>> --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1"
>>>> --hadoop_home="" --help="false" --initialize_driver_logging="true"
>>>> --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos"
>>>> --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO"
>>>> --master="zk://172.31.43.77:2181,172.31.44.2:2181,172.31.36.91:2181/mesos"
>>>> --oversubscribed_resources_interval="15secs" --perf_duration="10secs"
>>>> --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns"
>>>> --quiet="false" --recover="reconnect" --recovery_timeout="15mins"
>>>> --registration_backoff_factor="1secs"
>>>> --resource_monitoring_interval="1secs"
>>>> --revocable_cpu_low_priority="true"
>>>> --sandbox_directory="/mnt/mesos/sandbox" --strict="true"
>>>> --switch_user="true" --version="false" --work_dir="/tmp/mesos"
>>>>
>>>> From: Jojy Varghese <[email protected]>
>>>> Reply-To: "[email protected]" <[email protected]>
>>>> Date: Friday, October 30, 2015, 11:17 PM
>>>> To: "[email protected]" <[email protected]>
>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>
>>>> Hi Xiaodong
>>>> This might be because the executor inherits the SSL environment
>>>> variables of the slave and thus expects an SSL key password to launch. Could you
>>>> please add the part of the slave logs that says "Flags at startup" so that
>>>> we can have more information?
>>>>
>>>> thanks
>>>> Jojy
>>>>
>>>> On Oct 29, 2015, at 8:55 PM, Xiaodong Zhang <[email protected]> wrote:
>>>>
>>>> Thanks a lot !~ @haosdent
>>>>
>>>> From: haosdent <[email protected]>
>>>> Reply-To: "[email protected]" <[email protected]>
>>>> Date: Friday, October 30, 2015, 11:45 AM
>>>> To: user <[email protected]>
>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>
>>>> Hi, @Xiaodong I am interested in your problem, but these days I don't
>>>> have enough time to try to reproduce it. I think I can dig into
>>>> it this Sunday and give you feedback.
>>>>
>>>> On Fri, Oct 30, 2015 at 11:30 AM, Xiaodong Zhang <[email protected]>
>>>> wrote:
>>>>
>>>>> Does anybody know about this?
>>>>>
>>>>> From: Xiaodong Zhang <[email protected]>
>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>> Date: Thursday, October 29, 2015, 7:38 PM
>>>>>
>>>>> To: "[email protected]" <[email protected]>
>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>
>>>>> I think it is easy to reproduce this error.
>>>>>
>>>>> Start master with env:
>>>>>
>>>>> SSL_SUPPORT_DOWNGRADE
>>>>> SSL_ENABLED
>>>>> SSL_KEY_FILE
>>>>> SSL_CERT_FILE
>>>>>
>>>>> Start slave with env:
>>>>>
>>>>> SSL_ENABLED
>>>>> SSL_KEY_FILE
>>>>> SSL_CERT_FILE
>>>>> LIBPROCESS_ADVERTISE_IP
>>>>>
>>>>>
>>>>> Then run a docker task via marathon.
>>>>>
>>>>> From: Xiaodong Zhang <[email protected]>
>>>>> Date: Thursday, October 29, 2015, 3:09 PM
>>>>> To: "[email protected]" <[email protected]>
>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>
>>>>> So now, Mesos tasks work well but docker tasks don't.
>>>>>
>>>>> From: Xiaodong Zhang <[email protected]>
>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>> Date: Thursday, October 29, 2015, 2:08 PM
>>>>> To: "[email protected]" <[email protected]>
>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>
>>>>> I ran a task via marathon:
>>>>>
>>>>> {
>>>>> "id": "basic-0",
>>>>> "cmd": "while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done",
>>>>> "cpus": 0.1,
>>>>> "mem": 10.0,
>>>>> "instances": 1}
>>>>>
>>>>>
>>>>> It works well.
>>>>>
>>>>> <742629F2-78E8-43F2-9015-F3D22720826B.png>
>>>>>
>>>>> The docker task can pull the image but can't run, as I mentioned.
>>>>>
>>>>> My docker version is 1.5.0.
>>>>>
>>>>> From: Tim Chen <[email protected]>
>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>> Date: Thursday, October 29, 2015, 1:48 PM
>>>>> To: "[email protected]" <[email protected]>
>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>
>>>>> Does running a task without a docker container (Mesos containerizer)
>>>>> work with SSL in your environment?
>>>>>
>>>>> Tim
>>>>>
>>>>> On Wed, Oct 28, 2015 at 10:19 PM, Xiaodong Zhang <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Thanks a lot. I found the log file on the slave.
>>>>>> >>>>>> One of the task: >>>>>> >>>>>> Stdout: >>>>>> >>>>>> --container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444" >>>>>> --docker="/home/ubuntu/luna/bin/docker" --help="false" >>>>>> --initialize_driver_logging="true" --logbufsecs="0" >>>>>> --logging_level="INFO" >>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" >>>>>> --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444" >>>>>> --stop_timeout="0ns" >>>>>> --container="mesos-20151029-043755-3549436724-5050-5674-S0.e2c2580f-8082-4f17-b0cc-4e32e040d444" >>>>>> --docker="/home/ubuntu/luna/bin/docker" --help="false" >>>>>> --initialize_driver_logging="true" --logbufsecs="0" >>>>>> --logging_level="INFO" >>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" >>>>>> --sandbox_directory="/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444" >>>>>> --stop_timeout="0ns" >>>>>> Shutting down >>>>>> >>>>>> Stderr: >>>>>> >>>>>> I1029 05:14:06.529364 27862 fetcher.cpp:414] Fetcher Info: >>>>>> {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/20151029-043755-3549436724-5050-5674-S0","items":[{"action":"BYPASS_CACHE","uri":{"extract":false,"value":"file:\/\/\/etc\/.dockercfg"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/20151029-043755-3549436724-5050-5674-S0\/frameworks\/20151029-043755-3549436724-5050-5674-0000\/executors\/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f\/runs\/e2c2580f-8082-4f17-b0cc-4e32e040d444"} >>>>>> I1029 05:14:06.530562 27862 fetcher.cpp:369] Fetching URI ' >>>>>> file:///etc/.dockercfg' >>>>>> I1029 05:14:06.530580 27862 
fetcher.cpp:243] Fetching directly into >>>>>> the sandbox directory >>>>>> I1029 05:14:06.530594 27862 fetcher.cpp:180] Fetching URI ' >>>>>> file:///etc/.dockercfg' >>>>>> I1029 05:14:06.530609 27862 fetcher.cpp:160] Copying resource with >>>>>> command:cp '/etc/.dockercfg' >>>>>> '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg' >>>>>> I1029 05:14:06.532165 27862 fetcher.cpp:446] Fetched ' >>>>>> file:///etc/.dockercfg' to >>>>>> '/tmp/mesos/slaves/20151029-043755-3549436724-5050-5674-S0/frameworks/20151029-043755-3549436724-5050-5674-0000/executors/e4a3bed5-64e6-4970-8bb1-df6404656a48.e3a20f3b-7dfb-11e5-b57b-0247b493b22f/runs/e2c2580f-8082-4f17-b0cc-4e32e040d444/.dockercfg' >>>>>> I1029 05:14:07.782054 27955 exec.cpp:133] Version: 0.24.1 >>>>>> I1029 05:14:07.785039 27963 exec.cpp:462] Slave exited ... shutting >>>>>> down >>>>>> E1029 05:14:07.785158 27964 socket.hpp:174] Shutdown failed on fd=7: >>>>>> Transport endpoint is not connected [107] >>>>>> >>>>>> 发件人: haosdent <[email protected]> >>>>>> 答复: "[email protected]" <[email protected]> >>>>>> 日期: 2015年10月29日 星期四 下午1:13 >>>>>> >>>>>> 至: user <[email protected]> >>>>>> 主题: Re: Can't start docker container when SSL_ENABLED is on. >>>>>> >>>>>> <5185_02_04.png> >>>>>> <5185_02_07.png> >>>>>> >>>>>> I capture how I find tasks log in my local webui, could you find the >>>>>> stderr and stdout for your tasks according above screenshots? >>>>>> >>>>>> >>>>>> On Thu, Oct 29, 2015 at 1:07 PM, Xiaodong Zhang <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> I didn’t see some useful info. 
>>>>>>> >>>>>>> In mesos slave log, there is a line : >>>>>>> I1029 03:29:53.160143 9292 slave.cpp:3399] Executor >>>>>>> '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' >>>>>>> of framework 20151029-031549-1294671788-5050-4937-0000 terminated >>>>>>> with signal Killed >>>>>>> >>>>>>> I check the normal log, it shows: >>>>>>> >>>>>>> I1014 15:22:21.276007 23163 slave.cpp:3326] Executor >>>>>>> 'ffc08dce-997f-41f7-9b03-57c1b4bc1f85.47ed02aa-7285-11e5-80d7-000d3a8033de' >>>>>>> of framework 20150814-115157-1677721866-5050-6185-0000 exited with >>>>>>> status 0 >>>>>>> >>>>>>> Is this helpful? >>>>>>> >>>>>>> 发件人: Xiaodong Zhang <[email protected]> >>>>>>> 答复: "[email protected]" <[email protected]> >>>>>>> 日期: 2015年10月29日 星期四 下午12:59 >>>>>>> 至: "[email protected]" <[email protected]> >>>>>>> >>>>>>> 主题: Re: Can't start docker container when SSL_ENABLED is on. >>>>>>> >>>>>>> <9D46724C-457C-4BE1-B0E4-F57B147F6DC8.png> >>>>>>> >>>>>>> The webui have a LOG link, when click it shows like this: >>>>>>> >>>>>>> I1029 04:44:32.293445 5697 http.cpp:321] HTTP GET for >>>>>>> /master/state.json from 114.113.20.135:55682 with >>>>>>> User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) >>>>>>> AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 >>>>>>> Safari/537.36' >>>>>>> I1029 04:44:34.533504 5704 master.cpp:4613] Sending 1 offers to >>>>>>> framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>>>> [email protected]:53373 >>>>>>> I1029 04:44:34.539579 5702 master.cpp:2739] Processing ACCEPT call >>>>>>> for offers: [ 20151029-043755-3549436724-5050-5674-O2 ] on slave >>>>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@ >>>>>>> 50.112.136.148:5051 ( >>>>>>> ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework >>>>>>> 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>>>> [email protected]:53373 >>>>>>> I1029 04:44:34.539710 5702 hierarchical.hpp:814] Recovered >>>>>>> cpus(*):1; 
mem(*):999; disk(*):3962; ports(*):[31000-32000] (total: >>>>>>> cpus(*):1; mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: >>>>>>> ) >>>>>>> on slave 20151029-043755-3549436724-5050-5674-S0 from framework >>>>>>> 20151029-043755-3549436724-5050-5674-0000 >>>>>>> I1029 04:44:37.360901 5703 master.cpp:4294] Performing implicit >>>>>>> task state reconciliation for framework >>>>>>> 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>>>> [email protected]:53373 >>>>>>> I1029 04:44:40.539989 5704 master.cpp:4613] Sending 1 offers to >>>>>>> framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>>>> [email protected]:53373 >>>>>>> I1029 04:44:40.610321 5702 master.cpp:2739] Processing ACCEPT call >>>>>>> for offers: [ 20151029-043755-3549436724-5050-5674-O3 ] on slave >>>>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@ >>>>>>> 50.112.136.148:5051 ( >>>>>>> ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework >>>>>>> 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>>>> [email protected]:53373 >>>>>>> I1029 04:44:40.610846 5702 master.hpp:170] Adding task >>>>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>>>> with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on >>>>>>> slave >>>>>>> 20151029-043755-3549436724-5050-5674-S0 ( >>>>>>> ec2-50-112-136-148.us-west-2.compute.amazonaws.com) >>>>>>> I1029 04:44:40.610911 5702 master.cpp:3069] Launching task >>>>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>>>> of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>>>> [email protected]:53373 >>>>>>> with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] on >>>>>>> slave >>>>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@ >>>>>>> 50.112.136.148:5051 ( >>>>>>> ec2-50-112-136-148.us-west-2.compute.amazonaws.com) >>>>>>> I1029 04:44:40.611095 5702 hierarchical.hpp:814] Recovered >>>>>>> 
cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, >>>>>>> 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; >>>>>>> ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; >>>>>>> ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 >>>>>>> from framework 20151029-043755-3549436724-5050-5674-0000 >>>>>>> I1029 04:44:43.324970 5698 http.cpp:321] HTTP GET for >>>>>>> /master/state.json from 114.113.20.135:55682 with >>>>>>> User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) >>>>>>> AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 >>>>>>> Safari/537.36' >>>>>>> I1029 04:44:46.546671 5703 master.cpp:4613] Sending 1 offers to >>>>>>> framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>>>> [email protected]:53373 >>>>>>> I1029 04:44:46.557266 5699 master.cpp:2739] Processing ACCEPT call >>>>>>> for offers: [ 20151029-043755-3549436724-5050-5674-O4 ] on slave >>>>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@ >>>>>>> 50.112.136.148:5051 ( >>>>>>> ec2-50-112-136-148.us-west-2.compute.amazonaws.com) for framework >>>>>>> 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>>>> [email protected]:53373 >>>>>>> I1029 04:44:46.557394 5699 hierarchical.hpp:814] Recovered >>>>>>> cpus(*):0.9375; mem(*):743; disk(*):3962; ports(*):[31000-31863, >>>>>>> 31865-32000] (total: cpus(*):1; mem(*):999; disk(*):3962; >>>>>>> ports(*):[31000-32000], allocated: cpus(*):0.0625; mem(*):256; >>>>>>> ports(*):[31864-31864]) on slave 20151029-043755-3549436724-5050-5674-S0 >>>>>>> from framework 20151029-043755-3549436724-5050-5674-0000 >>>>>>> I1029 04:44:47.267562 5700 master.cpp:4069] Status update >>>>>>> TASK_FAILED (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task >>>>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>>>> of framework 20151029-043755-3549436724-5050-5674-0000 from slave >>>>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@ 
>>>>>>> 50.112.136.148:5051 ( >>>>>>> ec2-50-112-136-148.us-west-2.compute.amazonaws.com) >>>>>>> I1029 04:44:47.267645 5700 master.cpp:4108] Forwarding status >>>>>>> update TASK_FAILED (UUID: 0ea607fc-bf24-4bda-b107-55a54aba31cf) for task >>>>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>>>> of framework 20151029-043755-3549436724-5050-5674-0000 >>>>>>> I1029 04:44:47.267774 5700 master.cpp:5576] Updating the latest >>>>>>> state of task >>>>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>>>> of framework 20151029-043755-3549436724-5050-5674-0000 to TASK_FAILED >>>>>>> I1029 04:44:47.267907 5700 hierarchical.hpp:814] Recovered >>>>>>> cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] (total: cpus(*):1; >>>>>>> mem(*):999; disk(*):3962; ports(*):[31000-32000], allocated: ) on slave >>>>>>> 20151029-043755-3549436724-5050-5674-S0 from framework >>>>>>> 20151029-043755-3549436724-5050-5674-0000 >>>>>>> I1029 04:44:47.289356 5698 master.cpp:5644] Removing task >>>>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>>>> with resources cpus(*):0.0625; mem(*):256; ports(*):[31864-31864] of >>>>>>> framework 20151029-043755-3549436724-5050-5674-0000 on slave >>>>>>> 20151029-043755-3549436724-5050-5674-S0 at slave(1)@ >>>>>>> 50.112.136.148:5051 ( >>>>>>> ec2-50-112-136-148.us-west-2.compute.amazonaws.com) >>>>>>> I1029 04:44:47.289459 5698 master.cpp:3398] Processing ACKNOWLEDGE >>>>>>> call 0ea607fc-bf24-4bda-b107-55a54aba31cf for task >>>>>>> e4a3bed5-64e6-4970-8bb1-df6404656a48.c4239b84-7df7-11e5-b57b-0247b493b22f >>>>>>> of framework 20151029-043755-3549436724-5050-5674-0000 (marathon) at >>>>>>> [email protected]:53373 >>>>>>> on slave 20151029-043755-3549436724-5050-5674-S0 >>>>>>> >>>>>>> >>>>>>> >>>>>>> 发件人: haosdent <[email protected]> >>>>>>> 答复: "[email protected]" <[email protected]> >>>>>>> 日期: 2015年10月29日 星期四 下午12:02 >>>>>>> 至: user <[email protected]> 
>>>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>>>
>>>>>>> Oh, I mean your task logs. You can get them from the Mesos webui.
>>>>>>>
>>>>>>> On Thu, Oct 29, 2015 at 11:52 AM, Xiaodong Zhang <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks for your reply.
>>>>>>>>
>>>>>>>> Yes, I built Mesos with `--enable-libevent --enable-ssl`. If I
>>>>>>>> don't provide the key and pem when starting the slave, it fails to
>>>>>>>> register. (That means SSL works well, right?)
>>>>>>>>
>>>>>>>> As I said, the odd thing is that the container never runs (`docker ps -a`
>>>>>>>> shows nothing), so it can't have any stdout or stderr.
>>>>>>>>
>>>>>>>> From: haosdent <[email protected]>
>>>>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>>>>> Date: Thursday, October 29, 2015, 11:47 AM
>>>>>>>> To: user <[email protected]>
>>>>>>>> Subject: Re: Can't start docker container when SSL_ENABLED is on.
>>>>>>>>
>>>>>>>> Did you compile Mesos with SSL support? The default build doesn't
>>>>>>>> include SSL. And does the docker container have stdout and stderr?
>>>>>>>>
>>>>>>>> On Thu, Oct 29, 2015 at 11:41 AM, Xiaodong Zhang <[email protected]
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> My scenario is as the previous email says: masters and slaves are
>>>>>>>>> in different IaaS. Now the slaves can register with the masters
>>>>>>>>> with SSL_ENABLED on.
>>>>>>>>>
>>>>>>>>> But I have met another problem.
Slaves can’t run container(the odd >>>>>>>>> thing is they can pull image successfully,just can not run container, >>>>>>>>> `docker ps –a ` list nothing) >>>>>>>>> >>>>>>>>> The logs like this: >>>>>>>>> >>>>>>>>> I1029 03:29:45.967741 9288 docker.cpp:758] Starting container >>>>>>>>> 'd4f4e236-0d0a-492c-86df-eef48a414e23' for task >>>>>>>>> '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' >>>>>>>>> (and executor >>>>>>>>> '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713') >>>>>>>>> of framework '20151029-031549-1294671788-5050-4937-0000' >>>>>>>>> I1029 03:29:48.044148 9292 docker.cpp:382] Checkpointing pid >>>>>>>>> 12062 to >>>>>>>>> '/tmp/mesos/meta/slaves/20151029-031549-1294671788-5050-4937-S0/frameworks/20151029-031549-1294671788-5050-4937-0000/executors/279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713/runs/d4f4e236-0d0a-492c-86df-eef48a414e23/pids/forked.pid' >>>>>>>>> I1029 03:29:53.159361 9292 docker.cpp:1576] Executor for >>>>>>>>> container 'd4f4e236-0d0a-492c-86df-eef48a414e23' has exited >>>>>>>>> I1029 03:29:53.159572 9292 docker.cpp:1374] Destroying container >>>>>>>>> 'd4f4e236-0d0a-492c-86df-eef48a414e23' >>>>>>>>> I1029 03:29:53.159822 9292 docker.cpp:1478] Running docker stop >>>>>>>>> on container 'd4f4e236-0d0a-492c-86df-eef48a414e23' >>>>>>>>> I1029 03:29:53.160143 9292 slave.cpp:3399] Executor >>>>>>>>> '279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713' >>>>>>>>> of framework 20151029-031549-1294671788-5050-4937-0000 terminated >>>>>>>>> with signal Killed >>>>>>>>> I1029 03:29:53.160884 9292 slave.cpp:2696] Handling status update >>>>>>>>> TASK_FAILED (UUID: 27a2080a-8807-449e-9077-837ec45b4c51) for task >>>>>>>>> 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713 >>>>>>>>> of framework 20151029-031549-1294671788-5050-4937-0000 from @ >>>>>>>>> 0.0.0.0:0 >>>>>>>>> W1029 03:29:53.161247 9288 docker.cpp:986] Ignoring 
updating >>>>>>>>> unknown container: d4f4e236-0d0a-492c-86df-eef48a414e23 >>>>>>>>> I1029 03:29:53.161548 9293 status_update_manager.cpp:322] >>>>>>>>> Received status update TASK_FAILED (UUID: >>>>>>>>> 27a2080a-8807-449e-9077-837ec45b4c51) for task >>>>>>>>> 279bcb34-f705-4857-96ad-d96843b848fb.4b3abdcd-7ded-11e5-a82d-0240afabf713 >>>>>>>>> of framework 20151029-031549-1294671788-5050-4937-0000 >>>>>>>>> >>>>>>>>> I run master node with env: >>>>>>>>> >>>>>>>>> SSL_SUPPORT_DOWNGRADE=true >>>>>>>>> SSL_ENABLED=true >>>>>>>>> SSL_KEY_FILE=/home/ubuntu/xx.key >>>>>>>>> SSL_CERT_FILE=/home/ubuntu/xx.pem >>>>>>>>> >>>>>>>>> Slave node with env: >>>>>>>>> >>>>>>>>> SSL_ENABLED=true >>>>>>>>> SSL_KEY_FILE=/home/ubuntu/xx.key >>>>>>>>> SSL_CERT_FILE=/home/ubuntu/xx.pem >>>>>>>>> LIBPROCESS_ADVERTISE_IP=xxx.xxx.xxx.xxx >>>>>>>>> >>>>>>>>> When I remove all SSL envs. Slaves work well. >>>>>>>>> >>>>>>>>> Did I miss sth? >>>>>>>>> >>>>>>>>> Version: >>>>>>>>> >>>>>>>>> Mesos 0.24.1 >>>>>>>>> Maraton 0.9.2 >>>>>>>>> >>>>>>>>> OS >>>>>>>>> ubuntu 14.04 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 发件人: Anindya Sinha <[email protected]> >>>>>>>>> 答复: "[email protected]" <[email protected]> >>>>>>>>> 日期: 2015年10月28日 星期三 下午2:32 >>>>>>>>> 至: "[email protected]" <[email protected]> >>>>>>>>> 主题: Re: How to tell master which ip to connect. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Oct 27, 2015 at 7:43 PM, Xiaodong Zhang <[email protected] >>>>>>>>> > wrote: >>>>>>>>> >>>>>>>>>> It works! Thanks a lot. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Ok. So we should expose advertise_ip and advertise_port as command >>>>>>>>> line options for mesos-slave as well (instead of using the environment >>>>>>>>> variables)? Opened >>>>>>>>> https://issues.apache.org/jira/browse/MESOS-3809. >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Another question. Do masters and slaves communicate each other >>>>>>>>>> via a safety way?Is the data encrypted? 
I want to make sure that deploying
>>>>>>>>>> masters and slaves into different IaaS is PROD-READY.
>>>>>>>>>>
>>>>>>>>>> From: haosdent <[email protected]>
>>>>>>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>>>>>>> Date: Wednesday, October 28, 2015, 10:23 AM
>>>>>>>>>> To: user <[email protected]>
>>>>>>>>>> Subject: Re: How to tell master which ip to connect.
>>>>>>>>>>
>>>>>>>>>> Did you try `export LIBPROCESS_ADVERTISE_IP=xxx` and
>>>>>>>>>> `LIBPROCESS_ADVERTISE_PORT` when starting the slave?
>>>>>>>>>>
>>>>>>>>>> On Wed, Oct 28, 2015 at 10:16 AM, Xiaodong Zhang <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi team:
>>>>>>>>>>>
>>>>>>>>>>> My scenario is like this:
>>>>>>>>>>>
>>>>>>>>>>> My master nodes are deployed in AWS and my slaves are in Azure, so
>>>>>>>>>>> they communicate via public IPs.
>>>>>>>>>>> I ran into trouble when the slaves try to register with the master.
>>>>>>>>>>> The slaves can get the master's public IP address and can send the
>>>>>>>>>>> register request, but they can only send their private IP to the
>>>>>>>>>>> master (because they don't know their public IP, they can't bind a
>>>>>>>>>>> public IP via the --ip flag), so the masters can't connect to the
>>>>>>>>>>> slaves. How can a slave tell the master which IP the master should
>>>>>>>>>>> connect to? (I can't find any flag like --advertise_ip in master.)
--
Best Regards,
Haosdent Huang

