Re: Help needed (alas, urgently)
Hi Paul,

No problem; I haven't even spent much time on it yet, and I'm glad you resolved the problem yourself. We always welcome people doing this :)

Tim

On Fri, Jan 15, 2016 at 10:48 AM, Paul Bell wrote:
Re: Help needed (alas, urgently)
Tim,

I've tracked down the cause of this problem: it's the result of some kind of incompatibility between kernel 3.19 and "VMware Tools". I know little more than that.

I installed VMware Tools via *apt-get install open-vm-tools-lts-trusty*. Everything worked fine on 3.13. But when I upgrade to 3.19, the error occurs quite reliably. Revert back to 3.13 and the error goes away.

I looked high & low for some statement of kernel requirements for VMware Tools, but can find none.

Sorry to have wasted your time.

-Paul

On Fri, Jan 15, 2016 at 9:19 AM, Paul Bell wrote:
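For anyone who lands on this thread with the same symptom, a quick way to compare the running kernel against what open-vm-tools actually loaded (a diagnostic sketch, assuming a VMware guest with the open-vm-tools-lts-trusty package; exact module names vary by kernel release):

    # Running kernel vs. installed tools package
    uname -r
    dpkg -l | grep open-vm-tools

    # VMware guest modules actually loaded (vmw_balloon, vmw_vmci, vmxnet3
    # are typical); an empty list after a kernel upgrade suggests the tools
    # modules were not built for the new kernel
    lsmod | grep -i vmw

    # Is the guest daemon itself running?
    ps -ef | grep '[v]mtoolsd'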
Re: Help needed (alas, urgently)
In chasing down this problem, I stumbled upon something of moment: the problem does NOT seem to happen with kernel 3.13.

Some weeks back, in the hope of getting past another problem wherein the root filesystem "becomes" R/O, I upgraded from 3.13 to 3.19 (Ubuntu 14.04 LTS). The kernel upgrade was done as shown here (there's some extra stuff to get rid of the Ubuntu desktop and liberate some disk space):

apt-get update
apt-get -y remove ubuntu-desktop
apt-get -y purge lightdm
rm -Rf /var/lib/lightdm-data
apt-get -y remove --purge libreoffice-core
apt-get -y remove --purge libreoffice-common

echo "Installing new kernel"

apt-get -y install linux-generic-lts-vivid
apt-get -y autoremove linux-image-3.13.0-32-generic
apt-get -y autoremove linux-image-3.13.0-71-generic
update-grub
reboot

After the reboot, "uname -r" shows kernel 3.19.0-42-generic.

Under this kernel I can now reliably reproduce the failure to stop a MongoDB container. Specifically, any & all attempts to kill the container, e.g., via

- a Marathon HTTP DELETE (which leads to mesos-docker-executor issuing a "docker stop" command)
- getting a shell inside the running container and issuing "kill" or db.shutdownServer()

cause the mongod container

- to show in its log that it's shutting down normally
- to enter a 100% CPU loop
- to become unkillable (only a reboot "fixes" things)

Note finally that my conclusion about kernel 3.13 "working" is at present a weak induction. But I do know that when I reverted to that kernel I could, at least once, stop the containers w/o any problems, whereas at 3.19 I can reliably reproduce the problem. I will try to make this induction stronger as the day wears on.

Did I do something "wrong" in my kernel upgrade steps?

Is anyone aware of such an issue in 3.19, or of work done post-3.13 in the area of task termination & signal handling?

Thanks for your help.

-Paul

On Thu, Jan 14, 2016 at 5:14 PM, Paul Bell wrote:
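A container that logs a clean shutdown and then spins at 100% CPU while ignoring SIGKILL is generally stuck in kernel space rather than in mongod itself; signals are only delivered when the task returns to user space. A few standard checks (a sketch; <PID> stands for whatever "docker top" or ps reports for the stuck mongod):

    # Identify the stuck process inside the container
    docker top <container-name>        # or: ps -ef | grep mongod

    # "R" state fits a kernel-mode spin; "D" would mean uninterruptible
    # sleep. wchan shows where in the kernel the task is waiting, if it is.
    ps -o pid,stat,wchan:30,comm -p <PID>

    # Kernel stack of the stuck task (needs root). If repeated reads show
    # the same frames, that code path is where it is looping.
    cat /proc/<PID>/stack
    cat /proc/<PID>/stack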
Re: Help needed (alas, urgently)
I spoke too soon, I'm afraid.

The next time I did the stop (with zero timeout), I saw the same phenomenon: a mongo container showing repeated:

killing docker task
shutting down

What else can I try?

Thank you.

On Thu, Jan 14, 2016 at 5:07 PM, Paul Bell wrote:
Re: Help needed (alas, urgently)
Hi Tim,

I set docker_stop_timeout to zero as you asked. I am pleased to report (though a bit fearful about being pleased) that this change seems to have shut everyone down pretty much instantly.

Can you explain what's happening, e.g., does docker_stop_timeout=0 cause the immediate use of "kill -9" (SIGKILL) as opposed to "kill -15" (SIGTERM)?

I will keep testing the behavior.

Thank you.

-Paul

On Thu, Jan 14, 2016 at 3:59 PM, Paul Bell wrote:
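On the "what's happening" question: "docker stop -t <n>" always sends SIGTERM first, waits up to n seconds, and then sends SIGKILL, so docker_stop_timeout=0 means the SIGKILL follows essentially immediately (note that SIGTERM is kill -15, not kill -2). A quick way to see the escalation, assuming only a local Docker daemon, the busybox image, and a throwaway container name "t":

    # A container whose PID 1 ignores SIGTERM, much like a wedged mongod
    docker run -d --name t busybox sh -c 'trap "" TERM; sleep 3600'

    # With a 15s grace period, the stop hangs for the full 15 seconds
    # before the daemon escalates to SIGKILL
    time docker stop -t 15 t

    # Restart it, then stop with a zero grace period: SIGKILL is sent
    # right away and the container dies almost instantly
    docker start t
    time docker stop -t 0 t

    docker rm t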
Re: Help needed (alas, urgently)
Hi Tim,

Things have gotten slightly odder (if that's possible). When I now start the application's 5 or so containers, only one, "ecxconfigdb", gets started - and even he took a few tries. That is, I see him failing, moving to deploying, then starting again. But I've no evidence (no STDOUT, and no docker ctr logs) that shows why.

In any event, ecxconfigdb does start. Happily, when I try to stop the application I am seeing the phenomena I posted before: "killing docker task", "shutting down" repeated many times. The un-stopped container is now running at 100% CPU.

I will try modifying docker_stop_timeout. Back shortly.

Thanks again.

-Paul

PS: what do you make of the "broken pipe" error in the docker.log?

*from /var/log/upstart/docker.log*

INFO[3054] GET /v1.15/images/mongo:2.6.8/json
INFO[3054] GET /v1.21/images/mesos-20160114-153418-1674208327-5050-3798-S0.c5c35d59-1318-4a96-b850-b0b788815f1b/json
ERRO[3054] Handler for GET /v1.21/images/mesos-20160114-153418-1674208327-5050-3798-S0.c5c35d59-1318-4a96-b850-b0b788815f1b/json returned error: No such image: mesos-20160114-153418-1674208327-5050-3798-S0.c5c35d59-1318-4a96-b850-b0b788815f1b
ERRO[3054] HTTP Error err=No such image: mesos-20160114-153418-1674208327-5050-3798-S0.c5c35d59-1318-4a96-b850-b0b788815f1b statusCode=404
INFO[3054] GET /v1.15/containers/weave/json
INFO[3054] POST /v1.21/containers/create?name=mesos-20160114-153418-1674208327-5050-3798-S0.c5c35d59-1318-4a96-b850-b0b788815f1b
INFO[3054] POST /v1.21/containers/cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47/attach?stderr=1&stdout=1&stream=1
INFO[3054] POST /v1.21/containers/cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47/start
INFO[3054] GET /v1.15/containers/cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47/json
INFO[3054] GET /v1.15/containers/cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47/json
INFO[3054] GET /v1.15/containers/weave/json
INFO[3054] GET /v1.15/containers/cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47/json
INFO[3054] GET /v1.15/containers/cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47/json
INFO[3054] GET /v1.15/containers/weave/json
INFO[3054] GET /v1.15/containers/cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47/json
INFO[3054] GET /v1.21/containers/mesos-20160114-153418-1674208327-5050-3798-S0.c5c35d59-1318-4a96-b850-b0b788815f1b/json
INFO[3111] GET /v1.21/containers/json
INFO[3120] GET /v1.21/containers/cf7/json
INFO[3120] GET /v1.21/containers/cf7/logs?stderr=1&stdout=1&tail=all
INFO[3153] GET /containers/json
INFO[3153] GET /containers/cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47/json
INFO[3153] GET /containers/56111722ef83134f6c73c5e3aa27de3f34f1fa73efdec3257c3cc9b283e40729/json
INFO[3153] GET /containers/b9e9b79a8d431455bfcaafca59223017b2470a47a294075d656eeffdaaefad33/json
INFO[3175] GET /containers/json
INFO[3175] GET /containers/cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47/json
INFO[3175] GET /containers/56111722ef83134f6c73c5e3aa27de3f34f1fa73efdec3257c3cc9b283e40729/json
INFO[3175] GET /containers/b9e9b79a8d431455bfcaafca59223017b2470a47a294075d656eeffdaaefad33/json
*INFO[3175] POST /v1.21/containers/mesos-20160114-153418-1674208327-5050-3798-S0.c5c35d59-1318-4a96-b850-b0b788815f1b/stop?t=15*
*ERRO[3175] attach: stdout: write unix @: broken pipe*
*INFO[3190] Container cf7fc7c483248e30f1dbb5990ce8874f2bfbe936c74eed1fc9af6f70653a1d47 failed to exit within 15 seconds of SIGTERM - using the force*
*INFO[3200] Container cf7fc7c48324 failed to exit within 10 seconds of kill - trying direct SIGKILL*

*STDOUT from Mesos:*

--container="mesos-20160114-153418-1674208327-5050-3798-S0.c5c35d59-1318-4a96-b850-b0b788815f1b"
--docker="/usr/local/ecxmcc/weaveShim"
--help="false"
--initialize_driver_logging="true"
--logbufsecs="0"
--logging_level="INFO"
--mapped_directory="/mnt/mesos/sandbox"
--quiet="false"
--sandbox_directory="/tmp/mesos/slaves/20160114-153418-1674208327-5050-3798-S0/frameworks/20160114-103414-1674208327-5050-3293-/executors/ecxconfigdb.c3cae92e-baff-11e5-8afe-82f779ac6285/runs/c5c35d59-1318-4a96-b850-b0b788815f1b"
--stop_timeout="15secs"
--container="mesos-20160114-153418-1674208327-5050-3798-S0.c5c35d59-1318-4a96-b850-b0b788815f1b"
--docker="/usr/local/ecxmcc/weaveShim"
--help="false"
--initialize_driver_logging="true"
--logbufsecs="0"
--logging_level="INFO"
--mapped_directory="/mnt/mesos/sandbox"
--quiet="false"
--sandbox_directory="/tmp/mesos/slaves/20160114-153418-1674208327-5050-3798-S0/frameworks/20160114-103414-1674208327-5050-3293-/executors/ecxconfigdb.c3cae92e-baff-11e5-8afe-82f779ac6285/r
Re: Help needed (alas, urgently)
Hey Tim,

Thank you very much for your reply. Yes, I am in the midst of trying to reproduce the problem. If successful (so to speak), I will do as you ask.

Cordially,

Paul

On Thu, Jan 14, 2016 at 3:19 PM, Tim Chen wrote:
Re: Help needed (alas, urgently)
Hi Paul,

Looks like we've already issued the docker stop, as seen in the ps output, but the containers are still running. Can you look at the Docker daemon logs and see what's going on there?

And also, can you try to modify docker_stop_timeout to 0 so that we SIGKILL the containers right away, and see if this still happens?

Tim

On Thu, Jan 14, 2016 at 11:52 AM, Paul Bell wrote:
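For reference, docker_stop_timeout is a mesos-slave flag: the ps output in Paul's original message below shows it being passed as --docker_stop_timeout=15secs, and the Docker containerizer forwards it to "docker stop" as the -t grace period. A sketch of Tim's suggestion against that same invocation (other flags elided; the "0secs" spelling assumes Mesos's usual Duration syntax):

    /usr/sbin/mesos-slave \
      --master=zk://71.100.202.99:2181/mesos \
      --containerizers=docker,mesos \
      --docker=/usr/local/ecxmcc/weaveShim \
      --docker_stop_timeout=0secs
      # ...plus the remaining flags from the original ps output
      # (--log_dir, --hostname, --ip, --attributes, --resources, etc.)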
Help needed (alas, urgently)
Hi All,

It's been quite some time since I've posted here, and that's chiefly because up until a day or two ago, things were working really well.

I actually may have posted about this some time back. But then the problem seemed more intermittent.

In summa, several "docker stops" don't work, i.e., the containers are not stopped.

Deployment:

one Ubuntu VM (VMware) LTS 14.04 with kernel 3.19
Zookeeper
Mesos-master (0.23.0)
Mesos-slave (0.23.0)
Marathon (0.10.0)
Docker 1.9.1
Weave 1.1.0
Our application containers, which include:
  MongoDB (4)
  Postgres
  ECX (our product)

The only thing that's changed at all in the config above is the version of Docker. It used to be 1.6.2, but I upgraded it today hoping to solve the problem.

My automater program stops the application by sending Marathon an "http delete" for each running app. Every now & then (reliably reproducible today) not all containers get stopped. Most recently, 3 containers failed to stop.

Here are the attendant phenomena:

Marathon shows the 3 applications in deployment mode (presumably "deployment" in the sense of "stopping").

*ps output:*

root@71:~# ps -ef | grep docker
root 3823 1 0 13:55 ? 00:00:02 /usr/bin/docker daemon -H unix:///var/run/docker.sock -H tcp://0.0.0.0:4243
root 4967 1 0 13:57 ? 00:00:01 /usr/sbin/mesos-slave --master=zk://71.100.202.99:2181/mesos --log_dir=/var/log/mesos --containerizers=docker,mesos --docker=/usr/local/ecxmcc/weaveShim --docker_stop_timeout=15secs --executor_registration_timeout=5mins --hostname=71.100.202.99 --ip=71.100.202.99 --attributes=hostType:ecx,shard1 --resources=ports:[31000-31999,8443-8443]
root 5263 3823 0 13:57 ? 00:00:00 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 6783 -container-ip 172.17.0.2 -container-port 6783
root 5271 3823 0 13:57 ? 00:00:00 docker-proxy -proto udp -host-ip 0.0.0.0 -host-port 6783 -container-ip 172.17.0.2 -container-port 6783
root 5279 3823 0 13:57 ? 00:00:00 docker-proxy -proto tcp -host-ip 172.17.0.1 -host-port 53 -container-ip 172.17.0.2 -container-port 53
root 5287 3823 0 13:57 ? 00:00:00 docker-proxy -proto udp -host-ip 172.17.0.1 -host-port 53 -container-ip 172.17.0.2 -container-port 53
root 7119 4967 0 14:00 ? 00:00:01 mesos-docker-executor --container=mesos-20160114-135722-1674208327-5050-4917-S0.bfc5a419-30f8-43f7-af2f-5582394532f2 --docker=/usr/local/ecxmcc/weaveShim --help=false --mapped_directory=/mnt/mesos/sandbox --sandbox_directory=/tmp/mesos/slaves/20160114-135722-1674208327-5050-4917-S0/frameworks/20160114-103414-1674208327-5050-3293-/executors/ecxconfigdb.1e6e0779-baf1-11e5-8c36-522bd4cc5ea9/runs/bfc5a419-30f8-43f7-af2f-5582394532f2 --stop_timeout=15secs
root 7378 4967 0 14:00 ? 00:00:01 mesos-docker-executor --container=mesos-20160114-135722-1674208327-5050-4917-S0.9b700cdc-3d29-49b7-a7fc-e543a91f7b89 --docker=/usr/local/ecxmcc/weaveShim --help=false --mapped_directory=/mnt/mesos/sandbox --sandbox_directory=/tmp/mesos/slaves/20160114-135722-1674208327-5050-4917-S0/frameworks/20160114-103414-1674208327-5050-3293-/executors/ecxcatalogdbs1.25911dda-baf1-11e5-8c36-522bd4cc5ea9/runs/9b700cdc-3d29-49b7-a7fc-e543a91f7b89 --stop_timeout=15secs
root 7640 4967 0 14:01 ? 00:00:01 mesos-docker-executor --container=mesos-20160114-135722-1674208327-5050-4917-S0.d7d861d3-cfc9-424d-b341-0631edea4298 --docker=/usr/local/ecxmcc/weaveShim --help=false --mapped_directory=/mnt/mesos/sandbox --sandbox_directory=/tmp/mesos/slaves/20160114-135722-1674208327-5050-4917-S0/frameworks/20160114-103414-1674208327-5050-3293-/executors/mongoconfig.2cb9163b-baf1-11e5-8c36-522bd4cc5ea9/runs/d7d861d3-cfc9-424d-b341-0631edea4298 --stop_timeout=15secs
*root 9696 9695 0 14:06 ? 00:00:00 /usr/bin/docker stop -t 15 mesos-20160114-135722-1674208327-5050-4917-S0.d7d861d3-cfc9-424d-b341-0631edea4298*
*root 9709 9708 0 14:06 ? 00:00:00 /usr/bin/docker stop -t 15 mesos-20160114-135722-1674208327-5050-4917-S0.9b700cdc-3d29-49b7-a7fc-e543a91f7b89*
*root 9720 9719 0 14:06 ? 00:00:00 /usr/bin/docker stop -t 15 mesos-20160114-135722-1674208327-5050-4917-S0.bfc5a419-30f8-43f7-af2f-5582394532f2*

*docker ps output:*

root@71:~# docker ps
CONTAINER ID   IMAGE         COMMAND                  CREATED          STATUS          PORTS       NAMES
5abafbfe7de2   mongo:2.6.8   "/w/w /entrypoint.sh "   11 minutes ago   Up 11 minutes   27017/tcp   mesos-20160114-135722-1674208327-5050-4917-S0.d7d861d3-cfc9-424d-b341-0631edea4298
a8449682ca2e   mongo:2.6.8   "/w/w /entrypoint.sh "   11 minutes ago   Up 11 minutes   27017/tcp   mesos-20160114-135722-1674208327-5050-4917-S0.9b700cdc-3d29-49b7-a7fc-e543a91f7b89
3b956457374b   mongo:2.6.8   "/w/w /entrypoint.sh "   11 minutes ago   Up 11
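For completeness, the "http delete" in question is Marathon's app-removal endpoint: deleting an app tells Marathon to kill the app's tasks, which the Docker containerizer turns into the "docker stop -t 15" processes visible in the ps output above. A minimal sketch, assuming Marathon is reachable on localhost:8080 and using a hypothetical app id "ecxconfigdb":

    # Ask Marathon to destroy the app and kill its tasks; Marathon answers
    # with a deployment id and begins stopping the containers
    curl -X DELETE http://localhost:8080/v2/apps/ecxconfigdb

    # Watch the deployment drain; a stop that never completes shows up
    # here as a deployment that never goes away
    curl http://localhost:8080/v2/deployments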