>Is there some way to be given control (a callback, or an "exit" routine) so that the container about to be nuked can be given a chance to exit gracefully?

The default value of executor_shutdown_grace_period is 5 seconds; you could change it by specifying the `--executor_shutdown_grace_period` flag when launching the mesos agent.
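For example (a sketch only; the ZooKeeper URL and the 60-second value here are illustrative, not recommendations):

    # Give executors 60 seconds, rather than the default 5, to exit cleanly.
    mesos-slave --master=zk://zk1:2181/mesos \
      --executor_shutdown_grace_period=60secs

If I remember correctly, within that window the Docker containerizer stops the container via `docker stop`, which delivers SIGTERM first and SIGKILL only after the timeout expires, so an entrypoint that traps SIGTERM can shut mongod down cleanly. A minimal, hypothetical entrypoint sketch (it must run as PID 1, i.e. use the exec form of ENTRYPOINT, or the signal never reaches it):

    #!/bin/sh
    # Forward SIGTERM to mongod so it can flush and close the
    # database cleanly before the grace period expires.
    trap 'kill -TERM "$child"' TERM
    mongod --dbpath /data/db &
    child=$!
    wait "$child"
    # wait returns early when the trap fires; wait again so the
    # script does not exit before mongod finishes shutting down.
    wait "$child"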
>Are there other steps I can take to avoid this mildly calamitous occurrence?

>mesos-slaves get shut down

Do you know where your mesos-master got stuck when this happened? Is there any error log or related log about it? In addition, is there any log from when the mesos-slave shut down?

On Wed, May 18, 2016 at 6:12 AM, Paul Bell <arach...@gmail.com> wrote:

> Hi All,
>
> I probably have the following account partly wrong, but let me present it
> just the same, and those who know better can correct me as needed.
>
> I've an application that runs several MongoDB shards, each a Dockerized
> container, each on a distinct node (VM); in fact, some of the VMs are on
> separate ESXi hosts.
>
> I've lately seen situations where, because of very slow disks for the
> database, the following sequence occurs (I think):
>
>    1. The Linux (Ubuntu 14.04 LTS) virtual memory manager hits the
>    thresholds defined by vm.dirty_background_ratio and/or vm.dirty_ratio
>    (probably both).
>    2. Synchronous flushing of many, many pages occurs, writing to a slow
>    disk.
>    3. (Around this time one might see in /var/log/syslog "task X blocked
>    for more than 120 seconds" for all kinds of tasks, including
>    mesos-master.)
>    4. mesos-slaves get shut down (this is the part I'm unclear about, but
>    I am quite certain that on 2 nodes the executors and their in-flight
>    MongoDB tasks got zapped, because I can see that Marathon restarted
>    them).
>
> The consequence of this is a corrupt MongoDB database. In the case at
> hand, the job had run for over 50 hours, processing close to 120 million
> files.
>
> Steps I've taken so far to remedy this include:
>
>    - Tune vm.dirty_background_ratio and vm.dirty_ratio down to 5 and 10,
>    respectively (from 10 and 20). The intent here is to tolerate more
>    frequent, smaller flushes and thus avoid less frequent massive flushes
>    that suspend threads for very long periods.
>    - Increase the agent ping timeout to 10 minutes (every 30 seconds, 20
>    times).
>
> So the questions are:
>
>    - Is there some way to be given control (a callback, or an "exit"
>    routine) so that the container about to be nuked can be given a chance
>    to exit gracefully?
>    - Are there other steps I can take to avoid this mildly calamitous
>    occurrence?
>    - (Also, I'd be grateful for more clarity on anything in steps 1-4
>    above that is a bit hand-wavy!)
>
> As always, thanks.
>
> -Paul

--
Best Regards,
Haosdent Huang