-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25434/#review52726
-----------------------------------------------------------

I think this will work (and good job on style, btw!).

The biggest piece that I find missing is testing; we need tests to verify that the new escalation logic works as we expect: tests for long (more than 3 deltas) and short (smaller than delta) timeouts, and for combinations of the different levels responding or not responding to SIGTERM, and so on.

Also, we should make sure that there are no surprises when setting the grace period. I found it a bit surprising that small timeouts get chopped into halves and thirds. What happens for 0 second timeouts?

Would it make sense to introduce a helper class that centralizes the timeout logic/computation? We can add more verification there too, and it can be used to test the logic directly from our unit tests. Does this make sense?


src/exec/exec.cpp
<https://reviews.apache.org/r/25434/#comment91705>

    Small nit: We try to keep the variable names short and concise. I would have dropped the 'mesos' prefix.


src/launcher/executor.cpp
<https://reviews.apache.org/r/25434/#comment91708>

    Till raised an issue with namespace aliasing - did you guys sort that out? I am not a fan either.


src/slave/constants.hpp
<https://reviews.apache.org/r/25434/#comment91707>

    Small nit: the max column for comments is 70: http://mesos.apache.org/documentation/latest/mesos-c++-style-guide/


src/tests/containerizer.cpp
<https://reviews.apache.org/r/25434/#comment91706>

    Why 3 seconds?


- Niklas Nielsen


On Sept. 9, 2014, 5:54 a.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25434/
> -----------------------------------------------------------
> 
> (Updated Sept. 9, 2014, 5:54 a.m.)
> 
> 
> Review request for mesos, Niklas Nielsen, Till Toenshoff, and Timothy St. Clair.
> 
> 
> Bugs: MESOS-1571
>     https://issues.apache.org/jira/browse/MESOS-1571
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> The configurable slave's executor_shutdown_grace_period flag is propagated to
> Executor and CommandExecutor through an environment variable. Shutdown
> timeout in Executor and signal escalation timeout in CommandExecutor are now
> dependent on this flag. Each nested timeout is somewhat shorter than the
> parent one.
> 
> 
> Diffs
> -----
> 
>   src/exec/exec.cpp 36d1778 
>   src/launcher/executor.cpp 12ac14b 
>   src/slave/constants.hpp 9030871 
>   src/slave/constants.cpp e1da5c0 
>   src/slave/containerizer/containerizer.hpp 8a66412 
>   src/slave/containerizer/containerizer.cpp 0254679 
>   src/slave/containerizer/docker.cpp 0febbac 
>   src/slave/containerizer/external_containerizer.cpp efbc68f 
>   src/slave/containerizer/mesos/containerizer.cpp 9d08329 
>   src/slave/flags.hpp 21e0021 
>   src/tests/containerizer.cpp a17e1e0 
> 
> Diff: https://reviews.apache.org/r/25434/diff/
> 
> 
> Testing
> -------
> 
> make check (OS X 10.9.4; Ubuntu 14.04 amd64)
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>
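
To make the helper-class suggestion from the review comments a bit more concrete, here is a rough sketch of how the timeout computation might be centralized. Everything in it is an assumption for the sake of discussion rather than what the patch does: the name ShutdownTimeouts, the three numbered levels, the "longer than 3 deltas" threshold, and the rule itself (subtract one delta per level for long grace periods, divide by two and three for short ones). It uses plain std::chrono rather than any Mesos or stout types so that it stands alone.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>

// Hypothetical helper: the class name, the level numbering and the
// shrinking rule are assumptions for illustration only.
class ShutdownTimeouts
{
public:
  typedef std::chrono::milliseconds Duration;

  // 'grace' is the configured grace period; 'delta' is the margin
  // kept between two adjacent levels.
  ShutdownTimeouts(Duration grace, Duration delta)
    : grace_(grace), delta_(delta)
  {
    if (grace < Duration::zero() || delta <= Duration::zero()) {
      throw std::invalid_argument("grace must be >= 0 and delta > 0");
    }
  }

  // Level 0 is the outermost timeout, level 1 the next one in,
  // level 2 the innermost (e.g. signal escalation).
  Duration forLevel(int level) const
  {
    if (level < 0 || level > 2) {
      throw std::out_of_range("level must be 0, 1 or 2");
    }

    Duration result = grace_ > delta_ * 3
      ? grace_ - delta_ * level   // Long: one delta less per level.
      : grace_ / (level + 1);     // Short: halves and thirds.

    // Verification in one place: a nested timeout is never negative
    // and never longer than the grace period.
    assert(result >= Duration::zero() && result <= grace_);
    return result;
  }

private:
  Duration grace_;
  Duration delta_;
};
```

With something along these lines, the long, short and 0 second cases could be exercised directly from unit tests without starting an executor, and a surprising result for a small grace period would show up as a failed expectation on forLevel() rather than as a flaky shutdown.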