I thought of signal escalation as per-executor, or really anywhere we execute a command info as a subprocess. The new grace period is meant as the time an executor has to finish up its work; the other timeouts had to be changed as well, since in most cases they will be shorter. For custom executors, it is up to the executor to honor the timeout; otherwise, the executor process will kill it after timeout + delta.
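To make the escalation concrete, here is a minimal sketch in Python. It is not Mesos code: `kill_with_escalation` and `outer_timeout` are hypothetical names, the child is assumed to run on a POSIX system, and the "timeout + delta" rule is only the simple sum mentioned above, not either of the approaches from the design doc.

```python
import signal
import subprocess


def outer_timeout(inner_timeout, delta):
    """Hypothetical helper: the enclosing level waits a little longer
    (timeout + delta) than the level it supervises."""
    return inner_timeout + delta


def kill_with_escalation(proc, grace_period):
    """Ask a subprocess to stop with SIGTERM; if it is still running after
    grace_period seconds, escalate to SIGKILL.

    Returns the name of the last signal that was sent.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        # Give the process the whole grace period to clean up and exit.
        proc.wait(timeout=grace_period)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        # Grace period elapsed: the process did not honor the timeout.
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()  # reap the child so it does not linger as a zombie
        return "SIGKILL"
```

A task that handles SIGTERM exits within the grace period and never sees SIGKILL; one that ignores it is killed once the grace period expires. A supervising level would then use something like `outer_timeout(grace_period, delta)` for its own wait.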
Ben, are you thinking of a more generalized finalization mechanism (pluggable, programmable)?

Niklas

On 11 November 2014 10:34, Alex Rukletsov <a...@mesosphere.io> wrote:

> Ben,
>
> there are two scenarios: executor shutdown and killTask() in CommandExecutor. For the first use case, each custom executor is affected through the ExecutorProcess, which means two levels are involved (containerizer and executor) and should be synchronized.
>
> In the second scenario, each task is tied to its own CommandExecutor, therefore killing a task implies killing its executor. In this case, the shutdown grace period also becomes a signal escalation timeout, and conflating them, I think, is a good idea. The proposed design doc is an effort to align timeouts along the chain from slave to CommandExecutor.
>
> If I understand you correctly, we want to shut down any executor (task) gracefully, and not tie the grace period to CommandExecutor only. A good example pointed out by Ankur Chauhan is MESOS-1925 <https://issues.apache.org/jira/browse/MESOS-1925>: we can reuse the same grace shutdown flag for dockers. And if we later enable frameworks to adjust timeouts for their tasks (or executors, to be precise), we will be able to align the timeout used by docker finalization with the timeout in the docker container.
>
> On Mon, Nov 10, 2014 at 10:00 PM, Benjamin Mahler <benjamin.mah...@gmail.com> wrote:
>
> > I'm guessing most of the motivation here is actually for task killing escalation in the command executor? The shutdown grace period was designed for executor shutdown only, which today occurs only when the framework is being shut down (or recovery is cleaning up), or in the future, when frameworks ask to shut down a specific executor.
> >
> > In the case of the command executor, the slave won't do any escalation when a killTask arrives, since it's not trying to shut down the executor. For simplicity (I'm guessing), we conflated the executor shutdown grace period with the killTask signal escalation in the command executor.
> >
> > So, I'm still trying to figure out the concrete use case here: is it that you have command-tasks that implement a clean shutdown driven by SIGTERM? Going forward, is that enough, or would we want a more general notion of "Finalization" (e.g. driven by HTTP, or SIGTERM, or subprocess, etc.), much like the generic health checking that was added?
> >
> > On Mon, Nov 10, 2014 at 8:08 AM, Alex Rukletsov <a...@mesosphere.io> wrote:
> >
> > > Hi all,
> > >
> > > I would like to share the design doc for configurable grace period <https://docs.google.com/document/d/1_b3OPv3tjkub1T6VhQ27GnDfbVjnJ6IQ4ufPQhV1HM8/edit?usp=sharing>. The doc describes two approaches to calculate nested grace periods, points out implementation details and opens several design questions.
> > >
> > > I would highly appreciate any thoughts, ideas and suggestions!
> > >
> > > Thanks,
> > > Alex