David Robinson created MESOS-5376:
-------------------------------------

             Summary: Add systemd watchdog support
                 Key: MESOS-5376
                 URL: https://issues.apache.org/jira/browse/MESOS-5376
             Project: Mesos
          Issue Type: Improvement
            Reporter: David Robinson


It would be great if Mesos supported systemd's 
[watchdog|http://0pointer.de/blog/projects/watchdog.html]. Today, users 
typically run a supervisor like [monit|https://mmonit.com/monit/] that polls the 
agent's or master's /health endpoint and restarts the process after consecutive 
failures. Systemd doesn't poll services; instead, the service uses the watchdog 
to report liveness to systemd at regular intervals. If Mesos had watchdog 
support, systemd could replace supervisors like monit. Note that simply 
restarting the service upon failure (i.e., when the process exits) is not 
sufficient: a deadlock within Mesos would not cause the process to exit, but a 
watchdog could detect it.
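
To make the request concrete, here is a minimal sketch of the service side of 
the watchdog protocol. It is written in Python purely for illustration (Mesos 
itself is C++) and is not a proposed patch. The NOTIFY_SOCKET and WATCHDOG_USEC 
environment variables and the "WATCHDOG=1" message are the ones described in 
sd_notify(3) and sd_watchdog_enabled(3); the is_healthy callback and the 
function names are hypothetical stand-ins for whatever internal check Mesos 
would use.

```python
import os
import socket
import threading
from typing import Callable, Optional


def sd_notify(message: str) -> bool:
    """Send one datagram to systemd's notification socket.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process is not
    running under systemd with notifications enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), path)
    return True


def watchdog_interval_seconds() -> Optional[float]:
    """Return half of the configured watchdog timeout, or None if disabled.

    systemd exports the timeout in microseconds as WATCHDOG_USEC; pinging
    at half that period is the interval its documentation recommends.
    """
    usec = os.environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    return int(usec) / 1_000_000 / 2


def run_watchdog(is_healthy: Callable[[], bool], stop: threading.Event) -> None:
    """Ping systemd while is_healthy() (hypothetical check) returns True.

    If the check hangs or fails, pings stop and systemd eventually treats
    the service as failed once the watchdog timeout expires.
    """
    interval = watchdog_interval_seconds()
    if interval is None:
        return
    while not stop.wait(interval):
        if is_healthy():
            sd_notify("WATCHDOG=1")
```

For this to have any effect, the service's unit file would also need a 
WatchdogSec= setting, and something like Restart=on-watchdog so that a missed 
ping leads to a restart. The health check should exercise the same code paths 
that can deadlock; a ping sent from an unrelated thread would keep reporting 
liveness even when the rest of the process is stuck.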



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
