[ https://issues.apache.org/jira/browse/MESOS-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Downes updated MESOS-5376:
------------------------------
    Assignee: Lawrence Wu

> Add systemd watchdog support
> ----------------------------
>
>                 Key: MESOS-5376
>                 URL: https://issues.apache.org/jira/browse/MESOS-5376
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: David Robinson
>            Assignee: Lawrence Wu
>
> It would be great if Mesos had support for systemd's
> [watchdog|http://0pointer.de/blog/projects/watchdog.html]. Users
> typically use a supervisor like [monit|https://mmonit.com/monit/] to poll
> the agent's or master's /health endpoint and restart the service after
> consecutive failures. systemd does not poll services; instead, a service
> reports its liveness to systemd through the watchdog. Supervisor solutions
> like monit could be replaced with systemd if Mesos had watchdog support.
> Note that restarting the service on failure (i.e., when the process exits)
> is not sufficient: a deadlock within Mesos would not cause the process to
> exit, but a watchdog could detect it.
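
For illustration, the watchdog protocol requested above can be sketched as follows. This is a minimal sketch and not Mesos code. When `WatchdogSec=` is set in a unit file, systemd exports `NOTIFY_SOCKET` and `WATCHDOG_USEC` to the service, and the service is expected to send `WATCHDOG=1` datagrams to that socket more often than the timeout. The helper names (`watchdog_interval`, `notify`, `ping_loop`), the `is_healthy` callback, and the choice to ping at half the timeout are assumptions made for this example.

```python
import os
import socket
import time


def watchdog_interval(environ):
    """Return the ping interval in seconds, or None if no watchdog is configured."""
    usec = environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    # Assumption: ping at half the configured timeout to leave some margin.
    return int(usec) / 1_000_000 / 2


def notify(message, environ):
    """Send one datagram to the systemd notification socket; True if sent."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), path)
    return True


def ping_loop(is_healthy, environ, rounds, sleep=time.sleep):
    """Run `rounds` watchdog cycles and return how many pings were sent.

    `is_healthy` is a hypothetical callback; a real service would check that
    its main event loop is still making progress. If the process deadlocks,
    the pings stop and systemd treats the service as failed.
    """
    interval = watchdog_interval(environ)
    if interval is None:
        return 0
    sent = 0
    for _ in range(rounds):
        if is_healthy() and notify("WATCHDOG=1", environ):
            sent += 1
        sleep(interval)
    return sent
```

A service would call something like `ping_loop(check, os.environ, rounds)` from the thread whose liveness matters, so that a stall in that thread also stops the pings. On the systemd side, a unit using `WatchdogSec=` together with `Restart=on-watchdog` would then restart the service when the pings stop.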