Hi all, I'm steeping myself in the systemd docs as I prepare to give a
slew of services a facelift for Debian Jessie updates.

As I read, I'm starting to think through a number of new ways I could
reorganize some of our services, which is cool. Along the way, though, I
think I'm finding a few gaps in either my understanding or systemd's
capabilities, so I wanted to send a few questions to the list.
Hopefully this is the right place.

The first should hopefully be a bit of a softball:

With .service units one can specify OnFailure= and various restart
behaviors (Restart=, RestartSec=), including limits on how often to retry
(StartLimitInterval=, StartLimitBurst=) and what to do once those limits
are exceeded. Essentially a lightweight escalation procedure for service
problems.
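
To make that concrete, here is a minimal sketch of the per-unit setup I
mean. The unit names (myapp.service, escalate.service) and the binary path
are made up for illustration, and the directive names and placement assume
the systemd version shipped in Jessie:

```ini
# /etc/systemd/system/myapp.service (hypothetical unit)
[Unit]
Description=Example application
# Unit to start once this one enters the failed state
OnFailure=escalate.service

[Service]
# Hypothetical binary path
ExecStart=/usr/local/bin/myapp
# Restart on unclean exit, waiting 5 seconds between attempts
Restart=on-failure
RestartSec=5
# Allow at most 3 starts within 60 seconds; further start attempts are
# refused and the unit ends up failed.
# (Newer systemd releases spell the first one StartLimitIntervalSec= and
# expect both settings in [Unit].)
StartLimitInterval=60
StartLimitBurst=3
```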

However, in reading systemd-system.conf(5), I don't see any way to specify
something like a DefaultOnFailure= setting that says what to do on failure,
perhaps after some simple restart attempts, for all services. It seems this
can only be done on a per-unit basis, no?

Ideally, I'd like to be able to declare something very simple: if any
service fails to restart itself, or restarts too often and enters a hard
failure state, then systemd should (attempt to) fire off an escalation
unit that, say, sends a passive check status to Nagios or sends an email.

I accept that such procedures may depend on network connectivity, which
may or may not be available. So there may be some circular dependency
issues to work through in that scenario, but I presume systemd already has
facilities for handling that case, maybe via the OnFailureJobMode= setting.
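
As a rough sketch of the kind of escalation unit I have in mind (the
template name and the notify-failure script are hypothetical, and it still
has to be wired up per unit, since I haven't found a global equivalent):

```ini
# /etc/systemd/system/escalate@.service (hypothetical template unit)
[Unit]
Description=Escalate failure of %i

[Service]
Type=oneshot
# Hypothetical script that sends an email or submits a passive check
# result to Nagios; %i expands to the name of the failed unit.
ExecStart=/usr/local/bin/notify-failure %i

# Each monitored unit would then need this in its own [Unit] section,
# assuming specifiers are expanded in OnFailure= on the version in use:
#   OnFailure=escalate@%n.service
```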

Thoughts?

Thanks,
Brian
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
