On 14/11/16 11:44, David Sommerseth wrote:
> On 12/11/16 16:00, Gert Doering wrote:
>> Hi,
>>
>> On Fri, Nov 11, 2016 at 01:35:57PM +0100, David Sommerseth wrote:
>>> We can of course investigate if we should enable systemd to
>>> restart OpenVPN, at least the server profile, if it dies
>>> unexpectedly. Currently, I am not fully convinced we want
>>> that.
>>
>> I think this would be useful to have. Server processes are
>> expected to be there - if they are not, it needs to be
>> investigated why not, but with some reasonable delay, they should
>> be restarted.
>
> Let's take the easiest part first ... what is a reasonable delay?
> RestartSec= defines how long systemd should wait before restarting
> the service. Default is 100ms. Would 30 seconds, 1 minute, 5
> minutes, 15 minutes or 1 hour be a reasonable delay? Or another
> value? I honestly don't know.
>
>
> So to the slightly harder part ... under which conditions should
> systemd restart the server? The Restart= setting can have a few
> alternatives.
>
> - on-success? Only when openvpn exits with exit code 0. Probably
>   not.
>
> - on-failure? When openvpn exits with exit code != 0 or "unclean"
>   exit signals. [1] Probably.
>
> - on-abnormal? When openvpn is killed by a signal, a timeout event
>   or watchdog timeout event happens. [1] Perhaps.
>
> - on-abort? Only when an uncaught signal not considered as a
>   "clean exit" occurs. Perhaps.
>
> - on-watchdog? Only when a watchdog timeout occurs. Most likely
>   not.
>
> - always? Whenever OpenVPN stops running regardless of reason
>   why it stopped, it will be restarted. Most likely not what we
>   want.
>
> - no? This is the default, which we already have.
>
> And you can only choose one of these alternatives.
>
> In regards to those triggers including a watchdog, that needs the
> OpenVPN process to signal to the systemd daemon that it will
> provide a watchdog signal. So unless we add that in our OpenVPN
> code, watchdog is not relevant for us.
>
> The systemd.service man page [2] recommends:
>
>   Setting this to on-failure is the recommended choice for
>   long-running services, in order to increase reliability by
>   attempting automatic recovery from errors. For services that shall
>   be able to terminate on their own choice (and avoid immediate
>   restarting), on-abnormal is an alternative choice.
>
> [1] Also includes terminated by signal, including core dump but
>     excludes SIGHUP, SIGINT, SIGTERM and SIGPIPE (those four signals
>     are considered clean exits)
>
> [2]
> <https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=>
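
To make the quoted options concrete, here is a rough sketch of what the man page's recommended setting could look like as a drop-in. The drop-in path and the unit name are assumptions for illustration only, and the 30 second delay is just one of the candidate values listed above, not something that has been agreed on.

```ini
# Hypothetical drop-in file, for example:
#   /etc/systemd/system/openvpn-server@.service.d/restart.conf
# The unit name and path are assumptions, not taken from this thread.
[Service]
# Restart on unclean exit codes, unclean signals, timeouts and
# watchdog events; clean exits (exit code 0, SIGHUP, SIGINT, SIGTERM,
# SIGPIPE) do not trigger a restart.
Restart=on-failure
# Placeholder delay before restarting; the systemd default is 100ms.
RestartSec=30s
```

A drop-in like this would only take effect after `systemctl daemon-reload`.
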
Btw,

I forgot to mention that there are more tweaks possible, to define
successful and failing exit codes and signals in a more fine-grained
way. This is done through SuccessExitStatus= (what is considered a
successful exit), RestartPreventExitStatus= (do not restart in these
scenarios) and RestartForceExitStatus= (always restart when these
things happen). All of them take a list of exit codes and/or signals.

I think we should avoid tweaking them right now, unless we have a
clear exit strategy in our code which falls outside of the default,
where exit code 0, SIGHUP, SIGINT, SIGTERM and SIGPIPE are the
successful exits and all others are failed exits.

-- 
kind regards,

David Sommerseth
OpenVPN Technologies, Inc
_______________________________________________ Openvpn-devel mailing list Openvpn-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openvpn-devel