Hi, On 06-02-18 06:53, Selva Nair wrote: > On Mon, Feb 5, 2018 at 7:52 PM, Jonathan K. Bullard <jkbull...@gmail.com> > wrote: >> Hi, I'd like to reopen this patch -- it seems to have gotten lost. >> >> The patch is so old the line numbers are wrong but the code doesn't >> seem to have changed. >> >> I'm top-posting because this thread doesn't show up in the SourceForge >> archive[1] (???) so I've extracted the relevant parts below. There was >> some top-posting but nothing was really out-of-order, so I've tried to >> make the thread intelligible and include everything relevant below. >> >> On Tue, Apr 12, 2016 at 11:42 AM, Fish <fish.t...@gmail.com> wrote: >>> In `link_socket_init_phase2`, it used to be the case that any received >>> signal will be overwritten by SIGUSR1 if the socket cannot be created >>> (e.g. when DNS resolution fails), which consequently prevents OpenVPN from >>> responding to SIGTERM and exiting. This patch adds an additional check of >>> whether the received signal during DNS resolution is SIGTERM or not, and >>> prematurely exits from `link_socket_init_phase2` when receiving a SIGTERM. >>> --- >>> src/openvpn/socket.c | 5 +++++ >>> 1 file changed, 5 insertions(+) >>> >>> diff --git a/src/openvpn/socket.c b/src/openvpn/socket.c >>> index 9bcf4d4..5e8abe1 100644 >>> --- a/src/openvpn/socket.c >>> +++ b/src/openvpn/socket.c >>> @@ -1905,6 +1905,11 @@ link_socket_init_phase2 (struct link_socket *sock, >>> } >>> } >>> >>> + if (sig_info->signal_received == SIGTERM) >>> + { >>> + goto done; >>> + } >>> + >>> /* Socket still undefined, give a warning and abort connection */ >>> if (sock->sd == SOCKET_UNDEFINED) >>> { >> >> On Tue, Apr 12, 2016 at 11:46 AM, Gert Doering <g...@greenie.muc.de> wrote: >>> That is the "on DNS failure, the client loops forever and is not killable" >>> problem, right? >>> >>> (I'm not sure I'm reading the description right, to understand the >>> actual issue this is fixing - but if I'm reading it right, then this >>> makes sense :-) - what about SIGINT?) >> >> >> On Tue, Apr 12, 2016 at 11:48 AM, Fish Wang <fish.t...@gmail.com> wrote: >>> >>> Right, it's for the "on DNS failure, the client loops forever and is not >>> killable" problem. >> >> >> On Tue, Apr 12, 2016 at 12:25 PM, Jonathan K. Bullard >> <jkbull...@gmail.com> wrote: >>> Feature ACK; this should make DNS hangs much easier to deal with for >>> GUIs such as Tunnelblick. >>> >>> The "hang forever" problem can be avoided by using --resolve-retry. >>> >> >> On Tue, Apr 12, 2016 at 1:18 PM, Arne Schwabe <a...@rfc2549.org> wrote: >>> Am 12.04.16 um 17:42 schrieb Fish: >>>> In `link_socket_init_phase2`, it used to be the case that any received >>>> signal will be overwritten by SIGUSR1 if the socket cannot be created >>>> (e.g. when DNS resolution fails) >>> >>> That should not happen in the first place, doesn't register_signal keep >>> care of not overwriting higher priority signals? >> >> On Tue, Apr 12, 2016 at 10:07 PM, Fish Wang <fish.t...@gmail.com> wrote: >>> >>> No, the patch won't fix that issue. I may send out another patch later to >>> fix that problem if I have free time this week. >> >> On Tue, Apr 12, 2016 at 10:11 PM, Fish Wang <fish.t...@gmail.com> wrote: >>> Check out the code in socket.c [1]. This is where the received signal is >>> overwritten by SIGUSR1. >>> >>> /* Socket still undefined, give a warning and abort connection */ >>> if (sock->sd == SOCKET_UNDEFINED) >>> { >>> msg (M_WARN, "Could not determine IPv4/IPv6 protocol"); >>> sig_info->signal_received = SIGUSR1; >>> goto done; >>> } >>> [1] >>> https://github.com/OpenVPN/openvpn/blob/master/src/openvpn/socket.c#L1909 >> >> Note that theses lines are immediately below the patch. The last three >> lines of the patch (which are not changed in the patch) are the same >> as the first three lines shown above. > > This jogs my memory. > > We do have signal overwrite issues in more than a couple of places in > the code and signal handling after getaddrinfo failure does need > fixing. I think this is a good opportunity to discuss whether we > continue tip-toeing around this or take a closer look at signal handling > and make some bolder changes. > > When this patch was being discussed 2 years ago I had looked into our > signal handling code. A number of issues and some solutions were > identified but never finished for one reason or other. My stab at > improving signal handling is in this feature branch: > https://github.com/selvanair/openvpn/tree/signals-v5 > > I'd like to resurrect it but it would be useful to know whether there > is any interest in a revamping the signals code. > > For those who may not want look through code based on a 2 years > old branch, with a longish commit _not_ yet split into bite sized pieces, > here is a gist: > > 1. In many places in the code, signal_received is directly assigned to: > always use register_signal() and/or provide similar abstractions for > setting signals > > 2. Low priority signals can currently write over high priority ones > like SIGTERM: prioritize signals and to avoid losing signals. Doing > this in a fool-proof way is hard, especially when we have many signal > sources (hard, soft, management). But moving to POSIX signals would > help (see #3) > > 3. signal() follows BSD semantics on some systems, system V semantics > on others (including solaris, AIX). On Linux its more complex: signal() > syscall > in the kernel apparently follows system V behaviour but probably is > not widely used: instead, glibc (> version 2) provides its own > implementation with BSD or system V personality based on feature set > macros. > Its much better move to POSIX sigaction() for consistency, for > properly blocking signals in the handler, to mask signals etc. sigaction() is > supported by all unix-like OSes we support. > > 4. Windows "signals" (SetEvent, ctrl-C, ctrl-break) are not picked up > during pre-init phase or while in openvpn_sleep() unless management > interface is active. Note that on unix sleep() is interrupted by signals > but on windows ctrl-C doesnt interrupt sleep(). Reports like "impossible to > break out of dns resolution loop on Windows when run from command > line" is related to this. Not hard to fix. > > That said, getaddrinfo() will continue to block and cannot be > interrupted by signals -- it never returns EINTR but continues trying > until an internal timeout expires. This is known issue and I know of > no fixes. But we can make sure that signal received during > getaddrinfo() is properly acted upon when it returns. Currently we > can't guarantee even that. This is important as that call can block for a > long time (15 seconds?) and is a very likely place to receive user interrupt. > > Finally, I never completed this mainly because the changes are not > "perfect": its only an attempt to tidy-up the existing signal > infrastructure. The use of signals is deeply entrenched in openvpn > code-base and is used for multiple purposes, has multiple origins > (hard, soft, management). And the code is hard to follow. Only > consolation is that there are no multiple threads to worry about. > > I definitely do not know how to replace it with something clean. So > comments and suggestions would help.
This approach sounds like a good step forward. So yes, please share :) (Even if responses might be slow...) -Steffan ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Openvpn-devel mailing list Openvpn-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openvpn-devel