Re: bug: mworker unable to reload on USR2 since baf6ea4b
Hi Lukas, On Sun, Dec 24, 2017 at 05:52:20PM +0100, Lukas Tribus wrote: > Hello, > > > as per the report from sagaxu on discourse: > https://discourse.haproxy.org/t/listen-socket-closed-after-reloading-by-sigusr2/1925 > > It appears master-worker reload (USR2 to the master process) is > currently broken. > > When sending USR2 to the master process, all sockets are closed and > while a worker is forked successfully, it does not work as it will > never have any sockets connected to it. > > > The bug has been introduced in commit baf6ea4b (" BUG/MINOR: mworker: > detach from tty when in daemon mode") and affects 1.8.1 and newer > stable releases. > The discourse-OP also states that the socket is likely closed by the > triple flcose() introduced in that bisected commit. Ah bad :-( But interestingly, we were speaking about using CLOEXEC on listeners, I think this will make all of this much easier to deal with. > I'm not sufficiently familiar with the mworker and expected > forking/stdin/stdout/stderr behavior to be able to suggest a fix, but > I can confirm the issue is easily reproducible post-baf6ea4b. Thanks very much for the tests and analysis. It's probably time to drop your keyboard for this evening :-) > Best regards and merry xmas! thanks and to you as well. Willy
Re: bug: mworker unable to reload on USR2 since baf6ea4b
On Sun, Dec 24, 2017 at 07:36:54PM +0100, Willy Tarreau wrote: > > The bug has been introduced in commit baf6ea4b (" BUG/MINOR: mworker: > > detach from tty when in daemon mode") and affects 1.8.1 and newer > > stable releases. > > The discourse-OP also states that the socket is likely closed by the > > triple flcose() introduced in that bisected commit. > > Ah bad :-( But interestingly, we were speaking about using CLOEXEC on > listeners, I think this will make all of this much easier to deal with. OK I think I see what's happening. The fclose() is performed on FILE* and makes sense for this process as any further references to these files will know they are closed. But upon execve(), these FDs are unassigned, and a new set of FILE* will be assigned to existing FDs 0,1,2 which now map to listening sockets. By doing this fclose() after the reload, we in fact close the listening socket we've just created. We could very well imagine closing 0,1,2 immediately upon startup in the new process after we detect a reload (I think we can detect this). Another option I'd rather not consider would be not to close these files again upon reload, but that would mean that any warning/alert/debugging message risks to be sent to whatever FD lies there (possibly including the pidfile). We could also imagine opening /dev/null on 0,1,2 instead of just closing them but I suspect it would just hide the real problem. By the way, during my strace, I noticed that we do not check for the pidfd's validity prior to writing to it, leading to this when no pidfile is specified : 10:09:39.475829 write(4294967295, "3205\n", 5) = -1 EBADF (Bad file descriptor) So we still have some hardening work to do in this area. I guess we first want to clearly describe how we want each process to behave in each case (mw+debug, mw+quiet, mw+daemon, debug, quiet, daemon). This way we'll avoid pushing fixes which break other use cases ;-) Regards, Willy
Re: bug: mworker unable to reload on USR2 since baf6ea4b
Hi Lucas, William, I've made a patch which 'i think' fixes the issue with fclose called 'to often?'. Can you guys verify? I hope it helps.. but then again, maybe not the right way .?. (I really have to little experience with these kind of things..) Op 25-12-2017 om 10:25 schreef Willy Tarreau: On Sun, Dec 24, 2017 at 07:36:54PM +0100, Willy Tarreau wrote: The bug has been introduced in commit baf6ea4b (" BUG/MINOR: mworker: detach from tty when in daemon mode") and affects 1.8.1 and newer stable releases. The discourse-OP also states that the socket is likely closed by the triple flcose() introduced in that bisected commit. Ah bad :-( But interestingly, we were speaking about using CLOEXEC on listeners, I think this will make all of this much easier to deal with. I guess we first want to clearly describe how we want each process to behave in each case (mw+debug, mw+quiet, mw+daemon, debug, quiet, daemon). This way we'll avoid pushing fixes which break other use cases ;-) Regards, Willy And then below some 'ramblings' that also might not all make sense... :) I've made the assumption that when running in daemon mode there should never be any output to the stdin/out/err files when the masterworker is re-executing itself, and those 'files' are already closed as the master process doesn't really change.. or does it.?. Is this correct, or would it just introduce another bug.?. I'm not really sure how to properly check for used the fd's in child processes and what they are used for... As for writing out what is expected for each case. I think this would be it, including current results..: The short version would be like this: - warnings+errors debug - pollerinfo+warnings+errors+connections quiet - no output except errors daemon - no output after first startup master-worker - the same information as 'normal'? (also after a reload, including connections info??) Below are the above options in various combinations and if they produce expected result.. mw: warning+error (OK) after reload: warning+error (OK) mw+quiet(in config): only shows FIRST-startup errors (OK i guess, or should stdout not be closed as another re-start could be done and would desire error to be shown again..?.) after reload: (OK) (or should it 'keep open' the stdout so it can output config errors again ?) after reload without quiet config option: (FAIL?) mw+debug: polling info+warning+error (currently doesn't show debugging/headers info of connections).. (FAIL or intentional?) after reload: polling info+warning+error (currently doesn't show debugging/headers info of connections) (?) mw+daemon: shows first startup errors+warnings(OK) after reload: (OK) mw+daemon+quiet: shows first startup errors (OK) after reload: (OK) mw+daemon+debug: shows first warnings+errors (OK) after reload: (OK) debug: shows pollerinfo+error+warning+debugging/headers (OK) quiet: shows startup errors (OK) daemon: shows startup warnings+errors (OK) daemon+debug: shows startup warnings+errors (OK) daemon+quiet: shows first startup errors (OK) shows startup errors (OK) So removing 'quiet' from the config does something unwanted for MW. And another possible issue i found is that when MW is reloaded with USR2 and the config contains a error, after that is corrected and another SIG2 is send then the -x is not used when launching the new worker.. Seems to work OK afterwards though.. Is this intentional.? Anyhow, wish you all good Christmas day(s) and a happy new year. Regards, PiBa-NL / Pieter From 49272b2c0bafb413f0fb89717f17c784c2947a4f Mon Sep 17 00:00:00 2001 From: PiBa-NL Date: Mon, 25 Dec 2017 21:03:31 +0100 Subject: [PATCH] BUG/MEDIUM: mworker: avoid closing stdin/stdout/stderr file descriptors multiple times in the same master process This makes sure that a frontend socket that gets created after initialization wont be closed when the mworker gets re-executed. --- src/haproxy.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/src/haproxy.c b/src/haproxy.c index ffd7ea0..0666ad0 100644 --- a/src/haproxy.c +++ b/src/haproxy.c @@ -2579,10 +2579,19 @@ int main(int argc, char **argv) signal_register_fct(SIGTTIN, sig_listen, SIGTTIN); /* MODE_QUIET can inhibit alerts and warnings below this line */ - - if ((global.mode & MODE_QUIET) && !(global.mode & MODE_VERBOSE)) { - /* detach from the tty */ - fclose(stdin); fclose(stdout); fclose(stderr); + + if (getenv("HAPROXY_MWORKER_REEXEC") != NULL) { + // either stdin/out/err are already closed or should stay as they are. + if ((global.mode & MODE_DAEMON)) { + // daemon mode re-executing, stdin/stdout/stderr are already closed so keep quiet + global.mode &= ~MODE_VERBOSE; + g
Re: bug: mworker unable to reload on USR2 since baf6ea4b
Hello Pieter, On Tue, Dec 26, 2017 at 1:08 AM, PiBa-NL wrote: > Hi Lucas, William, > > I've made a patch which 'i think' fixes the issue with fclose called 'to > often?'. > Can you guys verify? I can confirm the patch fixes the issue reported; whether it does it "the correct way" - I don't know ... Maybe William can take a look? Thanks, Lukas
Re: bug: mworker unable to reload on USR2 since baf6ea4b
On Wed, Dec 27, 2017 at 10:21:25PM +0100, Lukas Tribus wrote: > Hello Pieter, > > > On Tue, Dec 26, 2017 at 1:08 AM, PiBa-NL wrote: > > Hi Lucas, William, > > > > I've made a patch which 'i think' fixes the issue with fclose called 'to > > often?'. > > Can you guys verify? > > I can confirm the patch fixes the issue reported; whether it does it > "the correct way" - I don't know ... > > Maybe William can take a look? > Hi Guys, The patch seems to fix the issue, but we don't want to reproduce the same problem again and again, it won't fix everything in my opinion. I wasn't enthusiastic about the idea of having the pipe of the master and the poller using the fd 0, 1 and 2; it's confusing during the debugging. And we don't want to be surprised by libraries trying to write on the FD 1 or 2 for debugging. (glibc or openssl for example) I think that's better to open /dev/null and dup2 with 0, 1, 2 so we won't have any surprise with an fprintf(stderr, ".. anywhere in the code. What's difficult with this, is that we need to display the error after the chroot for the setuid and setgid, and we don't have access to /dev/null after this, maybe we could open /dev/null in the master, do the chroot/setuid/setgid and then do the dup2 in the worker. -- William Lallemand
Re: bug: mworker unable to reload on USR2 since baf6ea4b
On Thu, Dec 28, 2017 at 12:54:27AM +0100, William Lallemand wrote: > I think that's better to open /dev/null and dup2 with 0, 1, 2 so we won't > have any > surprise with an fprintf(stderr, ".. anywhere in the code. > Hi, I made a patch which does exactly that, however I think we will do a cleanup of how the quiet & verbose mode work for 1.9. It's a little bit messy, we should never edit the mode.global flags outside of the configuration and command line parsing. Two patches attached, including Pieter's one which was reformated. Regards, -- William Lallemand >From 74af850251e890bb1656c233bb3fd503eaea13bf Mon Sep 17 00:00:00 2001 From: PiBa-NL Date: Mon, 25 Dec 2017 21:03:31 +0100 Subject: [PATCH 1/2] BUG/MEDIUM: mworker: don't close stdio several time This patch makes sure that a frontend socket that gets created after initialization won't be closed when the master gets re-executed. When used in daemon mode, the master-worker is closing the FDs 0, 1, 2 after the fork of the children. When the master was reloading, those FDs were assigned again during the parsing of the configuration (probably for some listeners), and the workers were closing them thinking it was the stdio. This patch must be backported to 1.8. --- src/haproxy.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/src/haproxy.c b/src/haproxy.c index ffd7ea05e..f3970f7b6 100644 --- a/src/haproxy.c +++ b/src/haproxy.c @@ -2580,9 +2580,18 @@ int main(int argc, char **argv) /* MODE_QUIET can inhibit alerts and warnings below this line */ - if ((global.mode & MODE_QUIET) && !(global.mode & MODE_VERBOSE)) { - /* detach from the tty */ - fclose(stdin); fclose(stdout); fclose(stderr); + if (getenv("HAPROXY_MWORKER_REEXEC") != NULL) { + // either stdin/out/err are already closed or should stay as they are. + if ((global.mode & MODE_DAEMON)) { + // daemon mode re-executing, stdin/stdout/stderr are already closed so keep quiet + global.mode &= ~MODE_VERBOSE; + global.mode |= MODE_QUIET; /* ensure that we won't say anything from now */ + } + } else { + if ((global.mode & MODE_QUIET) && !(global.mode & MODE_VERBOSE)) { + /* detach from the tty */ + fclose(stdin); fclose(stdout); fclose(stderr); + } } /* open log & pid files before the chroot */ -- 2.13.6 >From 7c84c87eac8bf48e62bc8ee79cf6cec3ce33996d Mon Sep 17 00:00:00 2001 From: William Lallemand Date: Thu, 28 Dec 2017 16:09:36 +0100 Subject: [PATCH 2/2] MINOR: don't close stdio anymore Closing the standard IO FDs (0,1,2) can be troublesome, especially in the case of the master-worker. Instead of closing those FDs, they are now pointing to /dev/null which prevents sending debugging messages to the wrong FDs. This patch could be backported in 1.8. --- src/haproxy.c | 43 --- 1 file changed, 40 insertions(+), 3 deletions(-) diff --git a/src/haproxy.c b/src/haproxy.c index f3970f7b6..2926af3ff 100644 --- a/src/haproxy.c +++ b/src/haproxy.c @@ -889,6 +889,32 @@ static void dump(struct sig_handler *sh) pool_gc(NULL); } +/* + * This function dup2 the stdio FDs (0,1,2) with , then closes + * If < 0, it opens /dev/null and use it to dup + * + * In the case of chrooting, you have to open /dev/null before the chroot, and + * pass the to this function + */ +static void stdio_quiet(int fd) +{ + + if (fd < 0) + fd = open("/dev/null", O_RDWR, 0); + + if (fd > -1) { + dup2(fd, 0); + dup2(fd, 1); + dup2(fd, 2); + close(fd); + return; + } + + ha_alert("Cannot open /dev/null\n"); + exit(EXIT_FAILURE); +} + + /* This function check if cfg_cfgfiles containes directories. * If it find one, it add all the files (and only files) it containes * in cfg_cfgfiles in place of the directory (and remove the directory). @@ -2590,7 +2616,7 @@ int main(int argc, char **argv) } else { if ((global.mode & MODE_QUIET) && !(global.mode & MODE_VERBOSE)) { /* detach from the tty */ - fclose(stdin); fclose(stdout); fclose(stderr); + stdio_quiet(-1); } } @@ -2683,6 +2709,7 @@ int main(int argc, char **argv) struct peers *curpeers; int ret = 0; int proc; + int devnullfd = -1; children = calloc(global.nbproc, sizeof(int)); /* @@ -2797,7 +2824,9 @@ int main(int argc, char **argv) if ((!(global.mode & MODE_QUIET) || (global.mode & MODE_VERBOSE)) && (global.mode & MODE_DAEMON)) { /* detach from the tty, this is required to properly daemonize. */ - fclose(stdin); fclose(stdout); fclose(stderr); + if ((getenv("HAPROXY_MWORKER_REEXEC") == NULL)) + stdio_quiet(-1); + global.mode &= ~MODE_VERBOSE; global.mode |= MODE_QUIET; /* ensure that we won't say anything from now */ setsid(); @@ -2816,6 +2845,14 @@ int main(int argc, char **argv) /* child must never use the atexit function */ atexit_flag = 0; + if (!(global.mode & MODE_QUIET) || (global.mode & MODE_VERBOSE)) { + devnullfd = open("/dev/null", O_RDWR, 0); +
Re: bug: mworker unable to reload on USR2 since baf6ea4b
Hi William, On Thu, Dec 28, 2017 at 09:17:28PM +0100, William Lallemand wrote: > On Thu, Dec 28, 2017 at 12:54:27AM +0100, William Lallemand wrote: > > I think that's better to open /dev/null and dup2 with 0, 1, 2 so we won't > > have any > > surprise with an fprintf(stderr, ".. anywhere in the code. > > > > Hi, > > I made a patch which does exactly that In fact it still doesn't fclose() the streams, which worries me a little bit for the long term, because eventhough any printf() will end up being written into /dev/null, it's still preferable to mark the FILE* as being closed and only then reopen 0,1,2 to ensure they cannot be abusively reused. It will also remove some confusion in strace by avoiding seeing some spurious fcntl() or write(2, foo, strlen(foo)) being sent there caused by alerts for example. What I would do would simply be to add the following line to your actual patch : > +/* > + * This function dup2 the stdio FDs (0,1,2) with , then closes > + * If < 0, it opens /dev/null and use it to dup > + * > + * In the case of chrooting, you have to open /dev/null before the chroot, > and > + * pass the to this function > + */ > +static void stdio_quiet(int fd) > +{ > + > + if (fd < 0) > + fd = open("/dev/null", O_RDWR, 0); > + > + if (fd > -1) { + fclose(stdin); fclose(stdout); fclose(stderr); > + dup2(fd, 0); > + dup2(fd, 1); > + dup2(fd, 2); > + close(fd); > + return; > + } > + > + ha_alert("Cannot open /dev/null\n"); > + exit(EXIT_FAILURE); > +} Otherwise I'm reasonably confident that this should be enough to close all pending issues related to the master-worker now. Willy
Re: bug: mworker unable to reload on USR2 since baf6ea4b
On Fri, Dec 29, 2017 at 10:46:40AM +0100, Willy Tarreau wrote: > Hi William, > > In fact it still doesn't fclose() the streams, which worries me a little bit > for the long term, because eventhough any printf() will end up being written > into /dev/null, it's still preferable to mark the FILE* as being closed and > only then reopen 0,1,2 to ensure they cannot be abusively reused. It will > also remove some confusion in strace by avoiding seeing some spurious fcntl() > or write(2, foo, strlen(foo)) being sent there caused by alerts for example. > [...] > Otherwise I'm reasonably confident that this should be enough to close > all pending issues related to the master-worker now. > > Willy > I agree, it's better to merge them with fclose() -- William Lallemand
Re: bug: mworker unable to reload on USR2 since baf6ea4b
Hi William, Willy, Op 29-12-2017 om 11:35 schreef William Lallemand: On Fri, Dec 29, 2017 at 10:46:40AM +0100, Willy Tarreau wrote: Hi William, In fact it still doesn't fclose() the streams, which worries me a little bit for the long term, because eventhough any printf() will end up being written into /dev/null, it's still preferable to mark the FILE* as being closed and only then reopen 0,1,2 to ensure they cannot be abusively reused. It will also remove some confusion in strace by avoiding seeing some spurious fcntl() or write(2, foo, strlen(foo)) being sent there caused by alerts for example. [...] Otherwise I'm reasonably confident that this should be enough to close all pending issues related to the master-worker now. Willy I agree, it's better to merge them with fclose() But shouldn't be needed, as i read dup2 will close them? "int dup2(int oldfd, int newfd);" "closing/newfd/first if necessary" https://linux.die.net/man/2/dup2 Regards, PiBa-NL / Pieter
Re: bug: mworker unable to reload on USR2 since baf6ea4b
Hi Pieter, On Fri, Dec 29, 2017 at 04:19:21PM +0100, PiBa-NL wrote: > Op 29-12-2017 om 11:35 schreef William Lallemand: > > On Fri, Dec 29, 2017 at 10:46:40AM +0100, Willy Tarreau wrote: > > > Hi William, > > > > > > In fact it still doesn't fclose() the streams, which worries me a little > > > bit > > > for the long term, because eventhough any printf() will end up being > > > written > > > into /dev/null, it's still preferable to mark the FILE* as being closed > > > and > > > only then reopen 0,1,2 to ensure they cannot be abusively reused. It will > > > also remove some confusion in strace by avoiding seeing some spurious > > > fcntl() > > > or write(2, foo, strlen(foo)) being sent there caused by alerts for > > > example. > > > [...] > > > Otherwise I'm reasonably confident that this should be enough to close > > > all pending issues related to the master-worker now. > > > > > > Willy > > > > > I agree, it's better to merge them with fclose() > > > But shouldn't be needed, as i read dup2 will close them? No, here's the catch : dup2() works on file descriptors, while stdin, stdout and stderr are FILE*, which are in fact a wrapper provided by glibc to present a stream on top of a file descriptor. So in short, your FILE struct looks more or less like this : struct FILE { int fd; char buffer[]; other stuff; } When you call fprintf(stderr, "foo"), what it does is to write "foo" to a temporary buffer, then try to write the largest possible part of this buffer's contents to fileno(stderr) which reports the fd number using the write() syscall. If you just did a close() on fd #2, then an open on /dev/null, fd #2 was effectively closed, but replaced by the new one mapped to /dev/null. Then the next subsequent fprintf(stderr, "blah") will do the same as above, and write(2, buffer, length). So the FILE struct is still not closed by doing this. This is a common misconception that is commonly encountered that close(0..2) is enough to cleanly close the I/O streams, but it's not. Performing an fclose() instead will close both the FILE and the fd. Then we can reassign the fd to /dev/null be sure it's never assigned to anything sensitive so that upon next reload it's totally safe. Regards, Willy
Re: bug: mworker unable to reload on USR2 since baf6ea4b
On Fri, Dec 29, 2017 at 11:35:10AM +0100, William Lallemand wrote: > On Fri, Dec 29, 2017 at 10:46:40AM +0100, Willy Tarreau wrote: > > Hi William, > > > > In fact it still doesn't fclose() the streams, which worries me a little bit > > for the long term, because eventhough any printf() will end up being written > > into /dev/null, it's still preferable to mark the FILE* as being closed and > > only then reopen 0,1,2 to ensure they cannot be abusively reused. It will > > also remove some confusion in strace by avoiding seeing some spurious > > fcntl() > > or write(2, foo, strlen(foo)) being sent there caused by alerts for example. > > [...] > > Otherwise I'm reasonably confident that this should be enough to close > > all pending issues related to the master-worker now. > > > > Willy > > > > I agree, it's better to merge them with fclose() OK just done that now. I also added a test for fd > 2 before closing it, otherwise if you're unlucky and fd #1 is assignd to your new fd for example, you end up closing it again just after having duplicated it :-) Willy