Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-24 Thread Willy Tarreau
Hi Lukas,

On Sun, Dec 24, 2017 at 05:52:20PM +0100, Lukas Tribus wrote:
> Hello,
> 
> 
> as per the report from sagaxu on discourse:
> https://discourse.haproxy.org/t/listen-socket-closed-after-reloading-by-sigusr2/1925
> 
> It appears master-worker reload (USR2 to the master process) is
> currently broken.
> 
> When sending USR2 to the master process, all sockets are closed and
> while a worker is forked successfully, it does not work as it will
> never have any sockets connected to it.
> 
> 
> The bug has been introduced in commit baf6ea4b (" BUG/MINOR: mworker:
> detach from tty when in daemon mode") and affects 1.8.1 and newer
> stable releases.
> The discourse-OP also states that the socket is likely closed by the
> triple flcose() introduced in that bisected commit.

Ah bad :-( But interestingly, we were speaking about using CLOEXEC on
listeners, I think this will make all of this much easier to deal with.

> I'm not sufficiently familiar with the mworker and expected
> forking/stdin/stdout/stderr behavior to be able to suggest a fix, but
> I can confirm the issue is easily reproducible post-baf6ea4b.

Thanks very much for the tests and analysis. It's probably time to drop
your keyboard for this evening :-)

> Best regards and merry xmas!

thanks and to you as well.
Willy



Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-25 Thread Willy Tarreau
On Sun, Dec 24, 2017 at 07:36:54PM +0100, Willy Tarreau wrote:
> > The bug has been introduced in commit baf6ea4b (" BUG/MINOR: mworker:
> > detach from tty when in daemon mode") and affects 1.8.1 and newer
> > stable releases.
> > The discourse-OP also states that the socket is likely closed by the
> > triple flcose() introduced in that bisected commit.
> 
> Ah bad :-( But interestingly, we were speaking about using CLOEXEC on
> listeners, I think this will make all of this much easier to deal with.

OK I think I see what's happening. The fclose() is performed on FILE*
and makes sense for this process as any further references to these
files will know they are closed. But upon execve(), these FDs are
unassigned, and a new set of FILE* will be assigned to existing FDs
0,1,2 which now map to listening sockets. By doing this fclose() after
the reload, we in fact close the listening socket we've just created.

We could very well imagine closing 0,1,2 immediately upon startup in
the new process after we detect a reload (I think we can detect this).
Another option I'd rather not consider would be not to close these files
again upon reload, but that would mean that any warning/alert/debugging
message risks to be sent to whatever FD lies there (possibly including
the pidfile). We could also imagine opening /dev/null on 0,1,2 instead
of just closing them but I suspect it would just hide the real problem.

By the way, during my strace, I noticed that we do not check for the
pidfd's validity prior to writing to it, leading to this when no pidfile
is specified :

  10:09:39.475829 write(4294967295, "3205\n", 5) = -1 EBADF (Bad file 
descriptor)

So we still have some hardening work to do in this area. I guess we
first want to clearly describe how we want each process to behave in
each case (mw+debug, mw+quiet, mw+daemon, debug, quiet, daemon). This
way we'll avoid pushing fixes which break other use cases ;-)

Regards,
Willy



Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-25 Thread PiBa-NL

Hi Lucas, William,

I've made a patch which 'i think' fixes the issue with fclose called 'to 
often?'.

Can you guys verify?

I hope it helps.. but then again, maybe not the right way .?.
(I really have to little experience with these kind of things..)


Op 25-12-2017 om 10:25 schreef Willy Tarreau:

On Sun, Dec 24, 2017 at 07:36:54PM +0100, Willy Tarreau wrote:

The bug has been introduced in commit baf6ea4b (" BUG/MINOR: mworker:
detach from tty when in daemon mode") and affects 1.8.1 and newer
stable releases.
The discourse-OP also states that the socket is likely closed by the
triple flcose() introduced in that bisected commit.

Ah bad :-( But interestingly, we were speaking about using CLOEXEC on
listeners, I think this will make all of this much easier to deal with.

I guess we
first want to clearly describe how we want each process to behave in
each case (mw+debug, mw+quiet, mw+daemon, debug, quiet, daemon). This
way we'll avoid pushing fixes which break other use cases ;-)

Regards,
Willy



And then below some 'ramblings' that also might not all make sense... :)

I've made the assumption that when running in daemon mode there should 
never be any output to the stdin/out/err files when the masterworker is 
re-executing itself, and those 'files' are already closed as the master 
process doesn't really change.. or does it.?.
Is this correct, or would it just introduce another bug.?. I'm not 
really sure how to properly check for used the fd's in child processes 
and what they are used for...


As for writing out what is expected for each case. I think this would be 
it, including current results..:


The short version would be like this:
     - warnings+errors
    debug - pollerinfo+warnings+errors+connections
    quiet - no output except errors
    daemon - no output after first startup
    master-worker - the same information as 'normal'? (also after a 
reload, including connections info??)



Below are the above options in various combinations and if they produce 
expected result..

mw:
    warning+error (OK)
after reload:
    warning+error (OK)

mw+quiet(in config):
 only shows FIRST-startup errors (OK i guess, or should stdout not 
be closed as another re-start could be done and would desire error to be 
shown again..?.)

after reload:
     (OK) (or should it 'keep open' the stdout so it can 
output config errors again ?)

after reload without quiet config option:
     (FAIL?)

mw+debug:
    polling info+warning+error  (currently doesn't show 
debugging/headers info of connections).. (FAIL or intentional?)

after reload:
    polling info+warning+error  (currently doesn't show 
debugging/headers info of connections) (?)


mw+daemon:
 shows first startup errors+warnings(OK)
after reload:
     (OK)

mw+daemon+quiet:
 shows first startup errors (OK)
after reload:
     (OK)

mw+daemon+debug:
 shows first warnings+errors (OK)
after reload:
     (OK)

debug:
    shows pollerinfo+error+warning+debugging/headers (OK)

quiet:
    shows startup errors (OK)

daemon:
    shows startup warnings+errors (OK)

daemon+debug:
    shows startup warnings+errors (OK)

daemon+quiet: shows first startup errors (OK)
    shows startup errors (OK)


So removing 'quiet' from the config does something unwanted for MW.
And another possible issue i found is that when MW is reloaded with USR2 
and the config contains a error, after that is corrected and another 
SIG2 is send then the -x is not used when launching the new worker.. 
Seems to work OK afterwards though.. Is this intentional.?



Anyhow, wish you all good Christmas day(s) and a happy new year.
Regards,
PiBa-NL / Pieter

From 49272b2c0bafb413f0fb89717f17c784c2947a4f Mon Sep 17 00:00:00 2001
From: PiBa-NL 
Date: Mon, 25 Dec 2017 21:03:31 +0100
Subject: [PATCH] BUG/MEDIUM: mworker: avoid closing stdin/stdout/stderr file
 descriptors multiple times in the same master process

This makes sure that a frontend socket that gets created after initialization 
wont be closed when the mworker gets re-executed.
---
 src/haproxy.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index ffd7ea0..0666ad0 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2579,10 +2579,19 @@ int main(int argc, char **argv)
signal_register_fct(SIGTTIN, sig_listen, SIGTTIN);
 
/* MODE_QUIET can inhibit alerts and warnings below this line */
-
-   if ((global.mode & MODE_QUIET) && !(global.mode & MODE_VERBOSE)) {
-   /* detach from the tty */
-   fclose(stdin); fclose(stdout); fclose(stderr);
+   
+   if (getenv("HAPROXY_MWORKER_REEXEC") != NULL) {
+   // either stdin/out/err are already closed or should stay as 
they are.
+   if ((global.mode & MODE_DAEMON)) {
+   // daemon mode re-executing, stdin/stdout/stderr are 
already closed so keep quiet
+   global.mode &= ~MODE_VERBOSE;
+   g

Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-27 Thread Lukas Tribus
Hello Pieter,


On Tue, Dec 26, 2017 at 1:08 AM, PiBa-NL  wrote:
> Hi Lucas, William,
>
> I've made a patch which 'i think' fixes the issue with fclose called 'to
> often?'.
> Can you guys verify?

I can confirm the patch fixes the issue reported; whether it does it
"the correct way" - I don't know ...

Maybe William can take a look?



Thanks,
Lukas



Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-27 Thread William Lallemand
On Wed, Dec 27, 2017 at 10:21:25PM +0100, Lukas Tribus wrote:
> Hello Pieter,
> 
> 
> On Tue, Dec 26, 2017 at 1:08 AM, PiBa-NL  wrote:
> > Hi Lucas, William,
> >
> > I've made a patch which 'i think' fixes the issue with fclose called 'to
> > often?'.
> > Can you guys verify?
> 
> I can confirm the patch fixes the issue reported; whether it does it
> "the correct way" - I don't know ...
> 
> Maybe William can take a look?
> 

Hi Guys,

The patch seems to fix the issue, but we don't want to reproduce the same
problem again and again, it won't fix everything in my opinion.

I wasn't enthusiastic about the idea of having the pipe of the master and the
poller using the fd 0, 1 and 2; it's confusing during the debugging.
And we don't want to be surprised by libraries trying to write on the FD 1 or 2
for debugging. (glibc or openssl for example)

I think that's better to open /dev/null and dup2 with 0, 1, 2 so we won't have 
any
surprise with an fprintf(stderr, "..  anywhere in the code.

What's difficult with this, is that we need to display the error after the
chroot for the setuid and setgid, and we don't have access to /dev/null after
this, maybe we could open /dev/null in the master, do the chroot/setuid/setgid
and then do the dup2 in the worker.

-- 
William Lallemand



Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-28 Thread William Lallemand
On Thu, Dec 28, 2017 at 12:54:27AM +0100, William Lallemand wrote:
> I think that's better to open /dev/null and dup2 with 0, 1, 2 so we won't 
> have any
> surprise with an fprintf(stderr, "..  anywhere in the code.
> 

Hi,

I made a patch which does exactly that, however I think we will do a cleanup of
how the quiet & verbose mode work for 1.9. It's a little bit messy, we should
never edit the mode.global flags outside of the configuration and command line
parsing.

Two patches attached, including Pieter's one which was reformated.

Regards,

-- 
William Lallemand
>From 74af850251e890bb1656c233bb3fd503eaea13bf Mon Sep 17 00:00:00 2001
From: PiBa-NL 
Date: Mon, 25 Dec 2017 21:03:31 +0100
Subject: [PATCH 1/2] BUG/MEDIUM: mworker: don't close stdio several time

This patch makes sure that a frontend socket that gets created after
initialization won't be closed when the master gets re-executed.

When used in daemon mode, the master-worker is closing the FDs 0, 1, 2
after the fork of the children.

When the master was reloading, those FDs were assigned again during the
parsing of the configuration (probably for some listeners), and the
workers were closing them thinking it was the stdio.

This patch must be backported to 1.8.
---
 src/haproxy.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index ffd7ea05e..f3970f7b6 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2580,9 +2580,18 @@ int main(int argc, char **argv)
 
 	/* MODE_QUIET can inhibit alerts and warnings below this line */
 
-	if ((global.mode & MODE_QUIET) && !(global.mode & MODE_VERBOSE)) {
-		/* detach from the tty */
-		fclose(stdin); fclose(stdout); fclose(stderr);
+	if (getenv("HAPROXY_MWORKER_REEXEC") != NULL) {
+		// either stdin/out/err are already closed or should stay as they are.
+		if ((global.mode & MODE_DAEMON)) {
+			// daemon mode re-executing, stdin/stdout/stderr are already closed so keep quiet
+			global.mode &= ~MODE_VERBOSE;
+			global.mode |= MODE_QUIET; /* ensure that we won't say anything from now */
+		}
+	} else {
+		if ((global.mode & MODE_QUIET) && !(global.mode & MODE_VERBOSE)) {
+			/* detach from the tty */
+			fclose(stdin); fclose(stdout); fclose(stderr);
+		}
 	}
 
 	/* open log & pid files before the chroot */
-- 
2.13.6

>From 7c84c87eac8bf48e62bc8ee79cf6cec3ce33996d Mon Sep 17 00:00:00 2001
From: William Lallemand 
Date: Thu, 28 Dec 2017 16:09:36 +0100
Subject: [PATCH 2/2] MINOR: don't close stdio anymore

Closing the standard IO FDs (0,1,2) can be troublesome, especially in
the case of the master-worker.

Instead of closing those FDs, they are now pointing to /dev/null which
prevents sending debugging messages to the wrong FDs.

This patch could be backported in 1.8.
---
 src/haproxy.c | 43 ---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index f3970f7b6..2926af3ff 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -889,6 +889,32 @@ static void dump(struct sig_handler *sh)
 	pool_gc(NULL);
 }
 
+/*
+ *  This function dup2 the stdio FDs (0,1,2) with , then closes 
+ *  If  < 0, it opens /dev/null and use it to dup
+ *
+ *  In the case of chrooting, you have to open /dev/null before the chroot, and
+ *  pass the  to this function
+ */
+static void stdio_quiet(int fd)
+{
+
+	if (fd < 0)
+		fd = open("/dev/null", O_RDWR, 0);
+
+	if (fd > -1) {
+		dup2(fd, 0);
+		dup2(fd, 1);
+		dup2(fd, 2);
+		close(fd);
+		return;
+	}
+
+	ha_alert("Cannot open /dev/null\n");
+	exit(EXIT_FAILURE);
+}
+
+
 /* This function check if cfg_cfgfiles containes directories.
  * If it find one, it add all the files (and only files) it containes
  * in cfg_cfgfiles in place of the directory (and remove the directory).
@@ -2590,7 +2616,7 @@ int main(int argc, char **argv)
 	} else {
 		if ((global.mode & MODE_QUIET) && !(global.mode & MODE_VERBOSE)) {
 			/* detach from the tty */
-			fclose(stdin); fclose(stdout); fclose(stderr);
+			stdio_quiet(-1);
 		}
 	}
 
@@ -2683,6 +2709,7 @@ int main(int argc, char **argv)
 		struct peers *curpeers;
 		int ret = 0;
 		int proc;
+		int devnullfd = -1;
 
 		children = calloc(global.nbproc, sizeof(int));
 		/*
@@ -2797,7 +2824,9 @@ int main(int argc, char **argv)
 if ((!(global.mode & MODE_QUIET) || (global.mode & MODE_VERBOSE)) &&
 	(global.mode & MODE_DAEMON)) {
 	/* detach from the tty, this is required to properly daemonize. */
-	fclose(stdin); fclose(stdout); fclose(stderr);
+	if ((getenv("HAPROXY_MWORKER_REEXEC") == NULL))
+		stdio_quiet(-1);
+
 	global.mode &= ~MODE_VERBOSE;
 	global.mode |= MODE_QUIET; /* ensure that we won't say anything from now */
 	setsid();
@@ -2816,6 +2845,14 @@ int main(int argc, char **argv)
 		/* child must never use the atexit function */
 		atexit_flag = 0;
 
+		if (!(global.mode & MODE_QUIET) || (global.mode & MODE_VERBOSE)) {
+			devnullfd = open("/dev/null", O_RDWR, 0);
+			

Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-29 Thread Willy Tarreau
Hi William,

On Thu, Dec 28, 2017 at 09:17:28PM +0100, William Lallemand wrote:
> On Thu, Dec 28, 2017 at 12:54:27AM +0100, William Lallemand wrote:
> > I think that's better to open /dev/null and dup2 with 0, 1, 2 so we won't 
> > have any
> > surprise with an fprintf(stderr, "..  anywhere in the code.
> > 
> 
> Hi,
> 
> I made a patch which does exactly that

In fact it still doesn't fclose() the streams, which worries me a little bit
for the long term, because eventhough any printf() will end up being written
into /dev/null, it's still preferable to mark the FILE* as being closed and
only then reopen 0,1,2 to ensure they cannot be abusively reused. It will
also remove some confusion in strace by avoiding seeing some spurious fcntl()
or write(2, foo, strlen(foo)) being sent there caused by alerts for example.

What I would do would simply be to add the following line to your actual
patch :

> +/*
> + *  This function dup2 the stdio FDs (0,1,2) with , then closes 
> + *  If  < 0, it opens /dev/null and use it to dup
> + *
> + *  In the case of chrooting, you have to open /dev/null before the chroot, 
> and
> + *  pass the  to this function
> + */
> +static void stdio_quiet(int fd)
> +{
> +
> + if (fd < 0)
> + fd = open("/dev/null", O_RDWR, 0);
> +
> + if (fd > -1) {

+   fclose(stdin); fclose(stdout); fclose(stderr);

> + dup2(fd, 0);
> + dup2(fd, 1);
> + dup2(fd, 2);
> + close(fd);
> + return;
> + }
> +
> + ha_alert("Cannot open /dev/null\n");
> + exit(EXIT_FAILURE);
> +}

Otherwise I'm reasonably confident that this should be enough to close
all pending issues related to the master-worker now.

Willy



Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-29 Thread William Lallemand
On Fri, Dec 29, 2017 at 10:46:40AM +0100, Willy Tarreau wrote:
> Hi William,
> 
> In fact it still doesn't fclose() the streams, which worries me a little bit
> for the long term, because eventhough any printf() will end up being written
> into /dev/null, it's still preferable to mark the FILE* as being closed and
> only then reopen 0,1,2 to ensure they cannot be abusively reused. It will
> also remove some confusion in strace by avoiding seeing some spurious fcntl()
> or write(2, foo, strlen(foo)) being sent there caused by alerts for example.
> [...]
> Otherwise I'm reasonably confident that this should be enough to close
> all pending issues related to the master-worker now.
> 
> Willy
> 

I agree, it's better to merge them with fclose()

-- 
William Lallemand



Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-29 Thread PiBa-NL

Hi William, Willy,

Op 29-12-2017 om 11:35 schreef William Lallemand:

On Fri, Dec 29, 2017 at 10:46:40AM +0100, Willy Tarreau wrote:

Hi William,

In fact it still doesn't fclose() the streams, which worries me a little bit
for the long term, because eventhough any printf() will end up being written
into /dev/null, it's still preferable to mark the FILE* as being closed and
only then reopen 0,1,2 to ensure they cannot be abusively reused. It will
also remove some confusion in strace by avoiding seeing some spurious fcntl()
or write(2, foo, strlen(foo)) being sent there caused by alerts for example.
[...]
Otherwise I'm reasonably confident that this should be enough to close
all pending issues related to the master-worker now.

Willy


I agree, it's better to merge them with fclose()


But shouldn't be needed, as i read dup2 will close them?

"int dup2(int oldfd, int newfd);" "closing/newfd/first if necessary" https://linux.die.net/man/2/dup2 Regards, 
PiBa-NL / Pieter




Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-29 Thread Willy Tarreau
Hi Pieter,

On Fri, Dec 29, 2017 at 04:19:21PM +0100, PiBa-NL wrote:
> Op 29-12-2017 om 11:35 schreef William Lallemand:
> > On Fri, Dec 29, 2017 at 10:46:40AM +0100, Willy Tarreau wrote:
> > > Hi William,
> > > 
> > > In fact it still doesn't fclose() the streams, which worries me a little 
> > > bit
> > > for the long term, because eventhough any printf() will end up being 
> > > written
> > > into /dev/null, it's still preferable to mark the FILE* as being closed 
> > > and
> > > only then reopen 0,1,2 to ensure they cannot be abusively reused. It will
> > > also remove some confusion in strace by avoiding seeing some spurious 
> > > fcntl()
> > > or write(2, foo, strlen(foo)) being sent there caused by alerts for 
> > > example.
> > > [...]
> > > Otherwise I'm reasonably confident that this should be enough to close
> > > all pending issues related to the master-worker now.
> > > 
> > > Willy
> > > 
> > I agree, it's better to merge them with fclose()
> > 
> But shouldn't be needed, as i read dup2 will close them?

No, here's the catch : dup2() works on file descriptors, while stdin,
stdout and stderr are FILE*, which are in fact a wrapper provided by
glibc to present a stream on top of a file descriptor. So in short,
your FILE struct looks more or less like this :

  struct FILE {
 int fd;
 char buffer[];
 other stuff;
  }

When you call fprintf(stderr, "foo"), what it does is to write "foo" to
a temporary buffer, then try to write the largest possible part of this
buffer's contents to fileno(stderr) which reports the fd number using
the write() syscall. If you just did a close() on fd #2, then an open
on /dev/null, fd #2 was effectively closed, but replaced by the new one
mapped to /dev/null. Then the next subsequent fprintf(stderr, "blah")
will do the same as above, and write(2, buffer, length). So the FILE
struct is still not closed by doing this. This is a common misconception
that is commonly encountered that close(0..2) is enough to cleanly close
the I/O streams, but it's not.

Performing an fclose() instead will close both the FILE and the fd. Then
we can reassign the fd to /dev/null be sure it's never assigned to anything
sensitive so that upon next reload it's totally safe.

Regards,
Willy



Re: bug: mworker unable to reload on USR2 since baf6ea4b

2017-12-29 Thread Willy Tarreau
On Fri, Dec 29, 2017 at 11:35:10AM +0100, William Lallemand wrote:
> On Fri, Dec 29, 2017 at 10:46:40AM +0100, Willy Tarreau wrote:
> > Hi William,
> > 
> > In fact it still doesn't fclose() the streams, which worries me a little bit
> > for the long term, because eventhough any printf() will end up being written
> > into /dev/null, it's still preferable to mark the FILE* as being closed and
> > only then reopen 0,1,2 to ensure they cannot be abusively reused. It will
> > also remove some confusion in strace by avoiding seeing some spurious 
> > fcntl()
> > or write(2, foo, strlen(foo)) being sent there caused by alerts for example.
> > [...]
> > Otherwise I'm reasonably confident that this should be enough to close
> > all pending issues related to the master-worker now.
> > 
> > Willy
> > 
> 
> I agree, it's better to merge them with fclose()

OK just done that now. I also added a test for fd > 2 before closing it,
otherwise if you're unlucky and fd #1 is assignd to your new fd for
example, you end up closing it again just after having duplicated it
:-)

Willy