Re: [PATCH 6/9] MEDIUM: mworker: workers exit when the master leaves
On Tue, May 30, 2017 at 12:39:32PM +0200, Willy Tarreau wrote: > [...] > > The master, not intercepting this signal, would die, closing the pipe. > The worker would be woken up on the detection of this closure, and while > trying to perform the read() would get the signal in turn, causing the > read() to return EINTR and to stop polling on this fd instead of exiting. > You are absolutely right, I'll fix that. > > Regarding the environment variable names, it's preferable to prepend > "HAPROXY" in front of them as we've been doing for all other ones to > avoid namespace conflicts with anything used in other environments (I've > ween places where you had to run "env|grep" to find your variable). I've > seen another one called "WAIT_ONLY" in one of the first patches and > which should equally be renamed. > Will do. > Otherwise the series looks quite good, it would be nice to get some > feedback especially from systemd hostages^Wusers (my comments above > will not affect their experience unless they're really unlucky). > > Thanks! > Willy > I'll send you another batch with the fixes, and maybe some cleanup. In the meanwhile people can still try them. -- William Lallemand
Re: [PATCH 6/9] MEDIUM: mworker: workers exit when the master leaves
So all the series looks quite good, and I must confess I'm impatient to merge it so that we turn the page of the wrapper, and also because being able to use nbproc in foreground during development can be nice. But I have two comments first : On Mon, May 29, 2017 at 05:42:09PM +0200, William Lallemand wrote: > +void mworker_pipe_handler(int fd) > +{ > + char c; > + > + if (read(fd, , 1) > -1) { > + fd_delete(fd); > + deinit(); > + exit(EXIT_FAILURE); > + } else { > + /* should never happened */ > + fd_delete(fd); > + } > + > + return; > +} The test above is dangerous, it assumes that there's zero valid reason for read() to return -1 but in fact there are eventhough they are corner cases. First, a good rule of thumb to consider is that a poller is not rocket science and that it always relies on the callee to check for the reality of event readiness. We have one well-known example in Linux in association with UDP datagram reception. It's possible for poll() to say "there's something to read" and when read()/recv() try to read, the packet checksum is computed on the fly during the user_copy(), found to be bad, the packet is destroyed and read() returns EAGAIN. Of course it's not what you have above, but it's to say that whenever you have a poller, EAGAIN must be tested to be completely safe. The second (more likely) case is a signal causing read() to return -1 EINTR. And in fact in the master worker, it could theorically happen like this : killall -SOMESIG haproxy The master, not intercepting this signal, would die, closing the pipe. The worker would be woken up on the detection of this closure, and while trying to perform the read() would get the signal in turn, causing the read() to return EINTR and to stop polling on this fd instead of exiting. A much safer way to deal with this situation is to loop on EINTR and return without doing anything on EAGAIN, approximately like this : while (read(fd) == -1) { if (errno == EINTR) continue; if (errno == EAGAIN) { fd_cant_recv(fd); return; } /* otherwise probably die ? */ break; } deinit(); exit(); > @@ -2478,6 +2508,28 @@ int main(int argc, char **argv) > exit(0); > } > > + if (global.mode & MODE_MWORKER) { > + if ((getenv(REEXEC_FLAG) == NULL)) { > + char *msg = NULL; > + /* master pipe to ensure the master is still > alive */ > + ret = pipe(mworker_pipe); > + if (ret < 0) { > + Warning("[%s.main()] Cannot create > master pipe.\n", argv[0]); > + } else { > + memprintf(, "%d", mworker_pipe[0]); > + setenv("MWORKER_PIPE_RD", msg, 1); > + memprintf(, "%d", mworker_pipe[1]); > + setenv("MWORKER_PIPE_WR", msg, 1); Regarding the environment variable names, it's preferable to prepend "HAPROXY" in front of them as we've been doing for all other ones to avoid namespace conflicts with anything used in other environments (I've ween places where you had to run "env|grep" to find your variable). I've seen another one called "WAIT_ONLY" in one of the first patches and which should equally be renamed. Otherwise the series looks quite good, it would be nice to get some feedback especially from systemd hostages^Wusers (my comments above will not affect their experience unless they're really unlucky). Thanks! Willy
[PATCH 6/9] MEDIUM: mworker: workers exit when the master leaves
This patch ensure that the children will exit when the master quits, even if the master didn't send any signal. The master and the workers are connected through a pipe, when the pipe closes the children leave. --- src/haproxy.c | 55 +++ 1 file changed, 55 insertions(+) diff --git a/src/haproxy.c b/src/haproxy.c index d23bf3a..01681d4 100644 --- a/src/haproxy.c +++ b/src/haproxy.c @@ -205,6 +205,8 @@ int *children = NULL; /* store PIDs of children in master workers mode */ static volatile sig_atomic_t caught_signal = 0; static char **next_argv = NULL; +int mworker_pipe[2]; + /* list of the temporarily limited listeners because of lack of resource */ struct list global_listener_queue = LIST_HEAD_INIT(global_listener_queue); struct task *global_listener_queue_task; @@ -2227,6 +2229,34 @@ static struct task *manage_global_listener_queue(struct task *t) return t; } +void mworker_pipe_handler(int fd) +{ + char c; + + if (read(fd, , 1) > -1) { + fd_delete(fd); + deinit(); + exit(EXIT_FAILURE); + } else { + /* should never happened */ + fd_delete(fd); + } + + return; +} + +void mworker_pipe_register(int pipefd[2]) +{ + close(mworker_pipe[1]); /* close the write end of the master pipe in the children */ + + fcntl(mworker_pipe[0], F_SETFL, O_NONBLOCK); + fdtab[mworker_pipe[0]].owner = mworker_pipe; + fdtab[mworker_pipe[0]].iocb = mworker_pipe_handler; + fd_insert(mworker_pipe[0]); + fd_want_recv(mworker_pipe[0]); + } + + int main(int argc, char **argv) { int err, retry; @@ -2478,6 +2508,28 @@ int main(int argc, char **argv) exit(0); } + if (global.mode & MODE_MWORKER) { + if ((getenv(REEXEC_FLAG) == NULL)) { + char *msg = NULL; + /* master pipe to ensure the master is still alive */ + ret = pipe(mworker_pipe); + if (ret < 0) { + Warning("[%s.main()] Cannot create master pipe.\n", argv[0]); + } else { + memprintf(, "%d", mworker_pipe[0]); + setenv("MWORKER_PIPE_RD", msg, 1); + memprintf(, "%d", mworker_pipe[1]); + setenv("MWORKER_PIPE_WR", msg, 1); + free(msg); + } + } else { + mworker_pipe[0] = atol(getenv("MWORKER_PIPE_RD")); + mworker_pipe[1] = atol(getenv("MWORKER_PIPE_WR")); + if (mworker_pipe[0] <= 0 || mworker_pipe[1] <= 0) { + Warning("[%s.main()] Cannot get master pipe FDs.\n", argv[0]); + } + } + } /* the father launches the required number of processes */ for (proc = 0; proc < global.nbproc; proc++) { @@ -2637,6 +2689,9 @@ int main(int argc, char **argv) if (!dns_init_resolvers(1)) exit(1); + if (global.mode & MODE_MWORKER) + mworker_pipe_register(mworker_pipe); + protocol_enable_all(); /* * That's it : the central polling loop. Run until we stop. -- 2.10.2