Quoting Dwight Engen (dwight.en...@oracle.com):
> On Tue, 25 Mar 2014 15:50:06 -0500
> Serge Hallyn <serge.hal...@ubuntu.com> wrote:
>
> > If we start a lxc_wait on a container while it is exiting, it is
> > possible that we open the command socket, then the monitor closes
> > all its mainloop sockets and exit, then we send our credentials.
> > Then we get killed by SIGPIPE.
>
> Hey Serge, this is an interesting condition. I'm a bit confused where
> the race is, looking at lxc_wait it looks like we get the monitor open

Not the monitor, the command socket /var/lib/lxc/container/command.
Command sockets are handled by the container mainloop alongside consoles
etc.  When the container exits, the mainloop just closes all the open
sockets and exits.  lxc_getstate opens the socket, which triggers
lxc_cmd_accept() to accept and add the accepted socket to its epoll
list; but the mainloop doesn't wait until those are closed by the remote
end before it exits.

> before opening the command socket with lxc_getstate, and the monitor
> shouldn't exit while we are still a client of it.
>
> > Handle that case, recognizing that if we get sigpipe then the
> > container is (now) stopped.
>
> This makes sense in general, so I agree with the change.
>
> > Signed-off-by: Serge Hallyn <serge.hal...@ubuntu.com>
> > ---
> >  src/lxc/af_unix.c  |  4 ++--
> >  src/lxc/commands.c | 11 ++++++++++-
> >  2 files changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/lxc/af_unix.c b/src/lxc/af_unix.c
> > index a2de73e..46d8e50 100644
> > --- a/src/lxc/af_unix.c
> > +++ b/src/lxc/af_unix.c
> > @@ -158,7 +158,7 @@ int lxc_abstract_unix_send_fd(int fd, int sendfd, void *data, size_t size)
> >  	msg.msg_iov = &iov;
> >  	msg.msg_iovlen = 1;
> >
> > -	return sendmsg(fd, &msg, 0);
> > +	return sendmsg(fd, &msg, MSG_NOSIGNAL);
> >  }
> >
> >  int lxc_abstract_unix_recv_fd(int fd, int *recvfd, void *data, size_t size)
> > @@ -230,7 +230,7 @@ int lxc_abstract_unix_send_credential(int fd, void *data, size_t size)
> >  	msg.msg_iov = &iov;
> >  	msg.msg_iovlen = 1;
> >
> > -	return sendmsg(fd, &msg, 0);
> > +	return sendmsg(fd, &msg, MSG_NOSIGNAL);
> >  }
> >
> >  int lxc_abstract_unix_rcv_credential(int fd, void *data, size_t size)
> > diff --git a/src/lxc/commands.c b/src/lxc/commands.c
> > index 6b46c2c..b71274c 100644
> > --- a/src/lxc/commands.c
> > +++ b/src/lxc/commands.c
> > @@ -263,6 +263,8 @@ static int lxc_cmd(const char *name, struct lxc_cmd_rr *cmd, int *stopped,
> >  	ret = lxc_abstract_unix_send_credential(sock, &cmd->req,
> >  					sizeof(cmd->req));
> >  	if (ret != sizeof(cmd->req)) {
> > +		if (errno == EPIPE)
> > +			goto epipe;
> >  		SYSERROR("command %s failed to send req to '@%s' %d",
> >  			 lxc_cmd_str(cmd->req.cmd), offset, ret);
> >  		if (ret >=0)
> > @@ -271,8 +273,10 @@ static int lxc_cmd(const char *name, struct lxc_cmd_rr *cmd, int *stopped,
> >  	}
> >
> >  	if (cmd->req.datalen > 0) {
> > -		ret = send(sock, cmd->req.data, cmd->req.datalen, 0);
> > +		ret = send(sock, cmd->req.data, cmd->req.datalen, MSG_NOSIGNAL);
> >  		if (ret != cmd->req.datalen) {
> > +			if (errno == EPIPE)
> > +				goto epipe;
> >  			SYSERROR("command %s failed to send request data to '@%s' %d",
> >  				 lxc_cmd_str(cmd->req.cmd), offset, ret);
> >  			if (ret >=0)
> > @@ -289,6 +293,11 @@ out:
> >  		cmd->rsp.ret = sock;
> >
> >  	return ret;
> > +
> > +epipe:
> > +	close(sock);
> > +	*stopped = 1;
> > +	return 0;
> >  }
> >
> >  int lxc_try_cmd(const char *name, const char *lxcpath)
>
_______________________________________________
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel
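
For anyone who wants to see the behaviour the patch relies on in isolation, here is a minimal standalone sketch. It is not LXC code: the helper name errno_after_send_to_closed_peer() is made up for illustration, and a socketpair() stands in for the real command socket. It assumes Linux, where send() with MSG_NOSIGNAL on a stream socket whose peer has already closed fails with EPIPE instead of raising SIGPIPE.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of LXC.  Sends on a unix stream socket
 * whose peer has already closed and returns the errno that send()
 * reported, 0 if the send unexpectedly succeeded, or -1 if the socket
 * pair could not be created.
 */
int errno_after_send_to_closed_peer(void)
{
	int sv[2];
	const char msg[] = "ping";
	ssize_t ret;
	int saved;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Stand-in for the mainloop closing its sockets on container exit. */
	close(sv[1]);

	/* Without MSG_NOSIGNAL this send would deliver SIGPIPE to us. */
	ret = send(sv[0], msg, sizeof(msg), MSG_NOSIGNAL);
	saved = (ret < 0) ? errno : 0;

	close(sv[0]);
	return saved;
}
```

A caller can compare the return value against EPIPE and treat that case as "the other side is gone", which is the same decision the epipe: label in the patch makes when it sets *stopped = 1.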