On Mon, Nov 20, 2023 at 12:40:58PM -0500, Ihar Hrachyshka wrote: > On Mon, Nov 20, 2023 at 11:22 AM Simon Horman <ho...@ovn.org> wrote: > > > On Sat, Nov 18, 2023 at 01:07:02AM +0000, Ihar Hrachyshka wrote: > > > In nonblocking mode, a POSIX socket may temporarily fail to connect(), > > > which is indicated by returning EINPROGRESS. This patch will loop > > > waiting for connect() to succeed, or until EINPROGRESS happens. > > > > > > An alternative - and probably a better - path to deal with this error > > > would be to extend unix stream_class to support connect API, that would > > > allow the caller to repeat connect() attempts asynchronously. > > > > Hi Ihar, > > > > Yes, that probably would be better. But I expect it is also more complex. > > So perhaps such work is appropriate as a follow-up at some point. Because, > > assuming the patch is correct (I have some questions), it does seem to > > improve things. And I'm all for improving things even if incrementally. > > > > This patch is clearly controversial (and I hate it), thanks for starting > this discussion. (Perhaps I should have posted this patch from the series > as RFC not to suggest I am happy about it. I am not.) > > > > > > > Signed-off-by: Ihar Hrachyshka <ihrac...@redhat.com> > > > --- > > > lib/socket-util-unix.c | 14 ++++++++++---- > > > 1 file changed, 10 insertions(+), 4 deletions(-) > > > > > > diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c > > > index 0053a61b1..230705ba3 100644 > > > --- a/lib/socket-util-unix.c > > > +++ b/lib/socket-util-unix.c > > > @@ -361,10 +361,16 @@ make_unix_socket(int style, bool nonblock, > > > int dirfd; > > > > > > error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd, > > linkname); > > > - if (!error > > > - && connect(fd, (struct sockaddr*) &un, un_len) > > > - && errno != EINPROGRESS) { > > > - error = errno; > > > + if (!error) { > > > + for (;;) { > > > + if (!connect(fd, (struct sockaddr *)&un, un_len)) { > > > + break; > > > + } > > > + if (errno != EINPROGRESS) { > > > + error = errno; > > > + break; > > > + } > > > + } > > > > I have a few questions about this. > > > > 1. I see that patch 4/4 also takes into account EAGAIN as an > > alternative for EINPROGRESS, which is consistent with my > > reading of connect(2) wrt Unix domain sockets on Linux. > > And, reading connect(3p) I see that EINPROGRESS is the > > correct error to check for wrt POSIX. > > > > But I am not entirely clear that calling connect() again > > is the correct behaviour. Is this clear to you? > > > > It's not. Perhaps there are bugs in the implementation beyond non-graceful > handling of EAGAIN for Unix sockets. > > For example, AFAIU, POSIX implies that if connect() returns EINPROGRESS, > then the code should select() to check if FD is ready for write. > (Alternatively, consequent calls to connect() may return EALREADY.) > > This particular snippet from the Linux connect 2 man page describes the > proper way to deal with EINPROGRESS: > https://man7.org/linux/man-pages/man2/connect.2.html > > ``` > After > select(2) indicates writability, use getsockopt(2) to read > the SO_ERROR option at level SOL_SOCKET to determine > whether connect() completed successfully (SO_ERROR is > zero) or unsuccessfully (SO_ERROR is one of the usual > error codes listed here, explaining the reason for the > failure). > ``` > > I don't see any references to SO_ERROR in the openvswitch tree. > > Then there's the Linux case for AF_INET sockets that returns EAGAIN instead > of EINPROGRESS. In this case, I think consequent connect() calls may > require different treatment (repeating connect attempts). What I didn't > consider in this patch, I think, is that this EAGAIN behavior is Linux > specific, and - probably - other Unixes may decide to return EINPROGRESS in > the same case. > > --- > > Is it safe to call connect() after EAGAIN? Depends on the definition of > "safe". I think it's "safe" to do it with the current implementation of > AF_UNIX stream_connect in kernel, see: > https://github.com/torvalds/linux/blob/98b1cc82c4affc16f5598d4fa14b1858671b2263/net/unix/af_unix.c#L1559 > > Here, once the connect() call times out and bails out to "goto out" with > errno set to EAGAIN, it unix_release_sock(newsk, 0); the allocated > structures for the socket. > > Does it mean we can safely rely on this behavior not to change in Linux AND > be consistent with any other systems that may decide to EAGAIN on AF_UNIX > sockets? I dunno. :(
I think we are on the same page here :( > > 2. Is it possible that this loop never exits? > > Or is otherwise abusive resource-wise. > > > > Abuse could be dealt with usleep() on each iteration (perhaps with a > backoff). The infinite loop situation would require a mechanism to timeout > after a particular number of attempts. We could definitely implement this. > But - perhaps at this point it would be better to implement the "connect" > unix class endpoint. I will take a look into it in the next revision. Yes, that is a good point. Perhaps rather than bandage things up, in ways that we are unclear of the consequences, a new approach is worth investigating. > > --- > > I suggest we do the following for the next revision. > > 1. In unix_make_connect, call connect() once. If it returns 0, tag the > stream connected (stream->state). > 2. If connect() returns EAGAIN or EINPROGRESS, bail out from > unix_make_connect but return fd. Otherwise return error. > 3. Implement connect() endpoint to unconditionally call connect(). If it > returns EAGAIN (on Linux) or EINPROGRESS (POSIX), consider the stream is > still connecting. If it's a different error, tag the stream disconnected. > > AFAIR (1) is already done. (2) will need to know that EAGAIN is a legit > answer from connect(), but no loop will be introduced. (3) will need to be > written from scratch. I think check_connection_completion may also need to > be adjusted to return EAGAIN when Linux AF_UNIX socket is still connecting. 1. Agreed 2. That is my reading of connect(2) [*] 3. I'm a little clear how the "still connecting case" will be handled [*] https://www.man7.org/linux/man-pages/man2/connect.2.html _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev