On Mon, Nov 20, 2023 at 12:40:58PM -0500, Ihar Hrachyshka wrote:
> On Mon, Nov 20, 2023 at 11:22 AM Simon Horman <ho...@ovn.org> wrote:
> 
> > On Sat, Nov 18, 2023 at 01:07:02AM +0000, Ihar Hrachyshka wrote:
> > > In nonblocking mode, a POSIX socket may temporarily fail to connect(),
> > > which is indicated by returning EINPROGRESS. This patch will loop
> > > waiting for connect() to succeed, or until EINPROGRESS happens.
> > >
> > > An alternative - and probably a better - path to deal with this error
> > > would be to extend unix stream_class to support connect API, that would
> > > allow the caller to repeat connect() attempts asynchronously.
> >
> > Hi Ihar,
> >
> > Yes, that probably would be better. But I expect it is also more complex.
> > So perhaps such work is appropriate as a follow-up at some point. Because,
> > assuming the patch is correct (I have some questions), it does seem to
> > improve things. And I'm all for improving things even if incrementally.
> >
> 
> This patch is clearly controversial (and I hate it), thanks for starting
> this discussion. (Perhaps I should have posted this patch from the series
> as RFC not to suggest I am happy about it. I am not.)
> 
> 
> >
> > > Signed-off-by: Ihar Hrachyshka <ihrac...@redhat.com>
> > > ---
> > >  lib/socket-util-unix.c | 14 ++++++++++----
> > >  1 file changed, 10 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c
> > > index 0053a61b1..230705ba3 100644
> > > --- a/lib/socket-util-unix.c
> > > +++ b/lib/socket-util-unix.c
> > > @@ -361,10 +361,16 @@ make_unix_socket(int style, bool nonblock,
> > >          int dirfd;
> > >
> > >          error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd,
> > linkname);
> > > -        if (!error
> > > -            && connect(fd, (struct sockaddr*) &un, un_len)
> > > -            && errno != EINPROGRESS) {
> > > -            error = errno;
> > > +        if (!error) {
> > > +            for (;;) {
> > > +                if (!connect(fd, (struct sockaddr *)&un, un_len)) {
> > > +                    break;
> > > +                }
> > > +                if (errno != EINPROGRESS) {
> > > +                    error = errno;
> > > +                    break;
> > > +                }
> > > +            }
> >
> > I have a few questions about this.
> >
> > 1. I see that patch 4/4 also takes into account EAGAIN as an
> >    alternative for EINPROGRESS, which is consistent with my
> >    reading of connect(2) wrt Unix domain sockets on Linux.
> >    And, reading connect(3p) I see that EINPROGRESS is the
> >    correct error to check for wrt POSIX.
> >
> >    But I am not entirely clear that calling connect() again
> >    is the correct behaviour. Is this clear to you?
> >
> 
> It's not. Perhaps there are bugs in the implementation beyond non-graceful
> handling of EAGAIN for Unix sockets.
> 
> For example, AFAIU, POSIX implies that if connect() returns EINPROGRESS,
> then the code should select() to check if FD is ready for write.
> (Alternatively, consequent calls to connect() may return EALREADY.)
> 
> This particular snippet from the Linux connect 2 man page describes the
> proper way to deal with EINPROGRESS:
> https://man7.org/linux/man-pages/man2/connect.2.html
> 
> ```
> After
>               select(2) indicates writability, use getsockopt(2) to read
>               the SO_ERROR option at level SOL_SOCKET to determine
>               whether connect() completed successfully (SO_ERROR is
>               zero) or unsuccessfully (SO_ERROR is one of the usual
>               error codes listed here, explaining the reason for the
>               failure).
> ```
> 
> I don't see any references to SO_ERROR in the openvswitch tree.
> 
> Then there's the Linux case for AF_INET sockets that returns EAGAIN instead
> of EINPROGRESS. In this case, I think consequent connect() calls may
> require different treatment (repeating connect attempts). What I didn't
> consider in this patch, I think, is that this EAGAIN behavior is Linux
> specific, and - probably - other Unixes may decide to return EINPROGRESS in
> the same case.
> 
> ---
> 
> Is it safe to call connect() after EAGAIN? Depends on the definition of
> "safe". I think it's "safe" to do it with the current implementation of
> AF_UNIX stream_connect in kernel, see:
> https://github.com/torvalds/linux/blob/98b1cc82c4affc16f5598d4fa14b1858671b2263/net/unix/af_unix.c#L1559
> 
> Here, once the connect() call times out and bails out to "goto out" with
> errno set to EAGAIN, it unix_release_sock(newsk, 0); the allocated
> structures for the socket.
> 
> Does it mean we can safely rely on this behavior not to change in Linux AND
> be consistent with any other systems that may decide to EAGAIN on AF_UNIX
> sockets? I dunno. :(

I think we are on the same page here :(

> > 2. Is it possible that this loop never exits?
> >    Or is otherwise abusive resource-wise.
> >
> 
> Abuse could be dealt with usleep() on each iteration (perhaps with a
> backoff). The infinite loop situation would require a mechanism to timeout
> after a particular number of attempts. We could definitely implement this.
> But - perhaps at this point it would be better to implement the "connect"
> unix class endpoint. I will take a look into it in the next revision.

Yes, that is a good point.
Perhaps rather than bandage things up, in ways that we are unclear of
the consequences, a new approach is worth investigating.

> 
> ---
> 
> I suggest we do the following for the next revision.
> 
> 1. In unix_make_connect, call connect() once. If it returns 0, tag the
> stream connected (stream->state).
> 2. If connect() returns EAGAIN or EINPROGRESS, bail out from
> unix_make_connect but return fd. Otherwise return error.
> 3. Implement connect() endpoint to unconditionally call connect(). If it
> returns EAGAIN (on Linux) or EINPROGRESS (POSIX), consider the stream is
> still connecting. If it's a different error, tag the stream disconnected.
> 
> AFAIR (1) is already done. (2) will need to know that EAGAIN is a legit
> answer from connect(), but no loop will be introduced. (3) will need to be
> written from scratch. I think check_connection_completion may also need to
> be adjusted to return EAGAIN when Linux AF_UNIX socket is still connecting.

1. Agreed
2. That is my reading of connect(2) [*]
3. I'm a little clear how the "still connecting case" will be handled

[*] https://www.man7.org/linux/man-pages/man2/connect.2.html
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to