On Tue, Mar 09, 2021 at 09:23:22PM +0100, Greg Kurz wrote:
> On Tue, 9 Mar 2021 15:17:21 +0000
> Stefan Hajnoczi <stefa...@redhat.com> wrote:
> 
> > On Mon, Mar 08, 2021 at 01:31:39PM +0100, Greg Kurz wrote:
> > > +    g_autofree int *fd = NULL;
> > > +    size_t fdsize = 0;
> > > +    int i;
> > >  
> > >      /* Read header */
> > >      iov.iov_base = &hdr;
> > >      iov.iov_len = VHOST_USER_HDR_SIZE;
> > >  
> > >      do {
> > > -        size = recvmsg(u->slave_fd, &msgh, 0);
> > > -    } while (size < 0 && (errno == EINTR || errno == EAGAIN));
> > > +        size = qio_channel_readv_full(ioc, &iov, 1, &fd, &fdsize, NULL);
> > > +    } while (size == QIO_CHANNEL_ERR_BLOCK);
> > 
> > Is it possible to leak file descriptors and fd[] memory if we receive a
> > short message and then loop around? qio_channel_readv_full() will
> > overwrite &fd and that's how the leak occurs.
> > 
> 
> qio_channel_readv_full() only returns QIO_CHANNEL_ERR_BLOCK when the
> channel is non-blocking mode and no data is available. Under the hood,
> this translates to this code in qio_channel_socket_readv():
> 
>  retry:
>     ret = recvmsg(sioc->fd, &msg, sflags);
>     if (ret < 0) {
>         if (errno == EAGAIN) {
>             return QIO_CHANNEL_ERR_BLOCK;
>         }
>         if (errno == EINTR) {
>             goto retry;
>         }
> 
>         error_setg_errno(errp, errno,
>                          "Unable to read from socket");
>         return -1;
>     }
> 
> This is strictly equivalent to what we currently have. This cannot
> leak file descriptors because we would only loop until the first
> byte of real data is available and ancillary data cannot fly without
> real data, i.e. no file descriptors when recvmsg() returns EAGAIN.
> 
> > On the other hand, it looks like ioc is in blocking mode. I'm not sure
> > QIO_CHANNEL_ERR_BLOCK can occur?
> > 
> 
> You're right but the existing code is ready to handle the non-blocking
> case, so I just kept this behavior.

I'm confused by this tentative non-blocking support because if we set
the fd to non-blocking, then qio_channel_readv_full() can return less
than size bytes (a short read) and that will cause a failure:

  if (size != VHOST_USER_HDR_SIZE) {
      error_report("Failed to read from slave.");
      goto err;
  }

So although the while QIO_CHANNEL_ERR_BLOCK loop suggests the code
supports non-blocking, it doesn't really support it :).

I think it would be clearer to remove the while QIO_CHANNEL_ERR_BLOCK
loop. However, this is not directly related to the bug that this series
fixes, so if you prefer to keep it, feel free.

Attachment: signature.asc
Description: PGP signature

Reply via email to