On Tue, Nov 2, 2021 at 10:13 AM Juan Quintela <quint...@redhat.com> wrote:
>
> Leonardo Bras <leob...@redhat.com> wrote:
> > For CONFIG_LINUX, implement the new optional callbacks io_write_zerocopy and
> > io_flush_zerocopy on QIOChannelSocket, but enable them only when the
> > MSG_ZEROCOPY feature is available in the host kernel, which is checked in
> > qio_channel_socket_connect_sync().
> >
> > The contents of qio_channel_socket_writev() were moved to a helper function,
> > qio_channel_socket_writev_flags(), which accepts an extra argument for flags.
> > (This argument is passed directly to sendmsg().)
> >
> > The above helper function is used to implement qio_channel_socket_writev(),
> > with flags = 0, keeping its behavior unchanged, and
> > qio_channel_socket_writev_zerocopy() with flags = MSG_ZEROCOPY.
> >
> > qio_channel_socket_flush_zerocopy() was implemented by counting how many
> > times sendmsg(..., MSG_ZEROCOPY) was successfully called, and then reading
> > the socket's error queue in order to find how many of them finished sending.
> > Flush will loop until those counters are the same, or until some error
> > occurs.
> >
> > A new function qio_channel_socket_poll() was also created in order to avoid
> > busy-looping recvmsg() in qio_channel_socket_flush_zerocopy() while waiting
> > for updates in the socket's error queue.
> >
> > Notes on using writev_zerocopy():
> > 1: Buffer
> > - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid
> > copying, some caution is necessary to avoid overwriting any buffer before
> > it's sent. If this happens, a newer version of the buffer may be sent
> > instead.
> > - If this is a problem, it's recommended to call flush_zerocopy() before
> > freeing or re-using the buffer.
> >
> > 2: Locked memory
> > - When using MSG_ZEROCOPY, the buffer memory will be locked after it is
> > queued, and unlocked after it's sent.
> > - Depending on the size of each buffer, and how often it's sent, it may
> > require a larger amount of locked memory than is usually available to a
> > non-root user.
> > - If the required amount of locked memory is not available, writev_zerocopy
> > will return an error, which can abort an operation like migration.
> > - Because of this, when user code wants to add zerocopy as a feature, it
> > requires a mechanism to disable it, so it can still be accessible to less
> > privileged users.
> >
> > Signed-off-by: Leonardo Bras <leob...@redhat.com>
>
> I think this patch would be easier to review if you split it into:
> - add the flags parameter left and right
> - add the meat of what you do with the flags.

ok, I will try to split it like this.
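
A minimal sketch of what the first half of that split could look like,
outside of QEMU. The names sketch_writev_flags() and sketch_writev() are
hypothetical stand-ins for qio_channel_socket_writev_flags() and
qio_channel_socket_writev(); error reporting and the QIOChannel plumbing
are left out, so this only illustrates the idea of threading a flags
argument through to sendmsg():

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical stand-in for qio_channel_socket_writev_flags():
 * behaves like a plain vectored write, but forwards `flags`
 * unchanged to sendmsg().
 */
ssize_t sketch_writev_flags(int fd, const struct iovec *iov,
                            size_t niov, int flags)
{
    struct msghdr msg = { 0 };

    assert(iov != NULL && niov > 0);
    msg.msg_iov = (struct iovec *)iov;
    msg.msg_iovlen = niov;

    return sendmsg(fd, &msg, flags);
}

/*
 * Hypothetical stand-in for qio_channel_socket_writev():
 * the existing entry point keeps flags = 0, so its behavior
 * does not change when the helper is introduced.
 */
ssize_t sketch_writev(int fd, const struct iovec *iov, size_t niov)
{
    return sketch_writev_flags(fd, iov, niov, 0);
}
```

The second half of the split would then only add a wrapper that passes
MSG_ZEROCOPY instead of 0, plus the flush logic.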

>
> > +++ b/include/io/channel-socket.h
> > @@ -47,6 +47,8 @@ struct QIOChannelSocket {
> >      socklen_t localAddrLen;
> >      struct sockaddr_storage remoteAddr;
> >      socklen_t remoteAddrLen;
> > +    ssize_t zerocopy_queued;
> > +    ssize_t zerocopy_sent;
>
> I am not sure whether it would be good or bad to put these inside
>
> #ifdef CONFIG_LINUX
>
> basically everything else uses it.

I think it makes sense that zerocopy_{sent,queued} is inside
CONFIG_LINUX as no one else is using zerocopy.

>
> > +#ifdef CONFIG_LINUX
> > +    ret = qemu_setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v));
> > +    if (ret < 0) {
> > +        /* Zerocopy not available on host */
> > +        return 0;
> > +    }
> > +
> > +    qio_channel_set_feature(QIO_CHANNEL(ioc),
> > +                            QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY);
>
> As Peter said, you shouldn't fail if the feature is not there.
>
> But on the other hand, in patch 3, you don't check that this feature
> exists when you allow multifd_zerocopy to be enabled.

This had a major rework on v5, but I will make sure this suggestion is
addressed before releasing it.
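
To make sure we mean the same thing, here is a minimal standalone sketch
of the two checks, not the v5 code. sketch_probe_zerocopy() and
sketch_enable_multifd_zerocopy() are hypothetical names: the first treats
a failing setsockopt(SO_ZEROCOPY) as "feature absent" rather than as an
error, and the second refuses to enable zerocopy when the channel did not
advertise the feature.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Returns true if SO_ZEROCOPY could be enabled on fd. A false result is
 * not an error: the caller simply leaves the zerocopy feature unset.
 */
bool sketch_probe_zerocopy(int fd)
{
    assert(fd >= 0);
#ifdef SO_ZEROCOPY
    int v = 1;

    return setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v)) == 0;
#else
    /* Headers too old (or not Linux): feature is simply unavailable. */
    return false;
#endif
}

/*
 * Hypothetical gate for a user such as multifd: only allow zerocopy to
 * be switched on when the channel reported the feature.
 * Returns 0 on success, -1 if the feature is missing.
 */
int sketch_enable_multifd_zerocopy(bool channel_has_zerocopy)
{
    return channel_has_zerocopy ? 0 : -1;
}
```

The connection itself still succeeds on hosts without MSG_ZEROCOPY; only
the attempt to enable the zerocopy option is rejected.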

>
> > +#endif
> > +
> >      return 0;
> >  }
>
>
> >          error_setg_errno(errp, errno,
> >                           "Unable to write to socket");
>
> Why do you split this in two lines?
>
> Yes, I know that this file is not consistent either in how it does
> this: sometimes one line, sometimes more.

IIUC, these lines have no '+' in them, so they are not my addition.

>
> I don't know enough about how MSG_ZEROCOPY works at the kernel level to
> comment on the rest of the patch.
>
> Later, Juan.

Thanks for reviewing, Juan.

Best regards,
Leo

