On Tue, Nov 2, 2021 at 10:13 AM Juan Quintela <quint...@redhat.com> wrote:
>
> Leonardo Bras <leob...@redhat.com> wrote:
> > For CONFIG_LINUX, implement the new optional callbacks io_write_zerocopy and
> > io_flush_zerocopy on QIOChannelSocket, but enable them only when the
> > MSG_ZEROCOPY feature is available in the host kernel, which is checked in
> > qio_channel_socket_connect_sync().
> >
> > qio_channel_socket_writev() contents were moved to a helper function
> > qio_channel_socket_writev_flags() which accepts an extra argument for flags.
> > (This argument is passed directly to sendmsg().)
> >
> > The above helper function is used to implement qio_channel_socket_writev(),
> > with flags = 0, keeping its behavior unchanged, and
> > qio_channel_socket_writev_zerocopy() with flags = MSG_ZEROCOPY.
> >
> > qio_channel_socket_flush_zerocopy() was implemented by counting how many times
> > sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the
> > socket's error queue, in order to find how many of them finished sending.
> > Flush will loop until those counters are the same, or until some error occurs.
> >
> > A new function qio_channel_socket_poll() was also created in order to avoid
> > busy-looping recvmsg() in qio_channel_socket_flush_zerocopy() while waiting for
> > updates in socket's error queue.
> >
> > Notes on using writev_zerocopy():
> > 1: Buffer
> > - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying,
> > some caution is necessary to avoid overwriting any buffer before it's sent.
> > If something like this happens, a newer version of the buffer may be sent
> > instead.
> > - If this is a problem, it's recommended to call flush_zerocopy() before freeing
> > or re-using the buffer.
> >
> > 2: Locked memory
> > - When using MSG_ZEROCOPY, the buffer memory will be locked after being queued,
> > and unlocked after it's sent.
> > - Depending on the size of each buffer, and how often it's sent, it may require
> > a larger amount of locked memory than is usually available to a non-root user.
> > - If the required amount of locked memory is not available, writev_zerocopy
> > will return an error, which can abort an operation like migration.
> > - Because of this, when user code wants to add zerocopy as a feature, it
> > requires a mechanism to disable it, so it can still be accessible to less
> > privileged users.
> >
> > Signed-off-by: Leonardo Bras <leob...@redhat.com>
>
> I think this patch would be easier to review if you split it into:
> - add the flags parameter left and right
> - add the meat of what you do with the flags.

OK, I will try to split it like this.

>
> > +++ b/include/io/channel-socket.h
> > @@ -47,6 +47,8 @@ struct QIOChannelSocket {
> >      socklen_t localAddrLen;
> >      struct sockaddr_storage remoteAddr;
> >      socklen_t remoteAddrLen;
> > +    ssize_t zerocopy_queued;
> > +    ssize_t zerocopy_sent;
>
> I am not sure if this is good/bad to put it inside
>
> #ifdef CONFIG_LINUX
>
> basically everything else uses it.

I think it makes sense that zerocopy_{sent,queued} is inside CONFIG_LINUX,
as no one else is using zerocopy.

>
> > +#ifdef CONFIG_LINUX
> > +    ret = qemu_setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v));
> > +    if (ret < 0) {
> > +        /* Zerocopy not available on host */
> > +        return 0;
> > +    }
> > +
> > +    qio_channel_set_feature(QIO_CHANNEL(ioc),
> > +                            QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY);
>
> As Peter said, you shouldn't fail if the feature is not there.
>
> But on the other hand, on patch 3, you don't check that this feature
> exists when you allow to enable multifd_zerocopy.

This had a major rework on v5, but I will make sure this suggestion is
addressed before releasing it.

>
> > +#endif
> > +
> >      return 0;
> >  }
> >
> >          error_setg_errno(errp, errno,
> >                           "Unable to write to socket");
>
> Why do you split this in two lines?
>
> Yes, I know that this file is not consistent either on how they do
> this, sometimes one line, otherwise more.

IIUC, these lines have no '+' in them, so they are not my addition.

>
> I don't know how ZEROCOPY works at kernel level to comment on the rest of the
> patch.
>
> Later, Juan.

Thanks for reviewing, Juan.

Best regards,
Leo