On Tue, Nov 22, 2022 at 03:43:55PM +0530, manish.mishra wrote:
>
> On 22/11/22 3:23 pm, Daniel P. Berrangé wrote:
> > On Tue, Nov 22, 2022 at 03:10:53PM +0530, manish.mishra wrote:
> > > On 22/11/22 2:59 pm, Daniel P. Berrangé wrote:
> > > > On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote:
> > > > > On 22/11/22 2:30 pm, Daniel P. Berrangé wrote:
> > > > > > On Sat, Nov 19, 2022 at 09:36:14AM +0000, manish.mishra wrote:
> > > > > > > MSG_PEEK reads from the peek of channel. The data is treated as
> > > > > > > unread and the next read shall still return this data. This
> > > > > > > support is currently added only for socket class. Extra parameter
> > > > > > > 'flags' is added to io_readv calls to pass extra read flags like
> > > > > > > MSG_PEEK.
> > > > > > >
> > > > > > > Suggested-by: Daniel P. Berrangé <berra...@redhat.com>
> > > > > > > Signed-off-by: manish.mishra <manish.mis...@nutanix.com>
> > > > > > > ---
> > > > > > >  chardev/char-socket.c               |  4 +-
> > > > > > >  include/io/channel.h                | 83 +++++++++++++++++++++++++++++
> > > > > > >  io/channel-buffer.c                 |  1 +
> > > > > > >  io/channel-command.c                |  1 +
> > > > > > >  io/channel-file.c                   |  1 +
> > > > > > >  io/channel-null.c                   |  1 +
> > > > > > >  io/channel-socket.c                 | 16 +++++-
> > > > > > >  io/channel-tls.c                    |  1 +
> > > > > > >  io/channel-websock.c                |  1 +
> > > > > > >  io/channel.c                        | 73 +++++++++++++++++++++++--
> > > > > > >  migration/channel-block.c           |  1 +
> > > > > > >  scsi/qemu-pr-helper.c               |  2 +-
> > > > > > >  tests/qtest/tpm-emu.c               |  2 +-
> > > > > > >  tests/unit/test-io-channel-socket.c |  1 +
> > > > > > >  util/vhost-user-server.c            |  2 +-
> > > > > > >  15 files changed, 179 insertions(+), 11 deletions(-)
> > > > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c
> > > > > > > index b76dca9cc1..a06b24766d 100644
> > > > > > > --- a/io/channel-socket.c
> > > > > > > +++ b/io/channel-socket.c
> > > > > > > @@ -406,6 +406,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
> > > > > > >  }
> > > > > > >  #endif /* WIN32 */
> > > > > > > +    qio_channel_set_feature(QIO_CHANNEL(cioc),
> > > > > > > QIO_CHANNEL_FEATURE_READ_MSG_PEEK);
> > > > > > > +
> > > > > > This covers the incoming server side socket.
> > > > > >
> > > > > > This also needs to be set in outgoing client side socket in
> > > > > > qio_channel_socket_connect_async
> > > > > Yes sorry, I considered only the current use-case, but as it is a
> > > > > generic one both should be there. Thanks, will update it.
> > > > >
> > > > > > > @@ -705,7 +718,6 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
> > > > > > >  }
> > > > > > >  #endif /* WIN32 */
> > > > > > > -
> > > > > > >  #ifdef QEMU_MSG_ZEROCOPY
> > > > > > >  static int qio_channel_socket_flush(QIOChannel *ioc,
> > > > > > >                                      Error **errp)
> > > > > > Please remove this unrelated whitespace change.
> > > > > >
> > > > > >
> > > > > > > @@ -109,6 +117,37 @@ int qio_channel_readv_all_eof(QIOChannel *ioc,
> > > > > > >      return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, NULL, errp);
> > > > > > >  }
> > > > > > > +int qio_channel_readv_peek_all_eof(QIOChannel *ioc,
> > > > > > > +                                   const struct iovec *iov,
> > > > > > > +                                   size_t niov,
> > > > > > > +                                   Error **errp)
> > > > > > > +{
> > > > > > > +    ssize_t len = 0;
> > > > > > > +    ssize_t total = iov_size(iov, niov);
> > > > > > > +
> > > > > > > +    while (len < total) {
> > > > > > > +        len = qio_channel_readv_full(ioc, iov, niov, NULL,
> > > > > > > +                                     NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp);
> > > > > > > +
> > > > > > > +        if (len == QIO_CHANNEL_ERR_BLOCK) {
> > > > > > > +            if (qemu_in_coroutine()) {
> > > > > > > +                qio_channel_yield(ioc, G_IO_IN);
> > > > > > > +            } else {
> > > > > > > +                qio_channel_wait(ioc, G_IO_IN);
> > > > > > > +            }
> > > > > > > +            continue;
> > > > > > > +        }
> > > > > > > +        if (len == 0) {
> > > > > > > +            return 0;
> > > > > > > +        }
> > > > > > > +        if (len < 0) {
> > > > > > > +            return -1;
> > > > > > > +        }
> > > > > > > +    }
> > > > > > This will busy wait burning CPU where there is a read > 0 and <
> > > > > > total.
> > > > > >
> > > > > Daniel, I could use MSG_WAITALL too if that works, but then we will
> > > > > lose the opportunity to yield. Or if you have some other idea.
> > > > I fear this is an inherent problem with the idea of using PEEK to
> > > > look at the magic data.
> > > >
> > > > If we actually read the magic bytes off the wire, then we could have
> > > > the same code path for TLS and non-TLS. We would have to modify the
> > > > existing later code paths though to take account of the fact that the
> > > > magic was already read by an earlier codepath.
> > > >
> > > > With regards,
> > > > Daniel
> > >
> > > Sure Daniel, I am happy to drop use of MSG_PEEK, but that way we also
> > > have an issue with TLS, for the reason we discussed in V2. Is it okay
> > > to send a patch with actual read-ahead but not for the TLS case? TLS
> > > anyway does not have this bug as it does a handshake.
> > I've re-read the previous threads, but I don't see what the problem
> > with TLS is. We already decided that TLS is not affected by the
> > race condition. So there should be no problem in reading the magic
> > bytes early on the TLS channels, while reading the bytes early on
> > a non-TLS channel will fix the race condition.
> >
> Actually with TLS every channel requires a handshake before it is assumed
> established, and on the source side we do the initial qemu_flush only when
> all channels are established. But on the destination side we will be stuck
> reading the magic for the main channel itself, which never comes because
> the source has not flushed data, so no new connections can be established
> (e.g. multiFD). So basically the destination cannot accept any new channel
> until we read from the main channel, and the source is not putting any data
> on the main channel until all channels are established. So if we read ahead
> in ioc_process_incoming_channel there is this deadlock with TLS. This issue
> is not there in the non-TLS case, because there on the source side we
> assume a connection is established once the connect() call is successful.

Ah yes, I forgot about the 'flush' problem. Reading the magic in the
non-TLS case is OK then, I guess.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
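
For illustration only, the two MSG_PEEK behaviours discussed in this thread can be reproduced with plain POSIX sockets. The sketch below does not use QEMU's QIOChannel API; the helper names `peek_leaves_data_unread` and `count_short_peeks`, the 4-byte stand-in "magic" and the use of an AF_UNIX socketpair are all assumptions made for the example. The first helper shows that peeked bytes are returned again by the next normal read. The second shows why a peek loop can spin: once some but not all of the expected bytes are queued, the socket stays readable, so every wait-then-peek round returns at once with the same short result.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a MSG_PEEK read leaves the data
 * queued so that the following normal read returns the same bytes,
 * 0 if not, -1 if the socketpair could not be created. */
int peek_leaves_data_unread(void)
{
    static const char magic[4] = { 'M', 'A', 'G', 'C' }; /* stand-in value */
    char peeked[4] = { 0 };
    char consumed[4] = { 0 };
    int sv[2];
    int ok = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    if (send(sv[0], magic, sizeof(magic), 0) == (ssize_t)sizeof(magic) &&
        recv(sv[1], peeked, sizeof(peeked), MSG_PEEK) == (ssize_t)sizeof(peeked) &&
        recv(sv[1], consumed, sizeof(consumed), 0) == (ssize_t)sizeof(consumed)) {
        ok = memcmp(peeked, magic, sizeof(magic)) == 0 &&
             memcmp(consumed, magic, sizeof(magic)) == 0;
    }
    close(sv[0]);
    close(sv[1]);
    return ok;
}

/* Hypothetical helper: with only 2 of 4 expected bytes queued, count how
 * many of `attempts` poll()+MSG_PEEK rounds return immediately with a
 * short peek. Returns -1 on setup failure. */
int count_short_peeks(int attempts)
{
    char buf[4];
    int sv[2];
    int spins = 0;

    assert(attempts >= 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    /* Only half of the expected bytes have arrived so far. */
    if (send(sv[0], "MA", 2, 0) != 2) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    for (int i = 0; i < attempts; i++) {
        struct pollfd pfd = { .fd = sv[1], .events = POLLIN };

        /* Zero timeout: poll() never sleeps here, it only reports whether
         * the socket is readable, which it is while peeked data is queued. */
        if (poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN)) {
            ssize_t n = recv(sv[1], buf, sizeof(buf), MSG_PEEK);

            if (n > 0 && (size_t)n < sizeof(buf)) {
                spins++; /* short peek: a naive loop would retry right away */
            }
        }
    }
    close(sv[0]);
    close(sv[1]);
    return spins;
}
```

Calling `count_short_peeks(5)` from a small `main()` should report that every round was a short peek. Actually consuming the bytes, as agreed above for the non-TLS case, avoids this because each read shrinks what is still outstanding and the next wait can genuinely block.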