On Mon, Nov 07, 2022 at 04:51:59PM +0000, manish.mishra wrote:
The current logic assumes that channel connections on the destination side
are always established in the same order as on the source, and that the
first one is always the main channel, followed by the multifd or post-copy
preemption channels. This is not always true: even if a channel has a
connection established on the source side, it can still be pending on the
destination side, and a newer connection can be established first. The
result is an out-of-order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number; this
patch uses that magic number to decide the type of channel. This logic
applies only to precopy (multifd) live migration because, as mentioned, the
post-copy preempt channel does not send a magic number. Also, TLS live
migrations already do the TLS handshake before creating other channels, so
this issue cannot occur with TLS; hence this logic is skipped for TLS
live migrations. This patch uses MSG_PEEK to check the magic number of
channels so that the current data/control stream management remains
unaffected.
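
As a rough illustration of the MSG_PEEK idea described above (not the code
from this patch), the sketch below peeks at the first four bytes of a
connected stream socket and classifies the channel without consuming
anything. The EXAMPLE_* magic values, the enum and the function name are
placeholders invented for this example, and it assumes the magic is sent as
a big-endian 32-bit value. A short peek is only reported; a real
implementation would have to wait for the full magic.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Placeholder magic values for illustration; not QEMU's constants. */
#define EXAMPLE_MAIN_MAGIC    0x4d41494eU
#define EXAMPLE_MULTIFD_MAGIC 0x4d465844U

enum example_channel_type {
    EXAMPLE_CH_ERROR = -1,  /* recv() failed */
    EXAMPLE_CH_SHORT = 0,   /* EOF, or fewer than 4 bytes queued so far */
    EXAMPLE_CH_MAIN,
    EXAMPLE_CH_MULTIFD,
    EXAMPLE_CH_OTHER        /* 4 bytes seen, but no known magic */
};

/*
 * Look at the first 4 bytes of the stream without removing them from the
 * socket's receive queue, so whoever owns the channel afterwards still
 * reads the magic as part of its normal handshake.
 */
static enum example_channel_type example_classify_channel(int fd)
{
    uint32_t magic_be = 0;
    ssize_t n = recv(fd, &magic_be, sizeof(magic_be), MSG_PEEK);

    if (n < 0) {
        return EXAMPLE_CH_ERROR;
    }
    if ((size_t)n < sizeof(magic_be)) {
        return EXAMPLE_CH_SHORT;
    }

    switch (ntohl(magic_be)) {
    case EXAMPLE_MAIN_MAGIC:
        return EXAMPLE_CH_MAIN;
    case EXAMPLE_MULTIFD_MAGIC:
        return EXAMPLE_CH_MULTIFD;
    default:
        return EXAMPLE_CH_OTHER;
    }
}
```

Because the bytes stay queued, the order in which the destination accepts
connections no longer has to match the order in which the source opened
them.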
Suggested-by: Daniel P. Berrangé <berra...@redhat.com>
Signed-off-by: manish.mishra <manish.mis...@nutanix.com>
v2:
TLS does not support MSG_PEEK, so v1 was broken for TLS live
migrations. For TLS live migration, the TLS handshake is done while
initializing the main channel, before other channels can be created,
so this issue cannot occur for TLS live migrations. v2 adds a check
that skips the magic number test for TLS live migration and falls back
to the older method of mapping channels on the destination side.
---
include/io/channel.h | 25 +++++++++++++++++++++++
io/channel-socket.c | 27 ++++++++++++++++++++++++
io/channel.c | 39 +++++++++++++++++++++++++++++++++++
migration/migration.c | 44 +++++++++++++++++++++++++++++-----------
migration/multifd.c | 12 ++++-------
migration/multifd.h | 2 +-
migration/postcopy-ram.c | 5 +----
migration/postcopy-ram.h | 2 +-
8 files changed, 130 insertions(+), 26 deletions(-)
This should be two commits, because the 'io' and 'migration'
code are two separate subsystems in QEMU.
diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..74177aeeea 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -115,6 +115,10 @@ struct QIOChannelClass {
int **fds,
size_t *nfds,
Error **errp);
+ ssize_t (*io_read_peek)(QIOChannel *ioc,
+ void *buf,
+ size_t nbytes,
+ Error **errp);
This API should be called "io_read_peekv" and use
"const struct iovec *iov", so that it matches the
design of 'io_readv'.
There should also be a QIOChannelFeature flag
registered to indicate whether a given channel
impl supports peeking at data.
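
To make the two suggestions above concrete, here is a standalone sketch of
a vectored peek guarded by a feature flag. It is written against plain
POSIX sockets, not the QIOChannel API: struct example_channel, the
EXAMPLE_FEATURE_READ_PEEK flag and both function names are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical feature bit: the channel can peek at queued data. */
enum example_feature {
    EXAMPLE_FEATURE_READ_PEEK = 0
};

/* Hypothetical channel: a socket fd plus a bitmask of (1u << feature). */
struct example_channel {
    int fd;
    unsigned int features;
};

static bool example_has_feature(const struct example_channel *ch,
                                enum example_feature feature)
{
    return (ch->features & (1u << feature)) != 0;
}

/*
 * Vectored peek: fill the buffers described by @iov from the head of the
 * stream without consuming it. Returns the number of bytes peeked, 0 on
 * EOF, or -1 with errno set. Channels that do not advertise the feature
 * (a TLS channel, say) are rejected up front with ENOTSUP.
 */
static ssize_t example_read_peekv(struct example_channel *ch,
                                  const struct iovec *iov, size_t niov)
{
    struct msghdr msg;

    if (!example_has_feature(ch, EXAMPLE_FEATURE_READ_PEEK)) {
        errno = ENOTSUP;
        return -1;
    }

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = (struct iovec *)iov;  /* recvmsg() does not modify it */
    msg.msg_iovlen = niov;

    return recvmsg(ch->fd, &msg, MSG_PEEK);
}
```

With a flag like this, a caller can test for peek support and fall back to
the old ordering-based logic, instead of special-casing TLS by name.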
@@ -475,6 +479,27 @@ int qio_channel_write_all(QIOChannel *ioc,
size_t buflen,
Error **errp);
+/**
+ * qio_channel_read_peek_all:
+ * @ioc: the channel object
+ * @buf: the memory region to read in data
+ * @nbytes: the number of bytes to read
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Read given @nbytes data from peek of channel into
+ * memory region @buf.
+ *
+ * The function will be blocked until read size is
+ * equal to requested size.
+ *
+ * Returns: 1 if all bytes were read, 0 if end-of-file
+ * occurs without data, or -1 on error
+ */
+int qio_channel_read_peek_all(QIOChannel *ioc,
+ void* buf,
+ size_t nbytes,
+ Error **errp);
There should be qio_channel_read_peek, qio_channel_read_peekv,
qio_channel_read_peek_all and qio_channel_read_peekv_all.
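
One way the four variants could layer on top of each other is sketched
below, again on a raw socket fd and not on QIOChannel. All example_* names
and the max_retries parameter are assumptions made for this sketch: the
vectored peek is the only primitive, the single-buffer form wraps it, and
the "_all" forms re-peek until the full length is visible. The retry bound
is there because a peer that closes after sending only part of the data
would otherwise make a peek loop spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Total number of bytes described by an iovec array. */
static size_t example_iov_size(const struct iovec *iov, size_t niov)
{
    size_t total = 0;
    for (size_t i = 0; i < niov; i++) {
        total += iov[i].iov_len;
    }
    return total;
}

/* Primitive: one vectored peek; may return fewer bytes than requested. */
static ssize_t example_read_peekv(int fd, const struct iovec *iov,
                                  size_t niov)
{
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = (struct iovec *)iov;  /* recvmsg() does not modify it */
    msg.msg_iovlen = niov;
    return recvmsg(fd, &msg, MSG_PEEK);
}

/* Single-buffer convenience wrapper around the vectored primitive. */
static ssize_t example_read_peek(int fd, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    return example_read_peekv(fd, &iov, 1);
}

/*
 * Peek until every byte described by @iov is visible at once.
 * Returns 1 if all bytes were peeked, 0 on EOF before any data,
 * -1 on error or when the retry budget is exhausted.
 */
static int example_read_peekv_all(int fd, const struct iovec *iov,
                                  size_t niov, int max_retries)
{
    size_t want = example_iov_size(iov, niov);

    assert(want > 0);
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        ssize_t n = example_read_peekv(fd, iov, niov);

        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        if (n == 0) {
            return 0;
        }
        if ((size_t)n == want) {
            return 1;
        }
        usleep(1000);  /* partial data queued; wait briefly and re-peek */
    }
    return -1;
}

/* Single-buffer form of the "_all" variant. */
static int example_read_peek_all(int fd, void *buf, size_t len,
                                 int max_retries)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    return example_read_peekv_all(fd, &iov, 1, max_retries);
}
```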