Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
On Fri, Mar 29, 2024 at 11:28:54AM +0100, Philippe Mathieu-Daudé wrote: > Hi Zhijian, > > On 29/3/24 02:53, Zhijian Li (Fujitsu) wrote: > > > > > > On 28/03/2024 23:01, Peter Xu wrote: > > > On Thu, Mar 28, 2024 at 11:18:04AM -0300, Fabiano Rosas wrote: > > > > Philippe Mathieu-Daudé writes: > > > > > > > > > The whole RDMA subsystem was deprecated in commit e9a54265f5 > > > > > ("hw/rdma: Deprecate the pvrdma device and the rdma subsystem") > > > > > released in v8.2. > > > > > > > > > > Remove: > > > > >- RDMA handling from migration > > > > >- dependencies on libibumad, libibverbs and librdmacm > > > > > > > > > > Keep the RAM_SAVE_FLAG_HOOK definition since it might appears > > > > > in old migration streams. > > > > > > > > > > Cc: Peter Xu > > > > > Cc: Li Zhijian > > > > > Acked-by: Fabiano Rosas > > > > > Signed-off-by: Philippe Mathieu-Daudé > > > > > > > > Just to be clear, because people raised the point in the last version, > > > > the first link in the deprecation commit links to a thread comprising > > > > entirely of rdma migration patches. I don't see any ambiguity on whether > > > > the deprecation was intended to include migration. There's even an ack > > > > from Juan. > > > > > > Yes I remember that's the plan. > > > > > > > > > > > So on the basis of not reverting the previous maintainer's decision, my > > > > Ack stands here. > > > > > > > > We also had pretty obvious bugs ([1], [2]) in the past that would have > > > > been caught if we had any kind of testing for the feature, so I can't > > > > even say this thing works currently. > > > > > > > > @Peter Xu, @Li Zhijian, what are your thoughts on this? > > > > > > Generally I definitely agree with such a removal sooner or later, as > > > that's > > > how deprecation works, and even after Juan's left I'm not aware of any > > > other new RDMA users. Personally, I'd slightly prefer postponing it one > > > more release which might help a bit of our downstream maintenance, however > > > I assume that's not a blocker either, as I think we can also manage it. > > > > > > IMHO it's more important to know whether there are still users and whether > > > they would still like to see it around. That's also one thing I notice > > > that > > > e9a54265f533f didn't yet get acks from RDMA users that we are aware, even > > > if they're rare. According to [2] it could be that such user may only rely > > > on the release versions of QEMU when it broke things. > > > > > > So I'm copying Yu too (while Zhijian is already in the loop), just in case > > > someone would like to stand up and speak. > > > > > > I admit RDMA migration was lack of testing(unit/CI test), which led to the > > a few > > obvious bugs being noticed too late. > > However I was a bit surprised when I saw the removal of the RDMA migration. > > I wasn't > > aware that this feature has not been marked as deprecated(at least there is > > no > > prompt to end-user). > > > > > > > IMHO it's more important to know whether there are still users and whether > > > they would still like to see it around. > > > > Agree. > > I didn't immediately express my opinion in V1 because I'm also consulting > > our > > customers for this feature in the future. > > > > Personally, I agree with Perter's idea that "I'd slightly prefer postponing > > it one > > more release which might help a bit of our downstream maintenance" > > Do you mind posting a deprecation patch to clarify the situation? The key thing the first deprecation patch missed was that it failed to issue a warning message when RDMA migration was actually used. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH for-9.1] migration: Add Error** argument to add_bitmaps_to_list()
On 29.03.24 13:56, Cédric Le Goater wrote: This allows to report more precise errors in the migration handler dirty_bitmap_save_setup(). Suggested-by Vladimir Sementsov-Ogievskiy Signed-off-by: Cédric Le Goater Reviewed-by: Vladimir Sementsov-Ogievskiy -- Best regards, Vladimir
[PATCH v3 2/5] qdev-monitor: fix error message in find_device_state()
This "hotpluggable" here is misleading. Actually we check is object a device or not. Let's drop the word. Suggested-by: Markus Armbruster Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Markus Armbruster --- system/qdev-monitor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c index c1243891c3..840177d19f 100644 --- a/system/qdev-monitor.c +++ b/system/qdev-monitor.c @@ -891,7 +891,7 @@ static DeviceState *find_device_state(const char *id, Error **errp) dev = (DeviceState *)object_dynamic_cast(obj, TYPE_DEVICE); if (!dev) { -error_setg(errp, "%s is not a hotpluggable device", id); +error_setg(errp, "%s is not a device", id); return NULL; } -- 2.34.1
[PATCH v3 0/5] vhost-user-blk: live resize additional APIs
v3: 02: add r-b by Markus 03: improve commit message 04: improve documentation, merge race-fix here (which was v2:05), rebase on master (migration_is_running() now without arguments) 05: improve documentation Vladimir Sementsov-Ogievskiy (5): vhost-user-blk: simplify and fix vhost_user_blk_handle_config_change qdev-monitor: fix error message in find_device_state() qdev-monitor: add option to report GenericError from find_device_state qapi: introduce device-sync-config qapi: introduce CONFIG_READ event hw/block/vhost-user-blk.c | 32 +++--- hw/virtio/virtio-pci.c| 18 ++ include/hw/qdev-core.h| 3 ++ include/monitor/qdev.h| 2 ++ include/sysemu/runstate.h | 1 + monitor/monitor.c | 1 + qapi/qdev.json| 54 ++ stubs/qdev.c | 6 system/qdev-monitor.c | 70 --- system/runstate.c | 5 +++ 10 files changed, 175 insertions(+), 17 deletions(-) -- 2.34.1
[PATCH v3 3/5] qdev-monitor: add option to report GenericError from find_device_state
Here we just prepare for the following patch, making possible to report GenericError as recommended. This patch doesn't aim to prevent further use of DeviceNotFound by future interfaces: - find_device_state() is used in blk_by_qdev_id() and qmp_get_blk() functions, which may lead to spread of DeviceNotFound anyway - also, nothing prevent simply copy-pasting find_device_state() calls with false argument Signed-off-by: Vladimir Sementsov-Ogievskiy --- system/qdev-monitor.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c index 840177d19f..7e075d91c1 100644 --- a/system/qdev-monitor.c +++ b/system/qdev-monitor.c @@ -878,13 +878,20 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp) object_unref(OBJECT(dev)); } -static DeviceState *find_device_state(const char *id, Error **errp) +/* + * Note that creating new APIs using error classes other than GenericError is + * not recommended. Set use_generic_error=true for new interfaces. + */ +static DeviceState *find_device_state(const char *id, bool use_generic_error, + Error **errp) { Object *obj = object_resolve_path_at(qdev_get_peripheral(), id); DeviceState *dev; if (!obj) { -error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, +error_set(errp, + (use_generic_error ? + ERROR_CLASS_GENERIC_ERROR : ERROR_CLASS_DEVICE_NOT_FOUND), "Device '%s' not found", id); return NULL; } @@ -948,7 +955,7 @@ void qdev_unplug(DeviceState *dev, Error **errp) void qmp_device_del(const char *id, Error **errp) { -DeviceState *dev = find_device_state(id, errp); +DeviceState *dev = find_device_state(id, false, errp); if (dev != NULL) { if (dev->pending_deleted_event && (dev->pending_deleted_expires_ms == 0 || @@ -1068,7 +1075,7 @@ BlockBackend *blk_by_qdev_id(const char *id, Error **errp) GLOBAL_STATE_CODE(); -dev = find_device_state(id, errp); +dev = find_device_state(id, false, errp); if (dev == NULL) { return NULL; } -- 2.34.1
[PATCH v3 5/5] qapi: introduce CONFIG_READ event
Send a new event when guest reads virtio-pci config after virtio_notify_config() call. That's useful to check that guest fetched modified config, for example after resizing disk backend. Signed-off-by: Vladimir Sementsov-Ogievskiy --- hw/virtio/virtio-pci.c | 9 + include/monitor/qdev.h | 2 ++ monitor/monitor.c | 1 + qapi/qdev.json | 33 + stubs/qdev.c | 6 ++ system/qdev-monitor.c | 6 ++ 6 files changed, 57 insertions(+) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index 92afbae71c..c0c158dae2 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -23,6 +23,7 @@ #include "hw/boards.h" #include "hw/virtio/virtio.h" #include "migration/qemu-file-types.h" +#include "monitor/qdev.h" #include "hw/pci/pci.h" #include "hw/pci/pci_bus.h" #include "hw/qdev-properties.h" @@ -530,6 +531,10 @@ static uint64_t virtio_pci_config_read(void *opaque, hwaddr addr, } addr -= config; +if (vdev->generation > 0) { +qdev_virtio_config_read_event(DEVICE(proxy)); +} + switch (size) { case 1: val = virtio_config_readb(vdev, addr); @@ -1884,6 +1889,10 @@ static uint64_t virtio_pci_device_read(void *opaque, hwaddr addr, return UINT64_MAX; } +if (vdev->generation > 0) { +qdev_virtio_config_read_event(DEVICE(proxy)); +} + switch (size) { case 1: val = virtio_config_modern_readb(vdev, addr); diff --git a/include/monitor/qdev.h b/include/monitor/qdev.h index 1d57bf6577..fc9a834dca 100644 --- a/include/monitor/qdev.h +++ b/include/monitor/qdev.h @@ -36,4 +36,6 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts, */ const char *qdev_set_id(DeviceState *dev, char *id, Error **errp); +void qdev_virtio_config_read_event(DeviceState *dev); + #endif diff --git a/monitor/monitor.c b/monitor/monitor.c index 01ede1babd..5b06146503 100644 --- a/monitor/monitor.c +++ b/monitor/monitor.c @@ -316,6 +316,7 @@ static MonitorQAPIEventConf monitor_qapi_event_conf[QAPI_EVENT__MAX] = { [QAPI_EVENT_VSERPORT_CHANGE] = { 1000 * SCALE_MS }, [QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE] = { 1000 * SCALE_MS }, [QAPI_EVENT_HV_BALLOON_STATUS_REPORT] = { 1000 * SCALE_MS }, +[QAPI_EVENT_VIRTIO_CONFIG_READ] = { 300 * SCALE_MS }, }; /* diff --git a/qapi/qdev.json b/qapi/qdev.json index e8be79c3d5..29a4f47360 100644 --- a/qapi/qdev.json +++ b/qapi/qdev.json @@ -182,3 +182,36 @@ { 'command': 'device-sync-config', 'features': [ 'unstable' ], 'data': {'id': 'str'} } + +## +# @VIRTIO_CONFIG_READ: +# +# Emitted whenever guest reads virtio device configuration after +# configuration change. +# +# The event may be used in pair with device-sync-config. It shows +# that guest has re-read updated configuration. It doesn't +# guarantee that guest successfully handled it and updated the +# view of the device for the user, but still it's a kind of +# success indicator. +# +# @device: device name +# +# @path: device path +# +# Features: +# +# @unstable: The event is experimental. +# +# Since: 9.1 +# +# Example: +# +# <- { "event": "VIRTIO_CONFIG_READ", +# "data": { "device": "virtio-net-pci-0", +#"path": "/machine/peripheral/virtio-net-pci-0" }, +# "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } +## +{ 'event': 'VIRTIO_CONFIG_READ', + 'features': [ 'unstable' ], + 'data': { '*device': 'str', 'path': 'str' } } diff --git a/stubs/qdev.c b/stubs/qdev.c index 6869f6f90a..ab6c4afe0b 100644 --- a/stubs/qdev.c +++ b/stubs/qdev.c @@ -26,3 +26,9 @@ void qapi_event_send_device_unplug_guest_error(const char *device, { /* Nothing to do. */ } + +void qapi_event_send_virtio_config_read(const char *device, +const char *path) +{ +/* Nothing to do. */ +} diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c index cb35ea0b86..8a2ca77fde 100644 --- a/system/qdev-monitor.c +++ b/system/qdev-monitor.c @@ -26,6 +26,7 @@ #include "sysemu/runstate.h" #include "qapi/error.h" #include "qapi/qapi-commands-qdev.h" +#include "qapi/qapi-events-qdev.h" #include "qapi/qmp/dispatch.h" #include "qapi/qmp/qdict.h" #include "qapi/qmp/qerror.h" @@ -1206,3 +1207,8 @@ bool qmp_command_available(const QmpCommand *cmd, Error **errp) } return true; } + +void qdev_virtio_config_read_event(DeviceState *dev) +{ +qapi_event_send_virtio_config_read(dev->id, dev->canonical_path); +} -- 2.34.1
[PATCH v3 4/5] qapi: introduce device-sync-config
Add command to sync config from vhost-user backend to the device. It may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed or not triggered interrupt to the guest or just not available (not supported by vhost-user server). Command result is racy if allow it during migration. Let's allow the sync only in RUNNING state. Signed-off-by: Vladimir Sementsov-Ogievskiy --- hw/block/vhost-user-blk.c | 27 -- hw/virtio/virtio-pci.c| 9 include/hw/qdev-core.h| 3 +++ include/sysemu/runstate.h | 1 + qapi/qdev.json| 21 + system/qdev-monitor.c | 47 +++ system/runstate.c | 5 + 7 files changed, 106 insertions(+), 7 deletions(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index 9e6bbc6950..2f301f380c 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -88,27 +88,39 @@ static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t *config) s->blkcfg.wce = blkcfg->wce; } +static int vhost_user_blk_sync_config(DeviceState *dev, Error **errp) +{ +int ret; +VirtIODevice *vdev = VIRTIO_DEVICE(dev); +VHostUserBlk *s = VHOST_USER_BLK(vdev); + +ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg, + vdev->config_len, errp); +if (ret < 0) { +return ret; +} + +memcpy(vdev->config, &s->blkcfg, vdev->config_len); +virtio_notify_config(vdev); + +return 0; +} + static int vhost_user_blk_handle_config_change(struct vhost_dev *dev) { int ret; -VirtIODevice *vdev = dev->vdev; -VHostUserBlk *s = VHOST_USER_BLK(dev->vdev); Error *local_err = NULL; if (!dev->started) { return 0; } -ret = vhost_dev_get_config(dev, (uint8_t *)&s->blkcfg, - vdev->config_len, &local_err); +ret = vhost_user_blk_sync_config(DEVICE(dev->vdev), &local_err); if (ret < 0) { error_report_err(local_err); return ret; } -memcpy(dev->vdev->config, &s->blkcfg, vdev->config_len); -virtio_notify_config(dev->vdev); - return 0; } @@ -576,6 +588,7 @@ static void vhost_user_blk_class_init(ObjectClass *klass, void *data) device_class_set_props(dc, vhost_user_blk_properties); dc->vmsd = &vmstate_vhost_user_blk; +dc->sync_config = vhost_user_blk_sync_config; set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); vdc->realize = vhost_user_blk_device_realize; vdc->unrealize = vhost_user_blk_device_unrealize; diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index eaaf86402c..92afbae71c 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -2501,6 +2501,14 @@ static void virtio_pci_dc_realize(DeviceState *qdev, Error **errp) vpciklass->parent_dc_realize(qdev, errp); } +static int virtio_pci_sync_config(DeviceState *dev, Error **errp) +{ +VirtIOPCIProxy *proxy = VIRTIO_PCI(dev); +VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); + +return qdev_sync_config(DEVICE(vdev), errp); +} + static void virtio_pci_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -2517,6 +2525,7 @@ static void virtio_pci_class_init(ObjectClass *klass, void *data) device_class_set_parent_realize(dc, virtio_pci_dc_realize, &vpciklass->parent_dc_realize); rc->phases.hold = virtio_pci_bus_reset_hold; +dc->sync_config = virtio_pci_sync_config; } static const TypeInfo virtio_pci_info = { diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index 9228e96c87..87135bdcdf 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -95,6 +95,7 @@ typedef void (*DeviceUnrealize)(DeviceState *dev); typedef void (*DeviceReset)(DeviceState *dev); typedef void (*BusRealize)(BusState *bus, Error **errp); typedef void (*BusUnrealize)(BusState *bus); +typedef int (*DeviceSyncConfig)(DeviceState *dev, Error **errp); /** * struct DeviceClass - The base class for all devices. @@ -162,6 +163,7 @@ struct DeviceClass { DeviceReset reset; DeviceRealize realize; DeviceUnrealize unrealize; +DeviceSyncConfig sync_config; /** * @vmsd: device state serialisation description for @@ -546,6 +548,7 @@ bool qdev_hotplug_allowed(DeviceState *dev, Error **errp); */ HotplugHandler *qdev_get_hotplug_handler(DeviceState *dev); void qdev_unplug(DeviceState *dev, Error **errp); +int qdev_sync_config(DeviceState *dev, Error **errp); void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp); void qdev_machine_creation_done(void); diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h index 0117d243c4..296af52322 100644 --- a/include/sysemu/runstate.h +++ b/include/sysemu/runstate.h @@ -5,6 +5,7 @@ #include "qemu/notify.h" bool runstate_c
[PATCH v3 1/5] vhost-user-blk: simplify and fix vhost_user_blk_handle_config_change
Let's not care about what was changed and update the whole config, reasons: 1. config->geometry should be updated together with capacity, so we fix a bug. 2. Vhost-user protocol doesn't say anything about config change limitation. Silent ignore of changes doesn't seem to be correct. 3. vhost-user-vsock reads the whole config 4. on realize we don't do any checks on retrieved config, so no reason to care here Comment "valid for resize only" exists since introduction the whole hw/block/vhost-user-blk.c in commit 00343e4b54ba0685e9ebe928ec5713b0cf7f1d1c "vhost-user-blk: introduce a new vhost-user-blk host device", seems it was just an extra limitation. Also, let's notify guest unconditionally: 1. So does vhost-user-vsock 2. We are going to reuse the functionality in new cases when we do want to notify the guest unconditionally. So, no reason to create extra branches in the logic. Signed-off-by: Vladimir Sementsov-Ogievskiy Acked-by: Raphael Norwitz --- hw/block/vhost-user-blk.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index 6a856ad51a..9e6bbc6950 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -91,7 +91,6 @@ static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t *config) static int vhost_user_blk_handle_config_change(struct vhost_dev *dev) { int ret; -struct virtio_blk_config blkcfg; VirtIODevice *vdev = dev->vdev; VHostUserBlk *s = VHOST_USER_BLK(dev->vdev); Error *local_err = NULL; @@ -100,19 +99,15 @@ static int vhost_user_blk_handle_config_change(struct vhost_dev *dev) return 0; } -ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg, +ret = vhost_dev_get_config(dev, (uint8_t *)&s->blkcfg, vdev->config_len, &local_err); if (ret < 0) { error_report_err(local_err); return ret; } -/* valid for resize only */ -if (blkcfg.capacity != s->blkcfg.capacity) { -s->blkcfg.capacity = blkcfg.capacity; -memcpy(dev->vdev->config, &s->blkcfg, vdev->config_len); -virtio_notify_config(dev->vdev); -} +memcpy(dev->vdev->config, &s->blkcfg, vdev->config_len); +virtio_notify_config(dev->vdev); return 0; } -- 2.34.1
Re: [PATCH v1] coroutine: avoid inserting duplicate coroutine to co_queue_wakeup
On Thu, 28 Mar 2024 07:40:14 -0500, Eric Blake wrote: > On Mon, Mar 25, 2024 at 05:18:50PM +0800, zhuyangyang via wrote: > > If g_main_loop_run()/aio_poll() is called in the coroutine context, > > the pending coroutine may be woken up repeatedly, and the co_queue_wakeup > > may be disordered. > > > > When the poll() syscall exited in g_main_loop_run()/aio_poll(), it means > > some listened events is completed. Therefore, the completion callback > > function is dispatched. > > > > If this callback function needs to invoke aio_co_enter(), it will only > > wake up the coroutine (because we are already in coroutine context), > > which may cause that the data on this listening event_fd/socket_fd > > is not read/cleared. When the next poll () exits, it will be woken up again > > and inserted into the wakeup queue again. > > > > For example, if TLS is enabled in NBD, the server will call > > g_main_loop_run() > > in the coroutine, and repeatedly wake up the io_read event on a socket. > > The call stack is as follows: > > > > aio_co_enter() > > aio_co_wake() > > qio_channel_restart_read() > > aio_dispatch_handler() > > aio_dispatch_handlers() > > aio_dispatch() > > aio_ctx_dispatch() > > g_main_context_dispatch() > > g_main_loop_run() > > nbd_negotiate_handle_starttls() > > nbd_negotiate_options() > > nbd_negotiate() > > nbd_co_client_start() > > coroutine_trampoline() > > zhuyangyang, do you have a reliable reproduction setup for how you > were able to trigger this? Obviously, it only happens when TLS is > enabled (we aren't creating a g_main_loop_run for any other NBD > command), and only when the server is first starting to serve a > client; is this a case where you were hammering a long-running qemu > process running an NBD server with multiple clients trying to > reconnect to the server all near the same time? I'm sorry I didn't make the background of the problem clear before, this problem is not on the NBD command, but on the VM live migration with qemu TLS. Next, I'll detail how to reproduce the issue. 1. Make the problem more obvious. When TLS is enabled during live migration, the migration progress may be suspended because some I/O are not returned during the mirror job on target host. Now we know that the reason is that some coroutines are lost. The entry function of these lost coroutines are nbd_trip(). Add an assertion on the target host side to make the problem show up quickly. $ git diff util/async.c diff --git a/util/async.c b/util/async.c index 0467890052..4e3547c3ea 100644 --- a/util/async.c +++ b/util/async.c @@ -705,6 +705,7 @@ void aio_co_enter(AioContext *ctx, Coroutine *co) if (qemu_in_coroutine()) { Coroutine *self = qemu_coroutine_self(); assert(self != co); +assert(!co->co_queue_next.sqe_next); QSIMPLEQ_INSERT_TAIL(&self->co_queue_wakeup, co, co_queue_next); } else { qemu_aio_coroutine_enter(ctx, co); 2. Reproduce the issue 1) start vm on the origin host Note: Configuring multiple disks for a VM (It is recommended to configure more than 6 disks) These disk tasks(nbd_trip) will be performed simultaneously with nbd_negotiate_handle_starttls() on the main thread during migrate. centos7.3_64_server 4194304 4194304 4 /machine hvm destroy restart restart /usr/bin/qemu-kvm $ virsh create vm_x86.xml Domain 'centos7.3_64_server' created from /home/vm_x86.xml 2) migrate the vm to target host virsh migrate --live --p2p \ --migrateuri tcp:10.91.xxx.xxx centos7.3_64_server qemu+tcp://10.91.xxx.xxx/system \ --copy-storage-all \ --tls Than, An error is reported on the peer host qemu-kvm: ../util/async.c:705: aio_co_enter: Assertion `!co->co_queue_next.sqe_next' failed. > > If we can come up with a reliable formula for reproducing the > corrupted coroutine list, it would make a great iotest addition > alongside the existing qemu-iotests 233 for ensuring that NBD TLS > traffic is handled correctly in both server and client. I'm not sure if this can be used for testing of qemu-nbd > > > > > Signed-off-by: zhuyangyang > > Side note: this appears to be your first qemu contribution (based on > 'git shortlog --author zhuyangyang'). While I am not in a position to > presume how you would like your name Anglicized, I will point out that > the prevailing style is to separate given name from family name (just > because your username at work has no spaces does not mean that your > S-o-b has to follow suit). It is also permissible to list your name > in native characters alongside or in place of the Ang
Re: [PATCH for-9.1] migration: Add Error** argument to add_bitmaps_to_list()
On 29/3/24 11:56, Cédric Le Goater wrote: This allows to report more precise errors in the migration handler dirty_bitmap_save_setup(). Suggested-by Vladimir Sementsov-Ogievskiy Signed-off-by: Cédric Le Goater --- To apply on top of : https://lore.kernel.org/qemu-devel/20240320064911.545001-1-...@redhat.com/ migration/block-dirty-bitmap.c | 34 -- 1 file changed, 16 insertions(+), 18 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler
On 29.03.24 13:53, Cédric Le Goater wrote: Hello Vladimir, On 3/29/24 10:32, Vladimir Sementsov-Ogievskiy wrote: On 20.03.24 09:49, Cédric Le Goater wrote: diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c index 2708abf3d762de774ed294d3fdb8e56690d2974c..542a8c297b329abc30d1b3a205d29340fa59a961 100644 --- a/migration/block-dirty-bitmap.c +++ b/migration/block-dirty-bitmap.c @@ -1213,12 +1213,14 @@ fail: return ret; } -static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque) +static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp) { DBMSaveState *s = &((DBMState *)opaque)->save; SaveBitmapState *dbms = NULL; if (init_dirty_bitmap_migration(s) < 0) { + error_setg(errp, + "Failed to initialize dirty tracking bitmap for blocks"); No, that's not about initializing a bitmap. This all is about migration of block-dirty-bitmaps themselves. So correct would be say "Failed to initialize migration of block dirty bitmaps". with this, for block dirty bitmap migration: Acked-by: Vladimir Sementsov-Ogievskiy I had kept your previous R-b. Oh, I missed that) Should we remove it ? or is it ok if I address your comments below in a followup patch, in which case the error message above would be removed. Yes of course, you can keep my old r-b. Followup patch is appreciated Still, a lot better is add errp to init_dirty_bitmap_migration() and to add_bitmaps_to_list() too: look, init_dirty_bitmap_migration() fails only if add_bitmaps_to_list() fails in turn, add_bitmaps_to_list() have several clear failure points, where it always does error_report (or error_report_err), which would be better to pass-through to the user. Good idea. Will do. Thanks, C. -- Best regards, Vladimir
[PATCH for-9.1] migration: Add Error** argument to add_bitmaps_to_list()
This allows to report more precise errors in the migration handler dirty_bitmap_save_setup(). Suggested-by Vladimir Sementsov-Ogievskiy Signed-off-by: Cédric Le Goater --- To apply on top of : https://lore.kernel.org/qemu-devel/20240320064911.545001-1-...@redhat.com/ migration/block-dirty-bitmap.c | 34 -- 1 file changed, 16 insertions(+), 18 deletions(-) diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c index 542a8c297b329abc30d1b3a205d29340fa59a961..a7d55048c23505fde565ca784cec3c917dca37e5 100644 --- a/migration/block-dirty-bitmap.c +++ b/migration/block-dirty-bitmap.c @@ -481,13 +481,13 @@ static void dirty_bitmap_do_save_cleanup(DBMSaveState *s) /* Called with the BQL taken. */ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, - const char *bs_name, GHashTable *alias_map) + const char *bs_name, GHashTable *alias_map, + Error **errp) { BdrvDirtyBitmap *bitmap; SaveBitmapState *dbms; GHashTable *bitmap_aliases; const char *node_alias, *bitmap_name, *bitmap_alias; -Error *local_err = NULL; /* When an alias map is given, @bs_name must be @bs's node name */ assert(!alias_map || !strcmp(bs_name, bdrv_get_node_name(bs))); @@ -504,8 +504,8 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, bitmap_name = bdrv_dirty_bitmap_name(bitmap); if (!bs_name || strcmp(bs_name, "") == 0) { -error_report("Bitmap '%s' in unnamed node can't be migrated", - bitmap_name); +error_setg(errp, "Bitmap '%s' in unnamed node can't be migrated", + bitmap_name); return -1; } @@ -525,9 +525,9 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, } if (node_alias[0] == '#') { -error_report("Bitmap '%s' in a node with auto-generated " - "name '%s' can't be migrated", - bitmap_name, node_alias); +error_setg(errp, "Bitmap '%s' in a node with auto-generated " + "name '%s' can't be migrated", + bitmap_name, node_alias); return -1; } @@ -538,8 +538,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, continue; } -if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_DEFAULT, &local_err)) { -error_report_err(local_err); +if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_DEFAULT, errp)) { return -1; } @@ -558,9 +557,9 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, } } else { if (strlen(bitmap_name) > UINT8_MAX) { -error_report("Cannot migrate bitmap '%s' on node '%s': " - "Name is longer than %u bytes", - bitmap_name, bs_name, UINT8_MAX); +error_setg(errp, "Cannot migrate bitmap '%s' on node '%s': " + "Name is longer than %u bytes", + bitmap_name, bs_name, UINT8_MAX); return -1; } bitmap_alias = bitmap_name; @@ -599,7 +598,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, } /* Called with the BQL taken. */ -static int init_dirty_bitmap_migration(DBMSaveState *s) +static int init_dirty_bitmap_migration(DBMSaveState *s, Error **errp) { BlockDriverState *bs; SaveBitmapState *dbms; @@ -643,7 +642,7 @@ static int init_dirty_bitmap_migration(DBMSaveState *s) } if (bs && bs->drv && !bs->drv->is_filter) { -if (add_bitmaps_to_list(s, bs, name, NULL)) { +if (add_bitmaps_to_list(s, bs, name, NULL, errp)) { goto fail; } g_hash_table_add(handled_by_blk, bs); @@ -656,7 +655,8 @@ static int init_dirty_bitmap_migration(DBMSaveState *s) continue; } -if (add_bitmaps_to_list(s, bs, bdrv_get_node_name(bs), alias_map)) { +if (add_bitmaps_to_list(s, bs, bdrv_get_node_name(bs), alias_map, +errp)) { goto fail; } } @@ -1218,9 +1218,7 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp) DBMSaveState *s = &((DBMState *)opaque)->save; SaveBitmapState *dbms = NULL; -if (init_dirty_bitmap_migration(s) < 0) { -error_setg(errp, - "Failed to initialize dirty tracking bitmap for blocks"); +if (init_dirty_bitmap_migration(s, errp) < 0) { return -1; } -- 2.44.0
Re: [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler
Hello Vladimir, On 3/29/24 10:32, Vladimir Sementsov-Ogievskiy wrote: On 20.03.24 09:49, Cédric Le Goater wrote: diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c index 2708abf3d762de774ed294d3fdb8e56690d2974c..542a8c297b329abc30d1b3a205d29340fa59a961 100644 --- a/migration/block-dirty-bitmap.c +++ b/migration/block-dirty-bitmap.c @@ -1213,12 +1213,14 @@ fail: return ret; } -static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque) +static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp) { DBMSaveState *s = &((DBMState *)opaque)->save; SaveBitmapState *dbms = NULL; if (init_dirty_bitmap_migration(s) < 0) { + error_setg(errp, + "Failed to initialize dirty tracking bitmap for blocks"); No, that's not about initializing a bitmap. This all is about migration of block-dirty-bitmaps themselves. So correct would be say "Failed to initialize migration of block dirty bitmaps". with this, for block dirty bitmap migration: Acked-by: Vladimir Sementsov-Ogievskiy I had kept your previous R-b. Should we remove it ? or is it ok if I address your comments below in a followup patch, in which case the error message above would be removed. Still, a lot better is add errp to init_dirty_bitmap_migration() and to add_bitmaps_to_list() too: look, init_dirty_bitmap_migration() fails only if add_bitmaps_to_list() fails in turn, add_bitmaps_to_list() have several clear failure points, where it always does error_report (or error_report_err), which would be better to pass-through to the user. Good idea. Will do. Thanks, C.
Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
Hi Zhijian, On 29/3/24 02:53, Zhijian Li (Fujitsu) wrote: On 28/03/2024 23:01, Peter Xu wrote: On Thu, Mar 28, 2024 at 11:18:04AM -0300, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: The whole RDMA subsystem was deprecated in commit e9a54265f5 ("hw/rdma: Deprecate the pvrdma device and the rdma subsystem") released in v8.2. Remove: - RDMA handling from migration - dependencies on libibumad, libibverbs and librdmacm Keep the RAM_SAVE_FLAG_HOOK definition since it might appears in old migration streams. Cc: Peter Xu Cc: Li Zhijian Acked-by: Fabiano Rosas Signed-off-by: Philippe Mathieu-Daudé Just to be clear, because people raised the point in the last version, the first link in the deprecation commit links to a thread comprising entirely of rdma migration patches. I don't see any ambiguity on whether the deprecation was intended to include migration. There's even an ack from Juan. Yes I remember that's the plan. So on the basis of not reverting the previous maintainer's decision, my Ack stands here. We also had pretty obvious bugs ([1], [2]) in the past that would have been caught if we had any kind of testing for the feature, so I can't even say this thing works currently. @Peter Xu, @Li Zhijian, what are your thoughts on this? Generally I definitely agree with such a removal sooner or later, as that's how deprecation works, and even after Juan's left I'm not aware of any other new RDMA users. Personally, I'd slightly prefer postponing it one more release which might help a bit of our downstream maintenance, however I assume that's not a blocker either, as I think we can also manage it. IMHO it's more important to know whether there are still users and whether they would still like to see it around. That's also one thing I notice that e9a54265f533f didn't yet get acks from RDMA users that we are aware, even if they're rare. According to [2] it could be that such user may only rely on the release versions of QEMU when it broke things. So I'm copying Yu too (while Zhijian is already in the loop), just in case someone would like to stand up and speak. I admit RDMA migration was lack of testing(unit/CI test), which led to the a few obvious bugs being noticed too late. However I was a bit surprised when I saw the removal of the RDMA migration. I wasn't aware that this feature has not been marked as deprecated(at least there is no prompt to end-user). IMHO it's more important to know whether there are still users and whether they would still like to see it around. Agree. I didn't immediately express my opinion in V1 because I'm also consulting our customers for this feature in the future. Personally, I agree with Perter's idea that "I'd slightly prefer postponing it one more release which might help a bit of our downstream maintenance" Do you mind posting a deprecation patch to clarify the situation? Thanks, Phil. Thanks Zhijian Thanks, 1- https://lore.kernel.org/r/20230920090412.726725-1-lizhij...@fujitsu.com 2- https://lore.kernel.org/r/cahecvy7hxswn4ow_kog+q+tn6f_kmeichevz1qgm-fbxbpp...@mail.gmail.com
Re: [PATCH 2/2] backup: add minimum cluster size to performance options
On 08.03.24 18:51, Fiona Ebner wrote: Useful to make discard-source work in the context of backup fleecing when the fleecing image has a larger granularity than the backup target. Backup/block-copy will use at least this granularity for copy operations and in particular, discard requests to the backup source will too. If the granularity is too small, they will just be aligned down in cbw_co_pdiscard_snapshot() and thus effectively ignored. Suggested-by: Vladimir Sementsov-Ogievskiy Signed-off-by: Fiona Ebner --- block/backup.c| 2 +- block/copy-before-write.c | 2 ++ block/copy-before-write.h | 1 + blockdev.c| 3 +++ qapi/block-core.json | 9 +++-- 5 files changed, 14 insertions(+), 3 deletions(-) diff --git a/block/backup.c b/block/backup.c index 3dd2e229d2..a1292c01ec 100644 --- a/block/backup.c +++ b/block/backup.c @@ -458,7 +458,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, } cbw = bdrv_cbw_append(bs, target, filter_node_name, discard_source, - &bcs, errp); + perf->min_cluster_size, &bcs, errp); if (!cbw) { goto error; } diff --git a/block/copy-before-write.c b/block/copy-before-write.c index f9896c6c1e..55a9272485 100644 --- a/block/copy-before-write.c +++ b/block/copy-before-write.c @@ -545,6 +545,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source, BlockDriverState *target, const char *filter_node_name, bool discard_source, + int64_t min_cluster_size, same note, suggest uint32_t BlockCopyState **bcs, Error **errp) { @@ -563,6 +564,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source, } qdict_put_str(opts, "file", bdrv_get_node_name(source)); qdict_put_str(opts, "target", bdrv_get_node_name(target)); +qdict_put_int(opts, "min-cluster-size", min_cluster_size); top = bdrv_insert_node(source, opts, flags, errp); if (!top) { diff --git a/block/copy-before-write.h b/block/copy-before-write.h index 01af0cd3c4..dc6cafe7fa 100644 --- a/block/copy-before-write.h +++ b/block/copy-before-write.h @@ -40,6 +40,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source, BlockDriverState *target, const char *filter_node_name, bool discard_source, + int64_t min_cluster_size, BlockCopyState **bcs, Error **errp); void bdrv_cbw_drop(BlockDriverState *bs); diff --git a/blockdev.c b/blockdev.c index daceb50460..8e6bdbc94a 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2653,6 +2653,9 @@ static BlockJob *do_backup_common(BackupCommon *backup, if (backup->x_perf->has_max_chunk) { perf.max_chunk = backup->x_perf->max_chunk; } +if (backup->x_perf->has_min_cluster_size) { +perf.min_cluster_size = backup->x_perf->min_cluster_size; +} } if ((backup->sync == MIRROR_SYNC_MODE_BITMAP) || diff --git a/qapi/block-core.json b/qapi/block-core.json index 85c8f88f6e..ba0836892f 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1551,11 +1551,16 @@ # it should not be less than job cluster size which is calculated # as maximum of target image cluster size and 64k. Default 0. # +# @min-cluster-size: Minimum size of blocks used by copy-before-write +# and background copy operations. Has to be a power of 2. No +# effect if smaller than the maximum of the target's cluster size +# and 64 KiB. Default 0. (Since 9.0) 9.1 +# # Since: 6.0 ## { 'struct': 'BackupPerf', - 'data': { '*use-copy-range': 'bool', -'*max-workers': 'int', '*max-chunk': 'int64' } } + 'data': { '*use-copy-range': 'bool', '*max-workers': 'int', +'*max-chunk': 'int64', '*min-cluster-size': 'uint32' } } ## # @BackupCommon: -- Best regards, Vladimir
Re: [PATCH 1/2] copy-before-write: allow specifying minimum cluster size
On 08.03.24 18:51, Fiona Ebner wrote: Useful to make discard-source work in the context of backup fleecing when the fleecing image has a larger granularity than the backup target. Copy-before-write operations will use at least this granularity and in particular, discard requests to the source node will too. If the granularity is too small, they will just be aligned down in cbw_co_pdiscard_snapshot() and thus effectively ignored. The QAPI uses uint32 so the value will be non-negative, but still fit into a uint64_t. Suggested-by: Vladimir Sementsov-Ogievskiy Signed-off-by: Fiona Ebner --- block/block-copy.c | 17 + block/copy-before-write.c | 3 ++- include/block/block-copy.h | 1 + qapi/block-core.json | 8 +++- 4 files changed, 23 insertions(+), 6 deletions(-) diff --git a/block/block-copy.c b/block/block-copy.c index 7e3b378528..adb1cbb440 100644 --- a/block/block-copy.c +++ b/block/block-copy.c @@ -310,6 +310,7 @@ void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range, } static int64_t block_copy_calculate_cluster_size(BlockDriverState *target, + int64_t min_cluster_size, Maybe better use uint32_t here as well. Error **errp) { int ret; @@ -335,7 +336,7 @@ static int64_t block_copy_calculate_cluster_size(BlockDriverState *target, "used. If the actual block size of the target exceeds " "this default, the backup may be unusable", BLOCK_COPY_CLUSTER_SIZE_DEFAULT); -return BLOCK_COPY_CLUSTER_SIZE_DEFAULT; +return MAX(min_cluster_size, BLOCK_COPY_CLUSTER_SIZE_DEFAULT); } else if (ret < 0 && !target_does_cow) { error_setg_errno(errp, -ret, "Couldn't determine the cluster size of the target image, " @@ -345,16 +346,18 @@ static int64_t block_copy_calculate_cluster_size(BlockDriverState *target, return ret; } else if (ret < 0 && target_does_cow) { /* Not fatal; just trudge on ahead. */ -return BLOCK_COPY_CLUSTER_SIZE_DEFAULT; +return MAX(min_cluster_size, BLOCK_COPY_CLUSTER_SIZE_DEFAULT); } -return MAX(BLOCK_COPY_CLUSTER_SIZE_DEFAULT, bdi.cluster_size); +return MAX(min_cluster_size, + MAX(BLOCK_COPY_CLUSTER_SIZE_DEFAULT, bdi.cluster_size)); } BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target, BlockDriverState *copy_bitmap_bs, const BdrvDirtyBitmap *bitmap, bool discard_source, + int64_t min_cluster_size, and here why not uint32_t Error **errp) { ERRP_GUARD(); @@ -365,7 +368,13 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target, GLOBAL_STATE_CODE(); -cluster_size = block_copy_calculate_cluster_size(target->bs, errp); +if (min_cluster_size && !is_power_of_2(min_cluster_size)) { +error_setg(errp, "min-cluster-size needs to be a power of 2"); +return NULL; +} + +cluster_size = block_copy_calculate_cluster_size(target->bs, + min_cluster_size, errp); if (cluster_size < 0) { return NULL; } diff --git a/block/copy-before-write.c b/block/copy-before-write.c index dac57481c5..f9896c6c1e 100644 --- a/block/copy-before-write.c +++ b/block/copy-before-write.c @@ -476,7 +476,8 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags, s->discard_source = flags & BDRV_O_CBW_DISCARD_SOURCE; s->bcs = block_copy_state_new(bs->file, s->target, bs, bitmap, - flags & BDRV_O_CBW_DISCARD_SOURCE, errp); + flags & BDRV_O_CBW_DISCARD_SOURCE, + opts->min_cluster_size, errp); I assume it is guaranteed to be 0 when not specified by user. if (!s->bcs) { error_prepend(errp, "Cannot create block-copy-state: "); return -EINVAL; diff --git a/include/block/block-copy.h b/include/block/block-copy.h index bdc703bacd..77857c6c68 100644 --- a/include/block/block-copy.h +++ b/include/block/block-copy.h @@ -28,6 +28,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target, BlockDriverState *copy_bitmap_bs, const BdrvDirtyBitmap *bitmap, bool discard_source, + int64_t min_cluster_size, Error **errp); /* Function should be called prior any actual copy request */ diff --git a/qapi/block-core.json b/qapi/block-core.json index 0a72c590a8..85c8f
Re: [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler
On 20.03.24 09:49, Cédric Le Goater wrote: diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c index 2708abf3d762de774ed294d3fdb8e56690d2974c..542a8c297b329abc30d1b3a205d29340fa59a961 100644 --- a/migration/block-dirty-bitmap.c +++ b/migration/block-dirty-bitmap.c @@ -1213,12 +1213,14 @@ fail: return ret; } -static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque) +static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp) { DBMSaveState *s = &((DBMState *)opaque)->save; SaveBitmapState *dbms = NULL; if (init_dirty_bitmap_migration(s) < 0) { +error_setg(errp, + "Failed to initialize dirty tracking bitmap for blocks"); No, that's not about initializing a bitmap. This all is about migration of block-dirty-bitmaps themselves. So correct would be say "Failed to initialize migration of block dirty bitmaps". with this, for block dirty bitmap migration: Acked-by: Vladimir Sementsov-Ogievskiy Still, a lot better is add errp to init_dirty_bitmap_migration() and to add_bitmaps_to_list() too: look, init_dirty_bitmap_migration() fails only if add_bitmaps_to_list() fails in turn, add_bitmaps_to_list() have several clear failure points, where it always does error_report (or error_report_err), which would be better to pass-through to the user. return -1; } diff --git a/migrat -- Best regards, Vladimir
Re: [PATCH v4] blockcommit: Reopen base image as RO after abort
On 28.03.24 12:16, Alexander Ivanov wrote: If a blockcommit is aborted the base image remains in RW mode, that leads to a fail of subsequent live migration. How to reproduce: $ virsh snapshot-create-as vm snp1 --disk-only *** write something to the disk inside the guest *** $ virsh blockcommit vm vda --active --shallow && virsh blockjob vm vda --abort $ lsof /vzt/vm.qcow2 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME qemu-syst 433203 root 45u REG 253,0 1724776448 133 /vzt/vm.qcow2 $ cat /proc/433203/fdinfo/45 pos:0 flags: 02140002 < The last 2 means RW mode If the base image is in RW mode at the end of blockcommit and was in RO mode before blockcommit, reopen the base BDS in RO. Signed-off-by: Alexander Ivanov --- block/mirror.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/block/mirror.c b/block/mirror.c index 1bdce3b657..d23be57255 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -93,6 +93,7 @@ typedef struct MirrorBlockJob { int64_t active_write_bytes_in_flight; bool prepared; bool in_drain; +bool base_ro; } MirrorBlockJob; typedef struct MirrorBDSOpaque { @@ -797,6 +798,10 @@ static int mirror_exit_common(Job *job) bdrv_drained_end(target_bs); bdrv_unref(target_bs); +if (abort && s->base_ro && !bdrv_is_read_only(target_bs)) { +bdrv_reopen_set_read_only(target_bs, true, NULL); +} + All looks good to me except this: seems it is safer to place this "if" block before "bdrv_drained_end(); bdrv_unref();" above. With it moved: Reviewed-by: Vladimir Sementsov-Ogievskiy bs_opaque->job = NULL; bdrv_drained_end(src); @@ -1717,6 +1722,7 @@ static BlockJob *mirror_start_job( bool is_none_mode, BlockDriverState *base, bool auto_complete, const char *filter_node_name, bool is_mirror, MirrorCopyMode copy_mode, + bool base_ro, Error **errp) { MirrorBlockJob *s; @@ -1800,6 +1806,7 @@ static BlockJob *mirror_start_job( bdrv_unref(mirror_top_bs); s->mirror_top_bs = mirror_top_bs; +s->base_ro = base_ro; /* No resize for the target either; while the mirror is still running, a * consistent read isn't necessarily possible. We could possibly allow @@ -2029,7 +2036,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs, speed, granularity, buf_size, backing_mode, zero_target, on_source_error, on_target_error, unmap, NULL, NULL, &mirror_job_driver, is_none_mode, base, false, - filter_node_name, true, copy_mode, errp); + filter_node_name, true, copy_mode, false, errp); } BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs, @@ -2058,7 +2065,7 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs, on_error, on_error, true, cb, opaque, &commit_active_job_driver, false, base, auto_complete, filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND, - errp); + base_read_only, errp); if (!job) { goto error_restore_flags; } -- Best regards, Vladimir
Re: [PATCH-for-9.1 v2 0/3] rdma: Remove RDMA subsystem and pvrdma device
On Thu, Mar 28, 2024 at 02:02:52PM +0100, Philippe Mathieu-Daudé wrote: > Since v1: > - split in 3 (Thomas) > - justify gluster removal Reviewed-by: Michael S. Tsirkin > Philippe Mathieu-Daudé (3): > hw/rdma: Remove pvrdma device and rdmacm-mux helper > migration: Remove RDMA protocol handling > block/gluster: Remove RDMA protocol handling > > MAINTAINERS | 17 - > docs/about/deprecated.rst |9 - > docs/about/removed-features.rst |4 + > docs/devel/migration/main.rst |6 - > docs/pvrdma.txt | 345 -- > docs/rdma.txt | 420 -- > docs/system/device-url-syntax.rst.inc |4 +- > docs/system/loongarch/virt.rst|2 +- > docs/system/qemu-block-drivers.rst.inc|1 - > meson.build | 59 - > qapi/machine.json | 17 - > qapi/migration.json | 31 +- > qapi/qapi-schema.json |1 - > qapi/rdma.json| 38 - > contrib/rdmacm-mux/rdmacm-mux.h | 61 - > hw/rdma/rdma_backend.h| 129 - > hw/rdma/rdma_backend_defs.h | 76 - > hw/rdma/rdma_rm.h | 97 - > hw/rdma/rdma_rm_defs.h| 146 - > hw/rdma/rdma_utils.h | 63 - > hw/rdma/trace.h |1 - > hw/rdma/vmw/pvrdma.h | 144 - > hw/rdma/vmw/pvrdma_dev_ring.h | 46 - > hw/rdma/vmw/pvrdma_qp_ops.h | 28 - > hw/rdma/vmw/trace.h |1 - > include/hw/rdma/rdma.h| 37 - > include/monitor/hmp.h |1 - > .../infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h | 685 --- > .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.h | 348 -- > .../standard-headers/rdma/vmw_pvrdma-abi.h| 310 -- > migration/migration-stats.h |6 +- > migration/migration.h |9 - > migration/options.h |2 - > migration/rdma.h | 69 - > block/gluster.c | 39 - > contrib/rdmacm-mux/main.c | 831 > hw/core/machine-qmp-cmds.c| 32 - > hw/rdma/rdma.c| 30 - > hw/rdma/rdma_backend.c| 1401 -- > hw/rdma/rdma_rm.c | 812 > hw/rdma/rdma_utils.c | 126 - > hw/rdma/vmw/pvrdma_cmd.c | 815 > hw/rdma/vmw/pvrdma_dev_ring.c | 141 - > hw/rdma/vmw/pvrdma_main.c | 735 --- > hw/rdma/vmw/pvrdma_qp_ops.c | 298 -- > migration/migration-stats.c |5 +- > migration/migration.c | 31 - > migration/options.c | 16 - > migration/qemu-file.c |1 - > migration/ram.c | 86 +- > migration/rdma.c | 4184 - > migration/savevm.c|2 +- > monitor/qmp-cmds.c|1 - > Kconfig.host |3 - > contrib/rdmacm-mux/meson.build|7 - > hmp-commands-info.hx | 13 - > hw/Kconfig|1 - > hw/meson.build|1 - > hw/rdma/Kconfig |3 - > hw/rdma/meson.build | 12 - > hw/rdma/trace-events | 31 - > hw/rdma/vmw/trace-events | 17 - > meson_options.txt |4 - > migration/meson.build |1 - > migration/trace-events| 68 +- > qapi/meson.build |1 - > qemu-options.hx |6 - > .../org.centos/stream/8/build-environment.yml |1 - > .../ci/org.centos/stream/8/x86_64/configure |3 - > scripts/ci/setup/build-environment.yml|4 - > scripts/coverity-scan/run-coverity-scan |2 +- > scripts/meson-buildoptions.sh |6 - > scripts/update-linux-headers.sh | 27 - > tests/lcitool/projects/qemu.yml |3 - > tests/migration/guestperf/engine.py |4 +- > 75 files changed, 20 insertions(+), 12997 deletions(-) > delete mode 100644 docs/pvrdma.txt > delete mode 100644 docs/rdma.txt > delete mode 100644 qapi/rdma.json > delete mode 100644 contrib/rdmacm-mux/rdmacm-mux.h > delete mode 100644
Re: [PATCH 06/19] block/stream: fix -Werror=maybe-uninitialized false-positives
On 28.03.24 13:20, marcandre.lur...@redhat.com wrote: From: Marc-André Lureau ../block/stream.c:193:19: error: ‘unfiltered_bs’ may be used uninitialized [-Werror=maybe-uninitialized] ../block/stream.c:176:5: error: ‘len’ may be used uninitialized [-Werror=maybe-uninitialized] trace/trace-block.h:906:9: error: ‘ret’ may be used uninitialized [-Werror=maybe-uninitialized] Signed-off-by: Marc-André Lureau Again, same false-positives, because of WITH_GRAPH_RDLOCK_GUARD().. Didn't you try to change WITH_ macros somehow, so that compiler believe in our good intentions? Actually, "unused variable initialization" is bad thing too. Anyway, if no better solution for now: Acked-by: Vladimir Sementsov-Ogievskiy --- block/stream.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/stream.c b/block/stream.c index 7031eef12b..9076203193 100644 --- a/block/stream.c +++ b/block/stream.c @@ -155,8 +155,8 @@ static void stream_clean(Job *job) static int coroutine_fn stream_run(Job *job, Error **errp) { StreamBlockJob *s = container_of(job, StreamBlockJob, common.job); -BlockDriverState *unfiltered_bs; -int64_t len; +BlockDriverState *unfiltered_bs = NULL; +int64_t len = -1; int64_t offset = 0; int error = 0; int64_t n = 0; /* bytes */ @@ -177,7 +177,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp) for ( ; offset < len; offset += n) { bool copy; -int ret; +int ret = -1; /* Note that even when no rate limit is applied we need to yield * with no pending I/O here so that bdrv_drain_all() returns. -- Best regards, Vladimir
Re: [PATCH 05/19] block/mirror: fix -Werror=maybe-uninitialized false-positive
On 28.03.24 13:20, marcandre.lur...@redhat.com wrote: From: Marc-André Lureau ../block/mirror.c:1066:22: error: ‘iostatus’ may be used uninitialized [-Werror=maybe-uninitialized] Actually that's a false-positive.. Compiler can't believe that body of WITH_JOB_LOCK_GUARD() will be executed unconditionally. Probably we should mention this in a comment. Signed-off-by: Marc-André Lureau Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/mirror.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/mirror.c b/block/mirror.c index 1bdce3b657..53dd7332ee 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -926,7 +926,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp) MirrorBDSOpaque *mirror_top_opaque = s->mirror_top_bs->opaque; BlockDriverState *target_bs = blk_bs(s->target); bool need_drain = true; -BlockDeviceIoStatus iostatus; +BlockDeviceIoStatus iostatus = BLOCK_DEVICE_IO_STATUS__MAX; int64_t length; int64_t target_length; BlockDriverInfo bdi; -- Best regards, Vladimir