"Maciej S. Szmigiero" <m...@maciej.szmigiero.name> writes: > On 29.08.2024 02:41, Fabiano Rosas wrote: >> "Maciej S. Szmigiero" <m...@maciej.szmigiero.name> writes: >> >>> From: "Maciej S. Szmigiero" <maciej.szmigi...@oracle.com> >>> >>> A new function multifd_queue_device_state() is provided for device to queue >>> its state for transmission via a multifd channel. >>> >>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigi...@oracle.com> >>> --- >>> include/migration/misc.h | 4 ++ >>> migration/meson.build | 1 + >>> migration/multifd-device-state.c | 99 ++++++++++++++++++++++++++++++++ >>> migration/multifd-nocomp.c | 6 +- >>> migration/multifd-qpl.c | 2 +- >>> migration/multifd-uadk.c | 2 +- >>> migration/multifd-zlib.c | 2 +- >>> migration/multifd-zstd.c | 2 +- >>> migration/multifd.c | 65 +++++++++++++++------ >>> migration/multifd.h | 29 +++++++++- >>> 10 files changed, 184 insertions(+), 28 deletions(-) >>> create mode 100644 migration/multifd-device-state.c >>> >>> diff --git a/include/migration/misc.h b/include/migration/misc.h >>> index bfadc5613bac..7266b1b77d1f 100644 >>> --- a/include/migration/misc.h >>> +++ b/include/migration/misc.h >>> @@ -111,4 +111,8 @@ bool migration_in_bg_snapshot(void); >>> /* migration/block-dirty-bitmap.c */ >>> void dirty_bitmap_mig_init(void); >>> >>> +/* migration/multifd-device-state.c */ >>> +bool multifd_queue_device_state(char *idstr, uint32_t instance_id, >>> + char *data, size_t len); >>> + >>> #endif >>> diff --git a/migration/meson.build b/migration/meson.build >>> index 77f3abf08eb1..00853595894f 100644 >>> --- a/migration/meson.build >>> +++ b/migration/meson.build >>> @@ -21,6 +21,7 @@ system_ss.add(files( >>> 'migration-hmp-cmds.c', >>> 'migration.c', >>> 'multifd.c', >>> + 'multifd-device-state.c', >>> 'multifd-nocomp.c', >>> 'multifd-zlib.c', >>> 'multifd-zero-page.c', >>> diff --git a/migration/multifd-device-state.c >>> b/migration/multifd-device-state.c >>> new file mode 100644 >>> index 000000000000..c9b44f0b5ab9 >>> --- /dev/null >>> +++ b/migration/multifd-device-state.c >>> @@ -0,0 +1,99 @@ >>> +/* >>> + * Multifd device state migration >>> + * >>> + * Copyright (C) 2024 Oracle and/or its affiliates. >>> + * >>> + * This work is licensed under the terms of the GNU GPL, version 2 or >>> later. >>> + * See the COPYING file in the top-level directory. >>> + */ >>> + >>> +#include "qemu/osdep.h" >>> +#include "qemu/lockable.h" >>> +#include "migration/misc.h" >>> +#include "multifd.h" >>> + >>> +static QemuMutex queue_job_mutex; >>> + >>> +static MultiFDSendData *device_state_send; >>> + >>> +size_t multifd_device_state_payload_size(void) >>> +{ >>> + return sizeof(MultiFDDeviceState_t); >>> +} >> >> This will not be necessary because the payload size is the same as the >> data type. We only need it for the special case where the MultiFDPages_t >> is smaller than the total ram payload size. > > I know - I just wanted to make the API consistent with the one RAM > handler provides since these multifd_send_data_alloc() calls are done > just once per migration - it isn't any kind of a hot path. >
I think the array at the end of MultiFDPages_t should be considered enough of a hack that we might want to keep anything related to it outside of the interface. Other clients shouldn't have to think about that at all. >>> @@ -397,20 +404,16 @@ bool multifd_send(MultiFDSendData **send_data) >>> >>> p = &multifd_send_state->params[i]; >>> /* >>> - * Lockless read to p->pending_job is safe, because only multifd >>> - * sender thread can clear it. >>> + * Lockless RMW on p->pending_job_preparing is safe, because only >>> multifd >>> + * sender thread can clear it after it had seen p->pending_job >>> being set. >>> + * >>> + * Pairs with qatomic_store_release() in multifd_send_thread(). >>> */ >>> - if (qatomic_read(&p->pending_job) == false) { >>> + if (qatomic_cmpxchg(&p->pending_job_preparing, false, true) == >>> false) { >> >> What's the motivation for this change? It would be better to have it in >> a separate patch with a proper justification. > > The original RFC patch set used dedicated device state multifd channels. > > Peter and other people wanted this functionality removed, however this caused > a performance (downtime) regression. > > One of the things that seemed to help mitigate this regression was making > the multifd channel selection more fair via this change. > > But I can split out it to a separate commit in the next patch set version and > then see what performance improvement it currently brings. Yes, better to have it separate if anything for documentation of the rationale.