"Maciej S. Szmigiero" <m...@maciej.szmigiero.name> writes:

> On 29.08.2024 02:41, Fabiano Rosas wrote:
>> "Maciej S. Szmigiero" <m...@maciej.szmigiero.name> writes:
>> 
>>> From: "Maciej S. Szmigiero" <maciej.szmigi...@oracle.com>
>>>
>>> A new function multifd_queue_device_state() is provided for device to queue
>>> its state for transmission via a multifd channel.
>>>
>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigi...@oracle.com>
>>> ---
>>>   include/migration/misc.h         |  4 ++
>>>   migration/meson.build            |  1 +
>>>   migration/multifd-device-state.c | 99 ++++++++++++++++++++++++++++++++
>>>   migration/multifd-nocomp.c       |  6 +-
>>>   migration/multifd-qpl.c          |  2 +-
>>>   migration/multifd-uadk.c         |  2 +-
>>>   migration/multifd-zlib.c         |  2 +-
>>>   migration/multifd-zstd.c         |  2 +-
>>>   migration/multifd.c              | 65 +++++++++++++++------
>>>   migration/multifd.h              | 29 +++++++++-
>>>   10 files changed, 184 insertions(+), 28 deletions(-)
>>>   create mode 100644 migration/multifd-device-state.c
>>>
>>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>>> index bfadc5613bac..7266b1b77d1f 100644
>>> --- a/include/migration/misc.h
>>> +++ b/include/migration/misc.h
>>> @@ -111,4 +111,8 @@ bool migration_in_bg_snapshot(void);
>>>   /* migration/block-dirty-bitmap.c */
>>>   void dirty_bitmap_mig_init(void);
>>>   
>>> +/* migration/multifd-device-state.c */
>>> +bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
>>> +                                char *data, size_t len);
>>> +
>>>   #endif
>>> diff --git a/migration/meson.build b/migration/meson.build
>>> index 77f3abf08eb1..00853595894f 100644
>>> --- a/migration/meson.build
>>> +++ b/migration/meson.build
>>> @@ -21,6 +21,7 @@ system_ss.add(files(
>>>     'migration-hmp-cmds.c',
>>>     'migration.c',
>>>     'multifd.c',
>>> +  'multifd-device-state.c',
>>>     'multifd-nocomp.c',
>>>     'multifd-zlib.c',
>>>     'multifd-zero-page.c',
>>> diff --git a/migration/multifd-device-state.c 
>>> b/migration/multifd-device-state.c
>>> new file mode 100644
>>> index 000000000000..c9b44f0b5ab9
>>> --- /dev/null
>>> +++ b/migration/multifd-device-state.c
>>> @@ -0,0 +1,99 @@
>>> +/*
>>> + * Multifd device state migration
>>> + *
>>> + * Copyright (C) 2024 Oracle and/or its affiliates.
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or 
>>> later.
>>> + * See the COPYING file in the top-level directory.
>>> + */
>>> +
>>> +#include "qemu/osdep.h"
>>> +#include "qemu/lockable.h"
>>> +#include "migration/misc.h"
>>> +#include "multifd.h"
>>> +
>>> +static QemuMutex queue_job_mutex;
>>> +
>>> +static MultiFDSendData *device_state_send;
>>> +
>>> +size_t multifd_device_state_payload_size(void)
>>> +{
>>> +    return sizeof(MultiFDDeviceState_t);
>>> +}
>> 
>> This will not be necessary because the payload size is the same as the
>> data type. We only need it for the special case where the MultiFDPages_t
>> is smaller than the total ram payload size.
>
> I know - I just wanted to make the API consistent with the one RAM
> handler provides since these multifd_send_data_alloc() calls are done
> just once per migration - it isn't any kind of a hot path.
>

I think the array at the end of MultiFDPages_t should be considered
enough of a hack that we might want to keep anything related to it
outside of the interface. Other clients shouldn't have to think about
that at all.

>>> @@ -397,20 +404,16 @@ bool multifd_send(MultiFDSendData **send_data)
>>>   
>>>           p = &multifd_send_state->params[i];
>>>           /*
>>> -         * Lockless read to p->pending_job is safe, because only multifd
>>> -         * sender thread can clear it.
>>> +         * Lockless RMW on p->pending_job_preparing is safe, because only 
>>> multifd
>>> +         * sender thread can clear it after it had seen p->pending_job 
>>> being set.
>>> +         *
>>> +         * Pairs with qatomic_store_release() in multifd_send_thread().
>>>            */
>>> -        if (qatomic_read(&p->pending_job) == false) {
>>> +        if (qatomic_cmpxchg(&p->pending_job_preparing, false, true) == 
>>> false) {
>> 
>> What's the motivation for this change? It would be better to have it in
>> a separate patch with a proper justification.
>
> The original RFC patch set used dedicated device state multifd channels.
>
> Peter and other people wanted this functionality removed, however this caused
> a performance (downtime) regression.
>
> One of the things that seemed to help mitigate this regression was making
> the multifd channel selection more fair via this change.
>
> But I can split out it to a separate commit in the next patch set version and
> then see what performance improvement it currently brings.

Yes, better to have it separate if anything for documentation of the
rationale.


Reply via email to