On Mon, May 18, 2026 at 02:17:58AM +0000, Zhijian Li (Fujitsu) wrote:
> Samuel,
> 
> 
> Thanks for the patch.
> 
> On 14/05/2026 11:18, Samuel Zhang wrote:
> > The 1MiB default dates back to the original RDMA implementation in
> > 2013 (commit 2da776db48), and is too conservative for modern hardware.
> > 
> > 64MiB captures most of the throughput gain (~10x over 1MiB) while
> > keeping transferred data low.  Larger chunks cause more data to be
> > retransferred per dirty page, so the largest chunk size is not
> > necessarily optimal (see 1024MiB row).  The x-rdma-chunk-size
> > parameter remains available for user tuning.
> > 
> > Test config: BlueField-3 ConnectX-7, 8GB VM RAM, pin-all off,
> >    `stress-ng --vm 4 --vm-bytes 1G --vm-method rand-set`
> > 
> > chunk_size  total(ms)  down(ms)  Throughput(Mbps)  transferred
> > 1m            45,156    1,166          1,252.50     6.46 GiB
> > 32m           15,034    1,864          3,401.26     5.57 GiB
> > 64m            4,492    1,554         13,637.46     5.75 GiB
> > 128m           3,940    1,662         16,860.59     6.06 GiB
> > 1024m          3,665    2,238         24,676.59     8.04 GiB
> > 
> > Signed-off-by: Samuel Zhang <[email protected]>
> > ---
> >   migration/options.c | 2 +-
> >   qapi/migration.json | 2 +-
> >   2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/migration/options.c b/migration/options.c
> > index 5cbfd29099..ea2137372c 100644
> > --- a/migration/options.c
> > +++ b/migration/options.c
> > @@ -91,7 +91,7 @@ const PropertyInfo qdev_prop_StrOrNull;
> >   
> >   #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD     1000    /* milliseconds */
> >   #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT            1       /* MB/s */
> > -#define DEFAULT_MIGRATE_X_RDMA_CHUNK_SIZE           MiB
> > +#define DEFAULT_MIGRATE_X_RDMA_CHUNK_SIZE           (64 * MiB)
> 
> 
> I have a concern about backward compatibility.
> 
> AFAIK, changing the default chunk size could break RDMA migration between 
> hosts running different QEMU versions.
> If this happens, is the error message clear enough for a user to
> understand that the failure is due to a mismatch in
> 'x-rdma-chunk-size'?

Oh that's rather unfortunate. Even though x-rdma-chunk-size is marked
experimental/unstable, we can't change its default value if it breaks
migration compatibility out of the box :-(  Libvirt (or equivalent)
would need to negotiate the chunk size to a larger value, which would
mean we need to declare x-rdma-chunk-size stable by removing the x-
prefix.
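
To make the idea concrete, here is a rough sketch of what such a
negotiation and a clearer mismatch error could look like. This is not
QEMU or libvirt code: ChunkCaps, pick_chunk_size(), check_chunk_size()
and LEGACY_CHUNK_SIZE are all hypothetical names, and I am assuming a
side that cannot be tuned always uses the existing 1MiB default.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MiB (1024ULL * 1024ULL)

/* Assumption: what every existing QEMU uses when nothing is configured. */
#define LEGACY_CHUNK_SIZE (1 * MiB)

/* Hypothetical summary of what one side reports to the mgmt layer. */
typedef struct {
    bool tunable;        /* can the chunk size be set on this side? */
    uint64_t max_chunk;  /* largest chunk size this side accepts, in bytes */
} ChunkCaps;

/*
 * Pick a chunk size both sides can use.  If either side cannot be
 * tuned, fall back to the legacy value so old and new QEMU still agree.
 */
static uint64_t pick_chunk_size(ChunkCaps src, ChunkCaps dst, uint64_t wanted)
{
    uint64_t limit;

    if (!src.tunable || !dst.tunable) {
        return LEGACY_CHUNK_SIZE;
    }
    limit = src.max_chunk < dst.max_chunk ? src.max_chunk : dst.max_chunk;
    return wanted < limit ? wanted : limit;
}

/*
 * Return true if both sides agree.  Otherwise write an error message
 * that names the parameter, so the user can see why migration failed.
 */
static bool check_chunk_size(uint64_t src, uint64_t dst,
                             char *err, size_t errlen)
{
    assert(err != NULL && errlen > 0);

    if (src == dst) {
        return true;
    }
    snprintf(err, errlen,
             "x-rdma-chunk-size mismatch: source=%" PRIu64
             " destination=%" PRIu64
             " (bytes); set the same value on both sides",
             src, dst);
    return false;
}
```

With this shape, a new source talking to an old destination would get
1MiB from pick_chunk_size(), and only two new hosts would get 64MiB.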

With regards,
Daniel
-- 
|: https://berrange.com       ~~        https://hachyderm.io/@berrange :|
|: https://libvirt.org          ~~          https://entangle-photo.org :|
|: https://pixelfed.art/berrange   ~~    https://fstop138.berrange.com :|

