On Mon, May 18, 2026 at 02:17:58AM +0000, Zhijian Li (Fujitsu) wrote:
> Samuel,
>
>
> Thanks for the patch.
>
> On 14/05/2026 11:18, Samuel Zhang wrote:
> > The 1MiB default dates back to the original RDMA implementation in
> > 2013 (commit 2da776db48), and is too conservative for modern hardware.
> >
> > 64MiB captures most of the throughput gain (~10x over 1MiB) while
> > keeping transferred data low. Larger chunks cause more data to be
> > retransferred per dirty page, so the largest chunk size is not
> > necessarily optimal (see 1024MiB row). The x-rdma-chunk-size
> > parameter remains available for user tuning.
> >
> > Test config: BlueField-3 ConnectX-7, 8GB VM RAM, pin-all off,
> > `stress-ng --vm 4 --vm-bytes 1G --vm-method rand-set`
> >
> > chunk_size  total(ms)  down(ms)  Throughput(Mbps)  transferred
> > 1m          45,156     1,166      1,252.50         6.46 GiB
> > 32m         15,034     1,864      3,401.26         5.57 GiB
> > 64m          4,492     1,554     13,637.46         5.75 GiB
> > 128m         3,940     1,662     16,860.59         6.06 GiB
> > 1024m        3,665     2,238     24,676.59         8.04 GiB
> >
> > Signed-off-by: Samuel Zhang <[email protected]>
> > ---
> >  migration/options.c | 2 +-
> >  qapi/migration.json | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/migration/options.c b/migration/options.c
> > index 5cbfd29099..ea2137372c 100644
> > --- a/migration/options.c
> > +++ b/migration/options.c
> > @@ -91,7 +91,7 @@ const PropertyInfo qdev_prop_StrOrNull;
> >
> >  #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000 /*
> > milliseconds */
> >  #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT 1 /* MB/s */
> > -#define DEFAULT_MIGRATE_X_RDMA_CHUNK_SIZE MiB
> > +#define DEFAULT_MIGRATE_X_RDMA_CHUNK_SIZE (64 * MiB)
>
>
> I have a concern about backward compatibility.
>
> AFAIK, changing the default chunk size could break RDMA migration between
> hosts running different QEMU versions.
> If this happens, the error message is not clear enough for a user to
> understand that the failure
> is due to a mismatch in 'x-rdma-chunk-size'?

Oh that's rather unfortunate. Even though x-rdma-chunk-size is marked
experimental/unstable, we can't change its default value if it breaks
migration compatibility out of the box :-(

Libvirt (or equivalent) would need to negotiate the chunk size to
a larger value, which would mean we need to declare x-rdma-chunk-size
stable by removing the x- prefix.

With regards,
Daniel
-- 
|: https://berrange.com          ~~        https://hachyderm.io/@berrange :|
|: https://libvirt.org           ~~          https://entangle-photo.org :|
|: https://pixelfed.art/berrange ~~       https://fstop138.berrange.com :|
