On Sat, 04/08 21:29, 858585 jemmy wrote: > On Sat, Apr 8, 2017 at 12:52 PM, 858585 jemmy <jemmy858...@gmail.com> wrote: > > On Fri, Apr 7, 2017 at 6:10 PM, Fam Zheng <f...@redhat.com> wrote: > >> On Fri, 04/07 16:44, jemmy858...@gmail.com wrote: > >>> From: Lidong Chen <lidongc...@tencent.com> > >>> > >>> BLOCK_SIZE is (1 << 20), qcow2 cluster size is 65536 by default, > >>> this maybe cause the qcow2 file size is bigger after migration. > >>> This patch check each cluster, use blk_pwrite_zeroes for each > >>> zero cluster. > >>> > >>> Signed-off-by: Lidong Chen <lidongc...@tencent.com> > >>> --- > >>> migration/block.c | 37 +++++++++++++++++++++++++++++++++++-- > >>> 1 file changed, 35 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/migration/block.c b/migration/block.c > >>> index 7734ff7..c32e046 100644 > >>> --- a/migration/block.c > >>> +++ b/migration/block.c > >>> @@ -885,6 +885,11 @@ static int block_load(QEMUFile *f, void *opaque, int > >>> version_id) > >>> int64_t total_sectors = 0; > >>> int nr_sectors; > >>> int ret; > >>> + int i; > >>> + int64_t addr_offset; > >>> + uint8_t *buf_offset; > >> > >> Poor variable names, they are not offset, maybe "cur_addr" and "cur_buf"? > >> And > >> they can be moved to the loop block below. > > ok, i will change. > > > >> > >>> + BlockDriverInfo bdi; > >>> + int cluster_size; > >>> > >>> do { > >>> addr = qemu_get_be64(f); > >>> @@ -934,8 +939,36 @@ static int block_load(QEMUFile *f, void *opaque, int > >>> version_id) > >>> } else { > >>> buf = g_malloc(BLOCK_SIZE); > >>> qemu_get_buffer(f, buf, BLOCK_SIZE); > >>> - ret = blk_pwrite(blk, addr * BDRV_SECTOR_SIZE, buf, > >>> - nr_sectors * BDRV_SECTOR_SIZE, 0); > >>> + > >>> + ret = bdrv_get_info(blk_bs(blk), &bdi); > >>> + cluster_size = bdi.cluster_size; > >>> + > >>> + if (ret == 0 && cluster_size > 0 && > >>> + cluster_size < BLOCK_SIZE && > >> > >> I think cluster_size == BLOCK_SIZE should work too. > > This case the (flags & BLK_MIG_FLAG_ZERO_BLOCK) should be true, > > and will invoke blk_pwrite_zeroes before apply this patch. > > but maybe the source qemu maybe not enabled zero flag. > > so i think cluster_size <= BLOCK_SIZE is ok. > > > >> > >>> + BLOCK_SIZE % cluster_size == 0) { > >>> + for (i = 0; i < BLOCK_SIZE / cluster_size; i++) { > >>> + addr_offset = addr * BDRV_SECTOR_SIZE > >>> + + i * cluster_size; > >>> + buf_offset = buf + i * cluster_size; > >>> + > >>> + if (buffer_is_zero(buf_offset, cluster_size)) { > >>> + ret = blk_pwrite_zeroes(blk, addr_offset, > >>> + cluster_size, > >>> + BDRV_REQ_MAY_UNMAP); > >>> + } else { > >>> + ret = blk_pwrite(blk, addr_offset, > >>> buf_offset, > >>> + cluster_size, 0); > >>> + } > >>> + > >>> + if (ret < 0) { > >>> + g_free(buf); > >>> + return ret; > >>> + } > >>> + } > >>> + } else { > >>> + ret = blk_pwrite(blk, addr * BDRV_SECTOR_SIZE, buf, > >>> + nr_sectors * BDRV_SECTOR_SIZE, 0); > >>> + } > >>> g_free(buf); > >>> } > >>> > >>> -- > >>> 1.8.3.1 > >>> > >> > >> Is it possible use (source) cluster size as the transfer chunk size, > >> instead of > >> BDRV_SECTORS_PER_DIRTY_CHUNK? Then the existing BLK_MIG_FLAG_ZERO_BLOCK > >> logic > >> can help and you don't need to send zero bytes on the wire. This may still > >> not > >> be optimal if dest has larger cluster, but it should cover the common use > >> case > >> well. > > > > yes, i also think BDRV_SECTORS_PER_DIRTY_CHUNK is too large. > > This have two disadvantage: > > 1. it will cause the dest qcow2 file size is bigger after migration. > > 2. it will cause transfer not necessary data, and maybe cause the > > migration can't be successful. > > > > in my production environment, some vm only write 2MB/s, the dirty > > block migrate speed is 70MB/s. > > but it still migration timeout. > > > > but if we change the size of BDRV_SECTORS_PER_DIRTY_CHUNK, it will > > break the protocol. > > the old version qemu will not be able to migrate to new version qemu. > > there are not information about the length about the migration buffer. > > > > so i think we should add new flags to indicate that there are an > > additional byte about the length > > of migration buffer. i will send another patch later, and test the result. > > Hi Fam: > Do we need consider the circumstances than migrate from new qemu version > to old qemu version?
Yes, usually we use a subsection to achieve that - missing the "chunk size" field should result in using the old BDRV_SECTORS_PER_DIRTY_CHUNK value. Fam > > > > > this patch is also valuable, there are many old version qemu in my > > production environment. > > and will be benefit with this patch. > > > >> > >> Fam >