On 05/13/2015 02:40 PM, Fam Zheng wrote:
> On Wed, 05/13 13:17, Wen Congyang wrote:
>> On 05/13/2015 11:11 AM, Fam Zheng wrote:
>>> Before, we only yield after initializing dirty bitmap, where the QMP
>>> command would return. That may take very long, and guest IO will be
>>> blocked.
>>
>> Do you have such a case to reproduce it? If the disk image is too large,
>> I think qemu doesn't cache all metadata in memory. So we will
>> yield in bdrv_is_allocated_above() when we read the metadata from the
>> disk.
>
> True for qcow2, but raw-posix has no such yield points, because it uses
> lseek(..., SEEK_HOLE). I do have a reproducer - just try a big raw image on
> your ext4.

It is the filesystem's problem. If we mirror a big empty raw image,
lseek(..., SEEK_DATA) may need several seconds (about 5s for a 500G empty
raw image). Even if the granularity is 64K, we need to call this syscall
8192000 (500G/64K) times. We may need more than half a year...

So I think we should allow bdrv_is_allocated() and other APIs to return a
larger p_num than nb_sectors.

Thanks
Wen Congyang

>
> Fam
>
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> Add sleep points like the later mirror iterations.
>>>
>>> Signed-off-by: Fam Zheng <f...@redhat.com>
>>> ---
>>>  block/mirror.c | 13 ++++++++++++-
>>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index 1a1d997..baed225 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -467,11 +467,23 @@ static void coroutine_fn mirror_run(void *opaque)
>>>      sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
>>>      mirror_free_init(s);
>>>
>>> +    last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>>      if (!s->is_none_mode) {
>>>          /* First part, loop on the sectors and initialize the dirty
>>> bitmap. */
>>>          BlockDriverState *base = s->base;
>>>          for (sector_num = 0; sector_num < end; ) {
>>>              int64_t next = (sector_num | (sectors_per_chunk - 1)) + 1;
>>> +            int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>> +
>>> +            if (now - last_pause_ns > SLICE_TIME) {
>>> +                last_pause_ns = now;
>>> +                block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, 0);
>>> +            }
>>> +
>>> +            if (block_job_is_cancelled(&s->common)) {
>>> +                goto immediate_exit;
>>> +            }
>>> +
>>>              ret = bdrv_is_allocated_above(bs, base,
>>>                                            sector_num, next - sector_num,
>>> &n);
>>>
>>> @@ -490,7 +502,6 @@ static void coroutine_fn mirror_run(void *opaque)
>>>      }
>>>
>>>      bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
>>> -    last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>>      for (;;) {
>>>          uint64_t delay_ns = 0;
>>>          int64_t cnt;
>>>
>>
> .
>
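
P.S. As a rough check of the estimate in my reply above, here is a minimal
sketch of the arithmetic. It assumes the worst case I described: every
64K chunk costs one lseek(..., SEEK_DATA) call, and each call takes about 5s
on the 500G empty image. The helper names chunk_count() and
worst_case_seconds() are hypothetical and are not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Number of granularity-sized chunks needed to cover the image (rounded up).
 * Hypothetical helper, for illustration only. */
static int64_t chunk_count(int64_t image_bytes, int64_t granularity_bytes)
{
    assert(image_bytes >= 0 && granularity_bytes > 0);
    return (image_bytes + granularity_bytes - 1) / granularity_bytes;
}

/* Worst-case scan time in seconds, assuming one allocation query per chunk
 * and a fixed cost per query (an assumption, not a measurement). */
static double worst_case_seconds(int64_t image_bytes,
                                 int64_t granularity_bytes,
                                 double seconds_per_call)
{
    return (double)chunk_count(image_bytes, granularity_bytes)
           * seconds_per_call;
}
```

With 500G and 64K this gives 8192000 calls and 40960000 seconds, roughly 474
days. If the allocation query could instead report the whole extent in one
answer (a larger p_num than nb_sectors), the same model needs a single call
for an image that is one big hole: chunk_count(500G, 500G) is 1.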