On 16.06.2020 at 22:01, Nir Soffer wrote:
> On Tue, Jun 16, 2020 at 8:39 PM Nir Soffer <nsof...@redhat.com> wrote:
> >
> > On Tue, Jun 16, 2020 at 6:32 PM Kevin Wolf <kw...@redhat.com> wrote:
> > >
> > > On 15.06.2020 at 21:32, Nir Soffer wrote:
> > > > We can zero 2.3 GiB/s:
> > > >
> > > > # time blkdiscard -z test-lv
> > > >
> > > > real    0m43.902s
> > > > user    0m0.002s
> > > > sys     0m0.130s
> > > >
> > > > We can write 445 MB/s:
> > > >
> > > > # dd if=/dev/zero bs=2M count=51200 of=test-lv oflag=direct conv=fsync
> > > > 107374182400 bytes (107 GB, 100 GiB) copied, 241.257 s, 445 MB/s
> > >
> > > So using FALLOC_FL_PUNCH_HOLE _is_ faster after all. What might not be
> > > faster is zeroing out the whole device and then overwriting a
> > > considerable part of it again.
> > >
> > > I think this means that we shouldn't fail write_zeroes at the
> > > file-posix level even if BDRV_REQ_NO_FALLBACK is given. Instead,
> > > qemu-img convert is where I see a fix.
> > >
> > > Certainly qemu-img could be cleverer and zero out more selectively.
> > > The idea of doing a blk_make_zero() first seems to have caused some
> > > problems, though of course its introduction was also justified with
> > > performance, so improving one case might hurt another if we're not
> > > careful.
> > >
> > > However, when Peter Lieven introduced this (commit 5a37b60a61c), we
> > > didn't use write_zeroes yet during the regular copy loop (we do
> > > since commit 690c7301600). So chances are that blk_make_zero()
> > > doesn't actually help any more now.
> > >
> > > Can you run another test with the patch below?
> >
> > Sure, I can try this.
>
> Tried it, and it performs the same, as expected.

Thanks.

> > > I think it should perform the same as yours. Eric, Peter, do you
> > > think this would have a negative effect for NBD and/or iscsi?
> > >
> > > The other option would be providing an option and making it Someone
> > > Else's Problem.
> > >
> > > Kevin
> > >
> > >
> > > diff --git a/qemu-img.c b/qemu-img.c
> > > index d7e846e607..bdb9f6aa46 100644
> > > --- a/qemu-img.c
> > > +++ b/qemu-img.c
> > > @@ -2084,15 +2084,6 @@ static int convert_do_copy(ImgConvertState *s)
> > >          s->has_zero_init = bdrv_has_zero_init(blk_bs(s->target));
> > >      }
> > >
> > > -    if (!s->has_zero_init && !s->target_has_backing &&
> > > -        bdrv_can_write_zeroes_with_unmap(blk_bs(s->target)))
> > > -    {
> > > -        ret = blk_make_zero(s->target, BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK);
> > > -        if (ret == 0) {
> > > -            s->has_zero_init = true;
> > > -        }
> > > -    }
> >
> > This will work of course, but now we will not do bulk zero for any
> > target.
>
> I would like to have a minimal change to increase the chance that we
> can consume this very soon in oVirt.

I think this one would be pretty minimal.

Maybe we can later bring this code back, but with an implementation of
blk_make_zero() that doesn't use the generic write_zeroes operation, but
with a specific callback like Eric suggested.
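To sketch what I mean (the callback name and wiring below are invented
for illustration; nothing like this exists in the tree yet):

    /* Hypothetical BlockDriver callback: zero the whole device, but
     * only if the driver can do it without writing every byte, e.g. by
     * unmapping the complete range. Drivers that can't simply leave it
     * unset. */
    int (*bdrv_make_zero)(BlockDriverState *bs);

    int blk_make_zero(BlockBackend *blk, BdrvRequestFlags flags)
    {
        BlockDriverState *bs = blk_bs(blk);

        if (!bs->drv->bdrv_make_zero) {
            /* No efficient bulk zero available; qemu-img convert then
             * just zeroes selectively in its copy loop, as with this
             * patch. */
            return -ENOTSUP;
        }
        return bs->drv->bdrv_make_zero(bs);
    }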
> > I think we never do this for regular files anyway since we truncate
> > files, and there is nothing to zero, but maybe there is some case
> > when this is useful?

Yes, regular files have s->has_zero_init == true anyway.

> > BTW, do we use BDRV_REQ_NO_FALLBACK elsewhere? Maybe we can remove
> > it.

qcow2 uses it when zeroing out parts of a newly allocated cluster on
partial writes. Though that code is questionable, too, and XFS people
suggest that we should stop using it.
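For reference, the contract that the flag gives callers looks roughly
like this at the syscall level (a simplified sketch, not the actual
file-posix code; zero_range() is made up for this example):

    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <linux/falloc.h>
    #include <stdbool.h>
    #include <unistd.h>

    /* Zero [off, off+len) of fd. With no_fallback set, succeed only if
     * the kernel can do it efficiently (here: by punching a hole);
     * otherwise return -ENOTSUP so the caller can decide to do the
     * slow writes itself. */
    static int zero_range(int fd, off_t off, off_t len, bool no_fallback)
    {
        if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                      off, len) == 0) {
            return 0;           /* deallocated; reads back as zeros */
        }
        if (no_fallback) {
            return -ENOTSUP;    /* the BDRV_REQ_NO_FALLBACK case */
        }

        /* Slow path: emulate by writing zeroed buffers */
        static const char zeros[64 * 1024];
        while (len > 0) {
            size_t chunk = len < (off_t)sizeof(zeros) ? (size_t)len
                                                      : sizeof(zeros);
            ssize_t n = pwrite(fd, zeros, chunk, off);
            if (n < 0) {
                return -errno;
            }
            off += n;
            len -= n;
        }
        return 0;
    }

Kevin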