On Tue, Sep 15, 2020 at 2:51 PM Stefan Reiter <s.rei...@proxmox.com> wrote:
>
> On 9/15/20 11:08 AM, Nir Soffer wrote:
> > On Mon, Sep 14, 2020 at 3:25 PM Stefan Reiter <s.rei...@proxmox.com> wrote:
> >>
> >> Hi list,
> >>
> >> following command fails since 5.1 (tested on kernel 5.4.60):
> >>
> >> # qemu-img convert -p -f raw -O raw /dev/zvol/pool/disk-1 /dev/vg/disk-1
> >> qemu-img: error while writing at byte 2157968896: Device or resource busy
> >>
> >> (source is ZFS here, but doesn't matter in practice, it always fails the
> >> same; offset changes slightly but consistently hovers around 2^31)
> >>
> >> strace shows the following:
> >> fallocate(13, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2157968896,
> >> 4608) = -1 EBUSY (Device or resource busy)
> >
> > What is the size of the LV?
> >
>
> Same as the source, 5GB in my test case. Created with:
>
> # lvcreate -ay --size 5242880k --name disk-1 vg
>
> > Does it happen if you change sparse minimum size (-S)?
> >
> > For example: -S 64k
> >
> > qemu-img convert -p -f raw -O raw -S 64k /dev/zvol/pool/disk-1
> > /dev/vg/disk-1
> >
>
> Tried a few different values, always the same result: EBUSY at byte
> 2157968896.
>
> >> Other fallocate calls leading up to this work fine.
> >>
> >> This happens since commit edafc70c0c "qemu-img convert: Don't pre-zero
> >> images", before that all fallocates happened at the start. Reverting the
> >> commit and calling qemu-img exactly the same way on the same data works
> >> fine.
> >
> > But slowly, doing up to 100% more work for fully allocated images.
> >
>
> Of course, I'm not saying the patch is wrong, reverting it just avoids
> triggering the bug.
>
> >> Simply retrying the syscall on EBUSY (like EINTR) does *not* work,
> >> once it fails it keeps failing with the same error.
> >>
> >> I couldn't find anything related to EBUSY on fallocate, and it only
> >> happens on LVM targets... Any idea or pointers where to look?
> >
> > Is this thin LV?
> >
>
> No, regular LV. See command above.
>
> > This works for us using regular LVs.
> >
> > Which kernel? which distro?
> >
>
> Reproducible on:
> * PVE w/ kernel 5.4.60 (Ubuntu based)
> * Manjaro w/ kernel 5.8.6
>
> I found that it does not happen with all images, I suppose there must be
> a certain number of smaller holes for it to happen. I am using a VM
> image with a bare-bones Alpine Linux installation, but it's not an
> isolated case, we've had two people report the issue on our bug tracker:
> https://bugzilla.proxmox.com/show_bug.cgi?id=3002
I think that this issue may be fixed by
https://lists.nongnu.org/archive/html/qemu-block/2020-11/msg00358.html

Nir
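
For anyone trying to reproduce this, it may help to look at where the failing
fallocate() range from the strace output sits relative to 2^31 and to
sector/page boundaries. The sketch below only does that arithmetic on the two
numbers quoted in the report (offset 2157968896, length 4608). The 512-byte
sector and 4096-byte page sizes, and the helper name describe_range, are
assumptions made for illustration; whether the alignment is related to the
EBUSY is not established in this thread.

```python
# Arithmetic on the failing range reported by strace:
#   fallocate(13, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2157968896, 4608)
# SECTOR and PAGE are assumed typical sizes, not values taken from the report.
SECTOR = 512
PAGE = 4096
MIB = 1 << 20
GIB = 1 << 30


def describe_range(offset, length):
    """Hypothetical helper: summarize where a byte range sits."""
    return {
        "bytes_past_2gib": offset - 2 * GIB,
        "length_in_sectors": length // SECTOR,
        "start_sector_aligned": offset % SECTOR == 0,
        "start_page_aligned": offset % PAGE == 0,
        "start_offset_in_page": offset % PAGE,
        "end_page_aligned": (offset + length) % PAGE == 0,
    }


info = describe_range(2157968896, 4608)
for key, value in info.items():
    print(f"{key}: {value}")
```

With these assumed sizes, the range starts 512 bytes short of 2 GiB + 10 MiB,
is sector-aligned but not page-aligned at its start, and ends exactly on a
page boundary.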