Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On 08/01/2014 23:53, Richard W.M. Jones wrote:
> On Wed, Jan 08, 2014 at 11:45:39PM +0100, Paolo Bonzini wrote:
>> On 08/01/2014 23:24, Richard W.M. Jones wrote:
>>> It's extremely difficult to know when it's safe to add this parameter.
>>> Qemu gives no indication of when using discard=.. is safe (ie. won't
>>> cause qemu to fail to start up or fail in some other way). It's even
>>> worse when we have to go via libvirt which itself doesn't expose
>>> qemu's capabilities upwards.
>>
>> It is a bug that "-help" doesn't list discard=on, but QMP
>> query-command-line-options lists it correctly.
>>
>> libvirt could safely ignore discard if the underlying QEMU does not
>> support it, but that's not how it was implemented. Currently,
>> explicitly specifying either discard='on' or discard='off' will cause
>> the VM to fail to start if QEMU does not support it. There are
>> tradeoffs in both solutions...
>
> That sucks .. for me ...
>
> Can't we have an option like discard=ifpossible? libguestfs would use
> this, since we'd always prefer to honour discard requests from our
> kernel.

That would be a libvirt option, not QEMU.

Paolo
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On Wed, Jan 08, 2014 at 11:45:39PM +0100, Paolo Bonzini wrote:
> On 08/01/2014 23:24, Richard W.M. Jones wrote:
> > It's extremely difficult to know when it's safe to add this parameter.
> > Qemu gives no indication of when using discard=.. is safe (ie. won't
> > cause qemu to fail to start up or fail in some other way). It's even
> > worse when we have to go via libvirt which itself doesn't expose
> > qemu's capabilities upwards.
>
> It is a bug that "-help" doesn't list discard=on, but QMP
> query-command-line-options lists it correctly.
>
> libvirt could safely ignore discard if the underlying QEMU does not
> support it, but that's not how it was implemented. Currently,
> explicitly specifying either discard='on' or discard='off' will cause
> the VM to fail to start if QEMU does not support it. There are
> tradeoffs in both solutions...

That sucks .. for me ...

Can't we have an option like discard=ifpossible? libguestfs would use
this, since we'd always prefer to honour discard requests from our
kernel.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On 08/01/2014 23:24, Richard W.M. Jones wrote:
> On Wed, Jan 08, 2014 at 11:11:35PM +0100, Paolo Bonzini wrote:
>> Is guestfish using "discard=on"?
>
> No.
>
> Adding the discard=on parameter does indeed fix this:
>
> 13M /tmp/test1
> 17M /tmp/test2
>
> However why isn't this the default? Is there a case where discard=on
> would be undesirable?

It is always safe if supported. However, it may cause performance
problems, because discarding data from images may make them fragmented,
or cause files to have a lot of extents. The same applies to block
devices backed by some kind of thin-provisioning NAS (SSDs, by contrast,
should always be okay).

Unfortunately neither Linux nor the block devices really provide the
information you need to know whether discard can cause these problems.

> It's extremely difficult to know when it's safe to add this parameter.
> Qemu gives no indication of when using discard=.. is safe (ie. won't
> cause qemu to fail to start up or fail in some other way). It's even
> worse when we have to go via libvirt which itself doesn't expose
> qemu's capabilities upwards.

It is a bug that "-help" doesn't list discard=on, but QMP
query-command-line-options lists it correctly.

libvirt could safely ignore discard if the underlying QEMU does not
support it, but that's not how it was implemented. Currently,
explicitly specifying either discard='on' or discard='off' will cause
the VM to fail to start if QEMU does not support it. There are
tradeoffs in both solutions...

Paolo
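A rough sketch of the QMP probe Paolo refers to, for checking whether a given QEMU binary knows about discard= before putting it on a command line. It assumes a qemu-system-x86_64 binary that accepts "-qmp stdio" and "-machine none"; the function names are made up for illustration, and the check is a crude text match on the JSON reply, not a real parser.

```shell
#!/bin/sh
# Ask QEMU (over QMP on stdio) which suboptions -drive accepts.
# Assumption: the binary name and "-machine none" work on your build.
query_drive_options() {
    qemu_bin=${1:-qemu-system-x86_64}
    {
        printf '%s\n' '{"execute":"qmp_capabilities"}'
        printf '%s\n' '{"execute":"query-command-line-options","arguments":{"option":"drive"}}'
        printf '%s\n' '{"execute":"quit"}'
    } | "$qemu_bin" -machine none -nodefaults -display none -qmp stdio
}

# Succeed if the QMP reply on stdin mentions a parameter named "discard".
has_discard_option() {
    grep -Eq '"name": *"discard"'
}

# Usage (commented out so that sourcing this file does not start QEMU):
# query_drive_options | has_discard_option && echo "discard= is supported"
```

This only says that the option is recognised by the binary. It says nothing about whether the image format or the host filesystem will actually deallocate anything.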
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On Wed, Jan 08, 2014 at 11:11:35PM +0100, Paolo Bonzini wrote:
> Is guestfish using "discard=on"?

No.

Adding the discard=on parameter does indeed fix this:

13M /tmp/test1
17M /tmp/test2

However why isn't this the default? Is there a case where discard=on
would be undesirable?

It's extremely difficult to know when it's safe to add this parameter.
Qemu gives no indication of when using discard=.. is safe (ie. won't
cause qemu to fail to start up or fail in some other way). It's even
worse when we have to go via libvirt which itself doesn't expose
qemu's capabilities upwards.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
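The du figures above are allocated sizes. To see how sparse an image really is, it helps to print the apparent size next to the allocated size. A minimal sketch, assuming GNU stat from coreutils; sparse_report is a hypothetical helper name, not part of libguestfs or the test script.

```shell
#!/bin/sh
# Print apparent and allocated sizes in bytes for each file given.
# Assumes GNU stat: %s is the file length, %b the number of blocks
# allocated and %B the size in bytes of each of those blocks.
sparse_report() {
    for f in "$@"; do
        apparent=$(stat -c '%s' "$f")
        allocated=$(( $(stat -c '%b' "$f") * $(stat -c '%B' "$f") ))
        printf '%s apparent=%s allocated=%s\n' "$f" "$apparent" "$allocated"
    done
}

# Usage:
# sparse_report /tmp/test1 /tmp/test2
```

If the unmap reached the host file, the allocated figure for the first image should drop well below its apparent size while the second stays put.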
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On 07/01/2014 22:22, Richard W.M. Jones wrote:
> On Tue, Jan 07, 2014 at 09:48:17PM +0100, Paolo Bonzini wrote:
>> On 07/01/2014 21:27, Richard W.M. Jones wrote:
>>> Not much more than what I said in the original email (especially see the
>>> attached script which you can download from the bottom of this page:
>>> https://lists.gnu.org/archive/html/qemu-devel/2014-01/msg00084.html )
>>>
>>> Basically it tries to dd /dev/zero into the virtio-scsi device exposed
>>> by qemu, then calls sg_unmap (there are two devices, it only unmaps
>>> the first so we can hopefully see the difference), but it doesn't seem
>>> to have any effect on the underlying file. The underlying file is a
>>> regular raw-format file on ext4.
>>>
>>> I called sg_readcap/sg_vpd and we seem to have all the right
>>> capability bits exposed.
>>>
>>> This script won't work with regular libguestfs. I compiled a special
>>> appliance that had the sg tools included.
>>
>> Try again with the pull request of
>> http://permalink.gmane.org/gmane.comp.emulators.qemu/248421
>
> No difference from before, as far as I can see.
>
> Here is the output of sparsetest.sh:

Is guestfish using "discard=on"?

Here is my test:

$ qemu-img create test.img 32M
Formatting 'test.img', fmt=raw size=33554432
$ qemu-img map --output=json test.img
[{ "start": 0, "length": 33554432, "depth": 0, "zero": true, "data": false, "offset": 0}]
$ qemu-system-x86_64 ~/rhel6.img \
    -drive if=none,cache=none,discard=on,file=test.img,id=test \
    -device virtio-scsi-pci -device scsi-disk,drive=test \
    --enable-kvm -m 512

In guest

# sg_readcap /dev/sdb
Device size: 33554432 bytes, 32.0 MiB, 0.03 GB
# cat /sys/block/sdb/device/scsi_disk/*/provisioning_mode
unmap
# yes | mkfs.ext4 /dev/sdb
# mount /dev/sdb test
# dd if=/dev/zero of=test/test bs=1M

$ du -sh test.img
32M test.img

In guest

# rm xfs/test
(sync here if it does not work)
# fstrim -v xfs/
xfs/: 27891712 bytes were trimmed

$ du -sh test.img
5.2M test.img
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On Tue, Jan 07, 2014 at 09:48:17PM +0100, Paolo Bonzini wrote:
> On 07/01/2014 21:27, Richard W.M. Jones wrote:
> > Not much more than what I said in the original email (especially see the
> > attached script which you can download from the bottom of this page:
> > https://lists.gnu.org/archive/html/qemu-devel/2014-01/msg00084.html )
> >
> > Basically it tries to dd /dev/zero into the virtio-scsi device exposed
> > by qemu, then calls sg_unmap (there are two devices, it only unmaps
> > the first so we can hopefully see the difference), but it doesn't seem
> > to have any effect on the underlying file. The underlying file is a
> > regular raw-format file on ext4.
> >
> > I called sg_readcap/sg_vpd and we seem to have all the right
> > capability bits exposed.
> >
> > This script won't work with regular libguestfs. I compiled a special
> > appliance that had the sg tools included.
>
> Try again with the pull request of
> http://permalink.gmane.org/gmane.comp.emulators.qemu/248421

No difference from before, as far as I can see.

Here is the output of sparsetest.sh:

0 /tmp/test1
0 /tmp/test2
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=1, lbprz=0
   Last logical block address=204799 (0x31fff), Number of logical blocks=204800
   Logical block length=512 bytes
   Logical blocks per physical block exponent=0
   Lowest aligned logical block address=0
Hence:
   Device size: 104857600 bytes, 100.0 MiB, 0.10 GB
Block limits VPD page (SBC):
  Write same no zero (WSNZ): 1
  Maximum compare and write length: 0 blocks
  Optimal transfer length granularity: 0 blocks
  Maximum transfer length: 0 blocks
  Optimal transfer length: 0 blocks
  Maximum prefetch length: 0 blocks
  Maximum unmap LBA count: 2097152
  Maximum unmap block descriptor count: 255
  Optimal unmap granularity: 8
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0x0 blocks
16M /tmp/test1 <--- note both file disk
16M /tmp/test2 <--- usages are the same

Those are raw files on ext4. I'll try qcow2 and follow up.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
Using qcow2 format, it also doesn't appear to work:

$ /tmp/sparsetest.sh
Formatting '/tmp/test1', fmt=qcow2 size=104857600 encryption=off cluster_size=65536 lazy_refcounts=off
Formatting '/tmp/test2', fmt=qcow2 size=104857600 encryption=off cluster_size=65536 lazy_refcounts=off
136K /tmp/test1
136K /tmp/test2
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=1, lbprz=0
   Last logical block address=204799 (0x31fff), Number of logical blocks=204800
   Logical block length=512 bytes
   Logical blocks per physical block exponent=0
   Lowest aligned logical block address=0
Hence:
   Device size: 104857600 bytes, 100.0 MiB, 0.10 GB
Block limits VPD page (SBC):
  Write same no zero (WSNZ): 1
  Maximum compare and write length: 0 blocks
  Optimal transfer length granularity: 0 blocks
  Maximum transfer length: 0 blocks
  Optimal transfer length: 0 blocks
  Maximum prefetch length: 0 blocks
  Maximum unmap LBA count: 2097152
  Maximum unmap block descriptor count: 255
  Optimal unmap granularity: 8
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0x0 blocks
17M /tmp/test1
17M /tmp/test2

$ ll -h /tmp/test{1,2}
-rw-r--r--. 1 rjones rjones 17M Jan 7 21:24 /tmp/test1
-rw-r--r--. 1 rjones rjones 17M Jan 7 21:24 /tmp/test2

$ qemu-img info /tmp/test1
image: /tmp/test1
file format: qcow2
virtual size: 100M (104857600 bytes)
disk size: 16M
cluster_size: 65536

$ qemu-img info /tmp/test2
image: /tmp/test2
file format: qcow2
virtual size: 100M (104857600 bytes)
disk size: 16M
cluster_size: 65536

--

A frustrating aspect of this is there's no diagnostics or way to probe
if UNMAP is supported all the way through. This will be critical for
virt-sparsify, since we'd like to be able to tell the user in advance
whether or not in-place sparsification is going to work, and even
better, why not.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org
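There is no end-to-end probe, but one layer can at least be checked from inside the guest: the block queue attributes in sysfs. A minimal sketch, assuming a Linux guest that exposes /sys/block/<dev>/queue/discard_max_bytes; SYSFS_ROOT is a hypothetical override added only so the function can be exercised without a real device. A positive answer means only that the guest kernel is willing to send discards, not that QEMU or the host filesystem will act on them.

```shell
#!/bin/sh
# Report whether the kernel thinks a block device accepts discard requests.
# A discard_max_bytes of 0 means the kernel will not send discards to it.
# Returns 0 if supported, 1 if not, 2 if the attribute cannot be read.
kernel_discard_supported() {
    dev=$1                                   # e.g. "sdb", not "/dev/sdb"
    f=${SYSFS_ROOT:-/sys}/block/$dev/queue/discard_max_bytes
    [ -r "$f" ] || return 2                  # no such device or attribute
    [ "$(cat "$f")" -gt 0 ]
}

# Usage:
# kernel_discard_supported sdb && echo "guest kernel will pass discards to sdb"
```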
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On 07/01/2014 21:27, Richard W.M. Jones wrote:
> Not much more than what I said in the original email (especially see the
> attached script which you can download from the bottom of this page:
> https://lists.gnu.org/archive/html/qemu-devel/2014-01/msg00084.html )
>
> Basically it tries to dd /dev/zero into the virtio-scsi device exposed
> by qemu, then calls sg_unmap (there are two devices, it only unmaps
> the first so we can hopefully see the difference), but it doesn't seem
> to have any effect on the underlying file. The underlying file is a
> regular raw-format file on ext4.
>
> I called sg_readcap/sg_vpd and we seem to have all the right
> capability bits exposed.
>
> This script won't work with regular libguestfs. I compiled a special
> appliance that had the sg tools included.

Try again with the pull request of
http://permalink.gmane.org/gmane.comp.emulators.qemu/248421

Paolo
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On Tue, Jan 07, 2014 at 03:48:54PM +0100, Paolo Bonzini wrote:
> On 02/01/2014 17:15, Richard W.M. Jones wrote:
> >
> > My (possibly weak) understanding of the upstream qemu code is that
> > unmap/discard/trim is not supported in qcow2. It is only supported in
> > raw files when using a POSIX-like host OS which has either of:
> >
> > - block devices supporting BLKDISCARDZEROES
> > - files on XFS
> > - files on other filesystems that support FALLOC_FL_PUNCH_HOLE (eg ext4)
>
> It doesn't have to support BLKDISCARDZEROES, only BLKDISCARD. I test it
> with scsi_debug using both lbprz=0 and lbprz=1 (which becomes
> BLKDISCARDZEROES unset and set respectively).
>
> Otherwise this is correct.
>
> > Having said that, I did some tests using libguestfs and I could not
> > show that unmap was working, either using raw or qcow2 (both on ext4),
> > with virtio-scsi, and recent kernel & qemu. I did not see any errors,
> > but also I don't see what I'm doing wrong.
>
> Can you share more?

Not much more than what I said in the original email (especially see the
attached script which you can download from the bottom of this page:
https://lists.gnu.org/archive/html/qemu-devel/2014-01/msg00084.html )

Basically it tries to dd /dev/zero into the virtio-scsi device exposed
by qemu, then calls sg_unmap (there are two devices, it only unmaps
the first so we can hopefully see the difference), but it doesn't seem
to have any effect on the underlying file. The underlying file is a
regular raw-format file on ext4.

I called sg_readcap/sg_vpd and we seem to have all the right
capability bits exposed.

This script won't work with regular libguestfs. I compiled a special
appliance that had the sg tools included.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On 02/01/2014 17:15, Richard W.M. Jones wrote:
>
> My (possibly weak) understanding of the upstream qemu code is that
> unmap/discard/trim is not supported in qcow2. It is only supported in
> raw files when using a POSIX-like host OS which has either of:
>
> - block devices supporting BLKDISCARDZEROES
> - files on XFS
> - files on other filesystems that support FALLOC_FL_PUNCH_HOLE (eg ext4)

It doesn't have to support BLKDISCARDZEROES, only BLKDISCARD. I test it
with scsi_debug using both lbprz=0 and lbprz=1 (which becomes
BLKDISCARDZEROES unset and set respectively).

Otherwise this is correct.

> Having said that, I did some tests using libguestfs and I could not
> show that unmap was working, either using raw or qcow2 (both on ext4),
> with virtio-scsi, and recent kernel & qemu. I did not see any errors,
> but also I don't see what I'm doing wrong.

Can you share more?

Paolo
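For the host-filesystem item in the list above, hole punching can be tried directly on a scratch file in the directory that holds the images. A minimal sketch, assuming util-linux fallocate(1) and GNU stat and dd; punch_hole_works is a hypothetical helper name, and it only ever touches a temporary file it creates itself.

```shell
#!/bin/sh
# Try to punch a hole in a scratch file under a directory and report
# whether the allocated size shrank. Returns 0 if it did, 1 if not,
# 2 if the scratch file could not be created.
punch_hole_works() {
    dir=${1:-.}
    f=$(mktemp "$dir/punchtest.XXXXXX") || return 2
    # Write 1 MiB of non-sparse data so there is something to deallocate.
    dd if=/dev/urandom of="$f" bs=1M count=1 conv=fsync 2>/dev/null
    before=$(stat -c '%b' "$f")
    fallocate --punch-hole --offset 0 --length 1048576 "$f" 2>/dev/null
    rc=$?
    after=$(stat -c '%b' "$f")
    size=$(stat -c '%s' "$f")
    rm -f "$f"
    # Success: the call worked, the length is unchanged, blocks went down.
    [ "$rc" -eq 0 ] && [ "$size" -eq 1048576 ] && [ "$after" -lt "$before" ]
}

# Usage:
# punch_hole_works /tmp && echo "/tmp supports hole punching"
```

A failure here rules the host filesystem out. A success does not prove that QEMU will take the same path for a given drive configuration.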
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On Mon, Dec 30, 2013 at 07:58:29PM +0800, Teng-Feng Yang wrote:
> I have been studying QCOW2 file format for a couple of days, and I am
> a little bit confused about whether QCOW2 supports UNMAP or not.

Discard is an area that has seen a lot of development activity over the
past year or two. That means it's still relatively new, you may find
outdated information online, etc.

> As I surf through internet, some mailing list discussion had mentioned
> that qemu-nbd and nbd module both support UNMAP command.

Yes:
 * qemu-nbd since QEMU 1.1 supports TRIM
 * nbd.ko since Linux 3.7 supports discard

> So I follow the steps below on my machine (Ubuntu 13.10 with linux
> kernel 3.12) to test if qemu-nbd and QCOW2 do support UNMAP.
>
> 1. Create a qcow2 file via qemu-img
> > sudo qemu-img create -f qcow2 -o cluster_size=524288 base.qcow2 1G
>
> 2. Connect this qcow2 file with qemu-nbd
> > sudo qemu-nbd -c /dev/nbd0 base.qcow2 --discard=unmap
>
> 3. Use sg_unmap command to issue UNMAP command to this NBD
> > sudo sg_unmap --lba=0 --num=1 /dev/nbd0
>
> Every time I get the following error message:
>
> unmap cdb: 42 00 00 00 00 00 00 00 18 00
> unmap: pass through os error: Inappropriate ioctl for device
> UNMAP failed (use '-v' to get more information)
>
> I also try to format this nbd device with EXT4 and mount it, but still
> cannot perform fstrim on the mount point.

NBD isn't a SCSI device so sending UNMAP commands doesn't work. I think
you need to issue ioctl(BLKDISCARD) instead. See blkdiscard(8).

Also, make sure to use qemu.git/master if you want qcow2 discard
support. I didn't check the details but the unmap implementation for
qcow2 has recently been added/modified.

Stefan
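A minimal sketch of the blkdiscard(8) route, assuming the util-linux blkdiscard tool is installed. discard_range and the DRY_RUN switch are hypothetical names used only for illustration; with DRY_RUN=1 the command is printed instead of run. Running it for real needs root and throws away whatever data is in the range.

```shell
#!/bin/sh
# Discard a byte range on a block device with blkdiscard(8), which issues
# ioctl(BLKDISCARD) rather than a SCSI UNMAP command.
discard_range() {
    dev=$1 offset=$2 length=$3
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Hypothetical dry-run mode: show what would be executed.
        echo "blkdiscard --offset $offset --length $length $dev"
    else
        blkdiscard --offset "$offset" --length "$length" "$dev"
    fi
}

# Usage: one 512 KiB cluster at the start of the NBD device from the
# report above (the image was created with cluster_size=524288).
# DRY_RUN=1 discard_range /dev/nbd0 0 524288
```

Using a range that matches the qcow2 cluster size is a guess at what gives the image format the best chance of freeing a whole cluster. The thread does not confirm that a smaller range would be ignored.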
Re: [Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
On Mon, Dec 30, 2013 at 07:58:29PM +0800, Teng-Feng Yang wrote:
> I have been studying QCOW2 file format for a couple of days, and I am
> a little bit confused about whether QCOW2 supports UNMAP or not.
> As I surf through internet, some mailing list discussion had mentioned
> that qemu-nbd and nbd module both support UNMAP command.
> So I follow the steps below on my machine (Ubuntu 13.10 with linux
> kernel 3.12) to test if qemu-nbd and QCOW2 do support UNMAP.
>
> 1. Create a qcow2 file via qemu-img
> > sudo qemu-img create -f qcow2 -o cluster_size=524288 base.qcow2 1G
>
> 2. Connect this qcow2 file with qemu-nbd
> > sudo qemu-nbd -c /dev/nbd0 base.qcow2 --discard=unmap
>
> 3. Use sg_unmap command to issue UNMAP command to this NBD
> > sudo sg_unmap --lba=0 --num=1 /dev/nbd0
>
> Every time I get the following error message:
>
> unmap cdb: 42 00 00 00 00 00 00 00 18 00
> unmap: pass through os error: Inappropriate ioctl for device
> UNMAP failed (use '-v' to get more information)
>
> I also try to format this nbd device with EXT4 and mount it, but still
> cannot perform fstrim on the mount point.
>
> Have I done anything wrong?

There are a lot of factors for getting unmap/discard/trim to work,
including:

 - guest tools (sg_unmap) or guest filesystem must support it
 - guest kernel must support it
 - host qemu must support it
 - host filesystem/etc must support it

My (possibly weak) understanding of the upstream qemu code is that
unmap/discard/trim is not supported in qcow2. It is only supported in
raw files when using a POSIX-like host OS which has either of:

 - block devices supporting BLKDISCARDZEROES
 - files on XFS
 - files on other filesystems that support FALLOC_FL_PUNCH_HOLE (eg ext4)

Having said that, I did some tests using libguestfs and I could not
show that unmap was working, either using raw or qcow2 (both on ext4),
with virtio-scsi, and recent kernel & qemu. I did not see any errors,
but also I don't see what I'm doing wrong.

Attached is my test script. You will need to compile libguestfs with:

  ./configure --with-extra-packages="sg3_utils"

The results on my machine are:

$ /tmp/sparsetest.sh
0 /tmp/test1
0 /tmp/test2
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=1, lbprz=0
   Last logical block address=204799 (0x31fff), Number of logical blocks=204800
   Logical block length=512 bytes
   Logical blocks per physical block exponent=0
   Lowest aligned logical block address=0
Hence:
   Device size: 104857600 bytes, 100.0 MiB, 0.10 GB
Block limits VPD page (SBC):
  Write same no zero (WSNZ): 1
  Maximum compare and write length: 0 blocks
  Optimal transfer length granularity: 0 blocks
  Maximum transfer length: 0 blocks
  Optimal transfer length: 0 blocks
  Maximum prefetch length: 0 blocks
  Maximum unmap LBA count: 0
  Maximum unmap block descriptor count: 0
  Optimal unmap granularity: 8
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0x0 blocks
16M /tmp/test1 <--- note no sparseness is created
16M /tmp/test2

Please let us know if you get this working, because I'd really like to
fix virt-sparsify so it can work in-place!

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

Attachment: sparsetest.sh (Bourne shell script)
[Qemu-devel] TRIM/DISCARD/UNMAP support on qemu-nbd
Hi folks,

I have been studying the QCOW2 file format for a couple of days, and I am
a little bit confused about whether QCOW2 supports UNMAP or not.
While searching the internet, I found some mailing list discussions
mentioning that qemu-nbd and the nbd module both support the UNMAP command.
So I followed the steps below on my machine (Ubuntu 13.10 with Linux
kernel 3.12) to test whether qemu-nbd and QCOW2 do support UNMAP.

1. Create a qcow2 file via qemu-img
> sudo qemu-img create -f qcow2 -o cluster_size=524288 base.qcow2 1G

2. Connect this qcow2 file with qemu-nbd
> sudo qemu-nbd -c /dev/nbd0 base.qcow2 --discard=unmap

3. Use the sg_unmap command to issue an UNMAP command to this NBD
> sudo sg_unmap --lba=0 --num=1 /dev/nbd0

Every time I get the following error message:

unmap cdb: 42 00 00 00 00 00 00 00 18 00
unmap: pass through os error: Inappropriate ioctl for device
UNMAP failed (use '-v' to get more information)

I also tried to format this nbd device with EXT4 and mount it, but I still
cannot perform fstrim on the mount point.

Have I done anything wrong?
Any help would be appreciated.

Thanks.

Best Regards,
Dennis