On 11/21/2012 02:44 PM, Stefan Hajnoczi wrote:
> On Wed, Nov 21, 2012 at 7:42 AM, Asias He <as...@redhat.com> wrote:
>> On 11/21/2012 01:39 PM, Asias He wrote:
>>> On 11/20/2012 08:25 PM, Stefan Hajnoczi wrote:
>>>> On Tue, Nov 20, 2012 at 1:21 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>>>>> On Tue, Nov 20, 2012 at 10:02 AM, Asias He <as...@redhat.com> wrote:
>>>>>> Hello Stefan,
>>>>>>
>>>>>> On 11/15/2012 11:18 PM, Stefan Hajnoczi wrote:
>>>>>>> This series adds the -device virtio-blk-pci,x-data-plane=on property
>>>>>>> that enables a high performance I/O codepath. A dedicated thread is
>>>>>>> used to process virtio-blk requests outside the global mutex and
>>>>>>> without going through the QEMU block layer.
>>>>>>>
>>>>>>> Khoa Huynh <k...@us.ibm.com> reported an increase from 140,000 IOPS
>>>>>>> to 600,000 IOPS for a single VM using virtio-blk-data-plane in July:
>>>>>>>
>>>>>>> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
>>>>>>>
>>>>>>> The virtio-blk-data-plane approach was originally presented at Linux
>>>>>>> Plumbers Conference 2010. The following slides contain a brief overview:
>>>>>>>
>>>>>>> http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
>>>>>>>
>>>>>>> The basic approach is:
>>>>>>> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
>>>>>>>    signalling when the guest kicks the virtqueue.
>>>>>>> 2. Requests are processed without going through the QEMU block layer
>>>>>>>    using Linux AIO directly.
>>>>>>> 3. Completion interrupts are injected via irqfd from the dedicated
>>>>>>>    thread.
>>>>>>>
>>>>>>> To try it out:
>>>>>>>
>>>>>>> qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
>>>>>>>      -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
>>>>>>
>>>>>> Is this the latest dataplane bits:
>>>>>> (git://github.com/stefanha/qemu.git virtio-blk-data-plane)
>>>>>>
>>>>>> commit 7872075c24fa01c925d4f41faa9d04ce69bf5328
>>>>>> Author: Stefan Hajnoczi <stefa...@redhat.com>
>>>>>> Date:   Wed Nov 14 15:45:38 2012 +0100
>>>>>>
>>>>>>     virtio-blk: add x-data-plane=on|off performance feature
>>>>>>
>>>>>> With this commit on a ramdisk based box, I am seeing about 10K IOPS
>>>>>> with x-data-plane on and 90K IOPS with x-data-plane off.
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> Command line I used:
>>>>>>
>>>>>> IMG=/dev/ram0
>>>>>> x86_64-softmmu/qemu-system-x86_64 \
>>>>>> -drive file=/root/img/sid.img,if=ide \
>>>>>> -drive file=${IMG},if=none,cache=none,aio=native,id=disk1 \
>>>>>> -device virtio-blk-pci,x-data-plane=off,drive=disk1,scsi=off \
>>>>>> -kernel $KERNEL -append "root=/dev/sdb1 console=tty0" \
>>>>>> -L /tmp/qemu-dataplane/share/qemu/ -nographic -vnc :0 -enable-kvm \
>>>>>> -m 2048 -smp 4 -cpu qemu64,+x2apic -M pc
>>>>>
>>>>> Was just about to send out the latest patch series which addresses
>>>>> review comments, so I have tested the latest code
>>>>> (61b70fef489ce51ecd18d69afb9622c110b9315c).
>>>>
>>>> Rebased onto qemu.git/master before sending out. The commit ID is now:
>>>> cf6ed6406543ecc43895012a9ac9665e3753d5e8
>>>>
>>>> https://github.com/stefanha/qemu/commits/virtio-blk-data-plane
>>>>
>>>> Stefan
>>>
>>> Ok, thanks. /me trying
>>
>> Hi Stefan,
>>
>> If I enable the merge in guest the IOPS for seq read/write goes up to
>> ~400K/300K.
>> If I disable the merge in guest the IOPS drops to ~17K/24K for seq
>> read/write (which is similar to the result I posted yesterday, with
>> merge disabled). Could you please also share the numbers for rand
>> read and write in your setup?
>
> Thanks for running the test. Please send your rand read/write fio
> jobfile so I can run the exact same test.
>
> BTW I was running the default F18 (host) and RHEL 6.3 (guest) I/O
> schedulers in my test yesterday.
Sure, this is the script I used to run the test in guest.

# cat run.sh
#!/bin/sh

cat > all.fio <<EOF
[global]
exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
group_reporting
ioscheduler=noop
thread
bs=4k
size=512MB
direct=1
filename=/dev/vda
numjobs=16
ioengine=libaio
iodepth=64
loops=3
ramp_time=0

[seq-read]
stonewall
rw=read

[seq-write]
stonewall
rw=write

[rnd-read]
stonewall
rw=randread

[rnd-write]
stonewall
rw=randwrite
EOF

echo noop > /sys/block/vda/queue/scheduler
echo 2 > /sys/block/vda/queue/nomerges
echo 0 > /sys/block/vda/queue/nomerges

fio all.fio --output=f.log

echo
echo -------------------------------------
cat f.log|grep iops
cat f.log|grep clat |grep avg
cat f.log|grep cpu
cat f.log|grep ios |grep in_queue
cat /proc/interrupts |grep requests |grep virtio |grep 41\:
cat /proc/stat |grep ^intr | awk '{print " 41: interrupt in total: "$44}'

fio all.fio --showcmd

--
Asias
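
For reference, a minimal host-side sketch of the two launches being compared
in this thread (x-data-plane=on vs. off), assembled from the command line
Asias posted above; $KERNEL and the image paths are placeholders carried over
from that command line, and the loop itself is only an illustration, not a
command anyone in the thread ran:

IMG=/dev/ram0
for dp in on off; do
    x86_64-softmmu/qemu-system-x86_64 \
        -drive file=/root/img/sid.img,if=ide \
        -drive file=${IMG},if=none,cache=none,aio=native,id=disk1 \
        -device virtio-blk-pci,x-data-plane=${dp},drive=disk1,scsi=off \
        -kernel $KERNEL -append "root=/dev/sdb1 console=tty0" \
        -L /tmp/qemu-dataplane/share/qemu/ -nographic -vnc :0 -enable-kvm \
        -m 2048 -smp 4 -cpu qemu64,+x2apic -M pc
    # qemu returns when the guest shuts down; run the guest-side fio job
    # (run.sh above) and note the IOPS before powering off each time
done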
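
Along the same lines, a small guest-side sketch of how the merge-enabled vs.
merge-disabled numbers could be collected from the same jobfile. In run.sh as
posted, the second echo into nomerges overrides the first, so that run leaves
merging enabled (nomerges=0); the loop and log file names below are
illustrative only:

# nomerges: 0 = allow request merging, 2 = disable all merging
for nm in 0 2; do
    echo noop > /sys/block/vda/queue/scheduler
    echo $nm > /sys/block/vda/queue/nomerges
    fio all.fio --output=f-nomerges-$nm.log    # all.fio as generated by run.sh
    grep iops f-nomerges-$nm.log
done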