On 11/20/2012 08:21 PM, Stefan Hajnoczi wrote:
> On Tue, Nov 20, 2012 at 10:02 AM, Asias He <as...@redhat.com> wrote:
>> Hello Stefan,
>>
>> On 11/15/2012 11:18 PM, Stefan Hajnoczi wrote:
>>> This series adds the -device virtio-blk-pci,x-data-plane=on property
>>> that enables a high performance I/O codepath. A dedicated thread is
>>> used to process virtio-blk requests outside the global mutex and
>>> without going through the QEMU block layer.
>>>
>>> Khoa Huynh <k...@us.ibm.com> reported an increase from 140,000 IOPS
>>> to 600,000 IOPS for a single VM using virtio-blk-data-plane in July:
>>>
>>> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
>>>
>>> The virtio-blk-data-plane approach was originally presented at Linux
>>> Plumbers Conference 2010. The following slides contain a brief overview:
>>>
>>> http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
>>>
>>> The basic approach is:
>>> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
>>>    signalling when the guest kicks the virtqueue.
>>> 2. Requests are processed without going through the QEMU block layer,
>>>    using Linux AIO directly.
>>> 3. Completion interrupts are injected via irqfd from the dedicated
>>>    thread.
>>>
>>> To try it out:
>>>
>>> qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
>>>      -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
>>
>> Is this the latest dataplane bits:
>> (git://github.com/stefanha/qemu.git virtio-blk-data-plane)
>>
>> commit 7872075c24fa01c925d4f41faa9d04ce69bf5328
>> Author: Stefan Hajnoczi <stefa...@redhat.com>
>> Date:   Wed Nov 14 15:45:38 2012 +0100
>>
>>     virtio-blk: add x-data-plane=on|off performance feature
>>
>> With this commit on a ramdisk based box, I am seeing about 10K IOPS with
>> x-data-plane on and 90K IOPS with x-data-plane off.
>>
>> Any ideas?
>>
>> Command line I used:
>>
>> IMG=/dev/ram0
>> x86_64-softmmu/qemu-system-x86_64 \
>>   -drive file=/root/img/sid.img,if=ide \
>>   -drive file=${IMG},if=none,cache=none,aio=native,id=disk1 \
>>   -device virtio-blk-pci,x-data-plane=off,drive=disk1,scsi=off \
>>   -kernel $KERNEL -append "root=/dev/sdb1 console=tty0" \
>>   -L /tmp/qemu-dataplane/share/qemu/ -nographic -vnc :0 -enable-kvm \
>>   -m 2048 -smp 4 -cpu qemu64,+x2apic -M pc
>
> I was just about to send out the latest patch series, which addresses the
> review comments, so I have tested the latest code
> (61b70fef489ce51ecd18d69afb9622c110b9315c).
>
> I was unable to reproduce a ramdisk performance regression on Linux
> 3.6.6-3.fc18.x86_64 with an Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz and
> 8 GB RAM.
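To make the three-step approach quoted above more concrete: each device gets a
thread whose main loop ties an ioeventfd, Linux AIO, and an irqfd together. The
fragment below is only a minimal sketch of that flow, not QEMU's actual
data-plane code; the struct layout and names are invented for illustration,
vring processing is omitted, and a single hard-coded 4 KB read stands in for
real requests (link with -laio; no main() is shown).

/*
 * Illustrative sketch only -- NOT QEMU's data-plane implementation.
 * Shows the ioeventfd -> Linux AIO -> irqfd flow of the three steps above.
 */
#include <libaio.h>
#include <stdint.h>
#include <unistd.h>

struct dataplane {
    int ioeventfd;     /* signalled by KVM when the guest kicks the virtqueue */
    int irqfd;         /* writing 1 here injects the completion interrupt */
    int disk_fd;       /* image or block device opened with O_DIRECT */
    io_context_t ctx;  /* Linux AIO context created earlier with io_setup() */
};

static void *dataplane_thread(void *opaque)
{
    struct dataplane *s = opaque;
    static char buf[4096] __attribute__((aligned(4096))); /* O_DIRECT alignment */
    struct io_event events[64];
    uint64_t val;

    for (;;) {
        /* 1. Block until the guest kicks the virtqueue (ioeventfd). */
        if (read(s->ioeventfd, &val, sizeof(val)) != sizeof(val)) {
            break;
        }

        /* 2. Pop requests from the vring (omitted here) and submit them
         *    directly with Linux AIO, bypassing the QEMU block layer. */
        struct iocb iocb;
        struct iocb *iocbs[] = { &iocb };
        io_prep_pread(&iocb, s->disk_fd, buf, sizeof(buf), 0);
        if (io_submit(s->ctx, 1, iocbs) < 1) {
            break;
        }

        /* 3. Reap completions and inject the interrupt via irqfd. */
        int n = io_getevents(s->ctx, 1, 64, events, NULL);
        if (n > 0) {
            uint64_t one = 1;
            if (write(s->irqfd, &one, sizeof(one)) != sizeof(one)) {
                break;
            }
        }
    }
    return NULL;
}

The point of the design is that the whole request lifecycle, from guest kick to
completion interrupt, stays on this one thread, so the I/O path never takes the
QEMU global mutex or enters the QEMU block layer.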
I am using the latest upstream kernel.

> The ramdisk is 4 GB and I used your QEMU command-line with a RHEL 6.3 guest.
>
> Summary results:
>  x-data-plane-on:  iops=132856 aggrb=1039.1MB/s
>  x-data-plane-off: iops=126236 aggrb=988.40MB/s
>
> virtio-blk-data-plane is ~5% faster in this benchmark.
>
> fio jobfile:
> [global]
> filename=/dev/vda
> blocksize=8k
> ioengine=libaio
> direct=1
> iodepth=8
> runtime=120
> time_based=1
>
> [reads]
> readwrite=randread
> numjobs=4
>
> Perf top (data-plane-on):
>  3.71%  [kvm]               [k] kvm_arch_vcpu_ioctl_run
>  3.27%  [kernel]            [k] memset                      <--- ramdisk
>  2.98%  [kernel]            [k] do_blockdev_direct_IO
>  2.82%  [kvm_intel]         [k] vmx_vcpu_run
>  2.66%  [kernel]            [k] _raw_spin_lock_irqsave
>  2.06%  [kernel]            [k] put_compound_page
>  2.06%  [kernel]            [k] __get_page_tail
>  1.83%  [i915]              [k] __gen6_gt_force_wake_mt_get
>  1.75%  [kernel]            [k] _raw_spin_unlock_irqrestore
>  1.33%  qemu-system-x86_64  [.] vring_pop                   <--- virtio-blk-data-plane
>  1.19%  [kernel]            [k] compound_unlock_irqrestore
>  1.13%  [kernel]            [k] gup_huge_pmd
>  1.11%  [kernel]            [k] __audit_syscall_exit
>  1.07%  [kernel]            [k] put_page_testzero
>  1.01%  [kernel]            [k] fget
>  1.01%  [kernel]            [k] do_io_submit
>
> Since the ramdisk (memset and page-related functions) is so prominent
> in perf top, I also tried a 1-job 8k dd sequential write test on a
> Samsung 830 Series SSD where virtio-blk-data-plane was 9% faster than
> virtio-blk. Optimizing against ramdisk isn't a good idea IMO because
> it acts very differently from real hardware where the driver relies on
> mmio, DMA, and interrupts (vs synchronous memcpy/memset).

For the memset in the ramdisk, you can simply patch drivers/block/brd.c to do
a nop instead of the memset for testing. Yes, if you have a fast SSD (sometimes
you need several, which I do not have), it makes more sense to test on real
hardware. However, a ramdisk test is still useful: it gives rough performance
numbers, and if A and B are both tested against the same ramdisk, the
difference between A and B is still meaningful.
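Such a throwaway hack might look roughly like the fragment below. This is only
a sketch against a 3.6-era brd.c, not a tested patch: one of the memset() call
sites is the zero-fill in copy_from_brd() for sectors that have never been
written, and turning it into a nop means reads of untouched ramdisk sectors
return stale data, so it is usable for benchmarking only.

	/* Sketch of the brd.c test hack (untested, illustration only).
	 * In copy_from_brd(), reads of sectors without a backing page are
	 * zero-filled with memset(); nop'ing that out removes the memset
	 * from the profile at the cost of returning garbage data.
	 */
	page = brd_lookup_page(brd, sector);
	if (page) {
		src = kmap_atomic(page);
		memcpy(dst, src + offset, copy);
		kunmap_atomic(src);
	} else {
		/* memset(dst, 0, copy); */	/* nop for benchmarking */
	}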
> Full results:
> $ cat data-plane-off
> reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
> ...
> reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
> fio 1.57
> Starting 4 processes
>
> reads: (groupid=0, jobs=1): err= 0: pid=1851
>   read : io=29408MB, bw=250945KB/s, iops=31368 , runt=120001msec
>     slat (usec): min=2 , max=27829 , avg=11.06, stdev=78.05
>     clat (usec): min=1 , max=28028 , avg=241.41, stdev=388.47
>      lat (usec): min=33 , max=28035 , avg=253.17, stdev=396.66
>     bw (KB/s) : min=197141, max=335365, per=24.78%, avg=250797.02, stdev=29376.35
>   cpu : usr=6.55%, sys=31.34%, ctx=310932, majf=0, minf=41
>   IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w/d: total=3764202/0/0, short=0/0/0
>      lat (usec): 2=0.01%, 4=0.01%, 20=0.01%, 50=1.78%, 100=27.11%
>      lat (usec): 250=38.97%, 500=27.11%, 750=2.09%, 1000=0.71%
>      lat (msec): 2=1.32%, 4=0.70%, 10=0.20%, 20=0.01%, 50=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1852
>   read : io=29742MB, bw=253798KB/s, iops=31724 , runt=120001msec
>     slat (usec): min=2 , max=17007 , avg=10.61, stdev=67.51
>     clat (usec): min=1 , max=41531 , avg=239.00, stdev=379.03
>      lat (usec): min=32 , max=41547 , avg=250.33, stdev=385.21
>     bw (KB/s) : min=194336, max=347497, per=25.02%, avg=253204.25, stdev=31172.37
>   cpu : usr=6.66%, sys=32.58%, ctx=327250, majf=0, minf=41
>   IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w/d: total=3806999/0/0, short=0/0/0
>      lat (usec): 2=0.01%, 20=0.01%, 50=1.54%, 100=26.45%, 250=40.04%
>      lat (usec): 500=27.15%, 750=1.95%, 1000=0.71%
>      lat (msec): 2=1.29%, 4=0.68%, 10=0.18%, 20=0.01%, 50=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1853
>   read : io=29859MB, bw=254797KB/s, iops=31849 , runt=120001msec
>     slat (usec): min=2 , max=16821 , avg=11.35, stdev=76.54
>     clat (usec): min=1 , max=17659 , avg=237.25, stdev=375.31
>      lat (usec): min=31 , max=17673 , avg=249.27, stdev=383.62
>     bw (KB/s) : min=194864, max=345280, per=25.15%, avg=254534.63, stdev=30549.32
>   cpu : usr=6.52%, sys=31.84%, ctx=303763, majf=0, minf=39
>   IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w/d: total=3821989/0/0, short=0/0/0
>      lat (usec): 2=0.01%, 10=0.01%, 20=0.01%, 50=2.09%, 100=29.19%
>      lat (usec): 250=37.31%, 500=26.41%, 750=2.08%, 1000=0.71%
>      lat (msec): 2=1.32%, 4=0.70%, 10=0.20%, 20=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1854
>   read : io=29598MB, bw=252565KB/s, iops=31570 , runt=120001msec
>     slat (usec): min=2 , max=26413 , avg=11.21, stdev=78.32
>     clat (usec): min=16 , max=27993 , avg=239.56, stdev=381.67
>      lat (usec): min=34 , max=28006 , avg=251.49, stdev=390.13
>     bw (KB/s) : min=194256, max=369424, per=24.94%, avg=252462.86, stdev=29420.58
>   cpu : usr=6.57%, sys=31.33%, ctx=305623, majf=0, minf=41
>   IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w/d: total=3788507/0/0, short=0/0/0
>      lat (usec): 20=0.01%, 50=2.13%, 100=28.30%, 250=37.74%, 500=26.66%
>      lat (usec): 750=2.17%, 1000=0.75%
>      lat (msec): 2=1.35%, 4=0.70%, 10=0.19%, 20=0.01%, 50=0.01%
>
> Run status group 0 (all jobs):
>    READ: io=118607MB, aggrb=988.40MB/s, minb=256967KB/s, maxb=260912KB/s,
>          mint=120001msec, maxt=120001msec
>
> Disk stats (read/write):
>   vda: ios=15148328/0, merge=0/0, ticks=1550570/0, in_queue=1536232, util=96.56%
>
> $ cat data-plane-on
> reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
> ...
> reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
> fio 1.57
> Starting 4 processes
>
> reads: (groupid=0, jobs=1): err= 0: pid=1796
>   read : io=32081MB, bw=273759KB/s, iops=34219 , runt=120001msec
>     slat (usec): min=1 , max=20404 , avg=21.08, stdev=125.49
>     clat (usec): min=10 , max=135743 , avg=207.62, stdev=532.90
>      lat (usec): min=21 , max=136055 , avg=229.60, stdev=556.82
>     bw (KB/s) : min=56480, max=951952, per=25.49%, avg=271488.81, stdev=149773.57
>   cpu : usr=7.01%, sys=43.26%, ctx=336854, majf=0, minf=41
>   IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w/d: total=4106413/0/0, short=0/0/0
>      lat (usec): 20=0.01%, 50=2.46%, 100=61.13%, 250=21.58%, 500=3.11%
>      lat (usec): 750=3.04%, 1000=3.88%
>      lat (msec): 2=4.50%, 4=0.13%, 10=0.11%, 20=0.06%, 50=0.01%
>      lat (msec): 250=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1797
>   read : io=30104MB, bw=256888KB/s, iops=32110 , runt=120001msec
>     slat (usec): min=1 , max=17595 , avg=22.20, stdev=120.29
>     clat (usec): min=13 , max=136264 , avg=221.21, stdev=528.19
>      lat (usec): min=22 , max=136280 , avg=244.35, stdev=551.73
>     bw (KB/s) : min=57312, max=838880, per=23.93%, avg=254798.51, stdev=139546.57
>   cpu : usr=6.82%, sys=41.87%, ctx=360348, majf=0, minf=41
>   IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w/d: total=3853351/0/0, short=0/0/0
>      lat (usec): 20=0.01%, 50=2.10%, 100=58.47%, 250=22.38%, 500=3.68%
>      lat (usec): 750=3.69%, 1000=4.52%
>      lat (msec): 2=4.87%, 4=0.14%, 10=0.11%, 20=0.05%, 250=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1798
>   read : io=31698MB, bw=270487KB/s, iops=33810 , runt=120001msec
>     slat (usec): min=1 , max=17457 , avg=20.93, stdev=125.33
>     clat (usec): min=16 , max=134663 , avg=210.19, stdev=535.77
>      lat (usec): min=21 , max=134671 , avg=232.02, stdev=559.27
>     bw (KB/s) : min=57248, max=841952, per=25.29%, avg=269330.21, stdev=148661.08
>   cpu : usr=6.92%, sys=42.81%, ctx=337799, majf=0, minf=39
>   IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w/d: total=4057340/0/0, short=0/0/0
>      lat (usec): 20=0.01%, 50=1.98%, 100=62.00%, 250=20.70%, 500=3.22%
>      lat (usec): 750=3.23%, 1000=4.16%
>      lat (msec): 2=4.41%, 4=0.13%, 10=0.10%, 20=0.06%, 250=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1799
>   read : io=30913MB, bw=263789KB/s, iops=32973 , runt=120000msec
>     slat (usec): min=1 , max=17565 , avg=21.52, stdev=120.17
>     clat (usec): min=15 , max=136064 , avg=215.53, stdev=529.56
>      lat (usec): min=27 , max=136070 , avg=237.99, stdev=552.50
>     bw (KB/s) : min=57632, max=900896, per=24.74%, avg=263431.57, stdev=148379.15
>   cpu : usr=6.90%, sys=42.56%, ctx=348217, majf=0, minf=41
>   IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued r/w/d: total=3956830/0/0, short=0/0/0
>      lat (usec): 20=0.01%, 50=1.76%, 100=59.96%, 250=22.21%, 500=3.45%
>      lat (usec): 750=3.35%, 1000=4.33%
>      lat (msec): 2=4.65%, 4=0.13%, 10=0.11%, 20=0.05%, 250=0.01%
>
> Run status group 0 (all jobs):
>    READ: io=124796MB, aggrb=1039.1MB/s, minb=263053KB/s, maxb=280328KB/s,
>          mint=120000msec, maxt=120001msec
>
> Disk stats (read/write):
>   vda: ios=15942789/0, merge=0/0, ticks=336240/0, in_queue=317832, util=97.47%

--
Asias