On 11/21/2012 02:44 PM, Stefan Hajnoczi wrote:
> On Wed, Nov 21, 2012 at 7:42 AM, Asias He <as...@redhat.com> wrote:
>> On 11/21/2012 01:39 PM, Asias He wrote:
>>> On 11/20/2012 08:25 PM, Stefan Hajnoczi wrote:
>>>> On Tue, Nov 20, 2012 at 1:21 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>>>>> On Tue, Nov 20, 2012 at 10:02 AM, Asias He <as...@redhat.com> wrote:
>>>>>> Hello Stefan,
>>>>>>
>>>>>> On 11/15/2012 11:18 PM, Stefan Hajnoczi wrote:
>>>>>>> This series adds the -device virtio-blk-pci,x-data-plane=on property that
>>>>>>> enables a high performance I/O codepath.  A dedicated thread is used to process
>>>>>>> virtio-blk requests outside the global mutex and without going through the QEMU
>>>>>>> block layer.
>>>>>>>
>>>>>>> Khoa Huynh <k...@us.ibm.com> reported an increase from 140,000 IOPS to 600,000
>>>>>>> IOPS for a single VM using virtio-blk-data-plane in July:
>>>>>>>
>>>>>>>   http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
>>>>>>>
>>>>>>> The virtio-blk-data-plane approach was originally presented at Linux Plumbers
>>>>>>> Conference 2010.  The following slides contain a brief overview:
>>>>>>>
>>>>>>>   http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
>>>>>>>
>>>>>>> The basic approach is:
>>>>>>> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
>>>>>>>    signalling when the guest kicks the virtqueue.
>>>>>>> 2. Requests are processed without going through the QEMU block layer using
>>>>>>>    Linux AIO directly.
>>>>>>> 3. Completion interrupts are injected via irqfd from the dedicated thread.
>>>>>>>
>>>>>>> To try it out:
>>>>>>>
>>>>>>>   qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
>>>>>>>        -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
>>>>>>
>>>>>>
>>>>>> Are these the latest dataplane bits?
>>>>>> (git://github.com/stefanha/qemu.git virtio-blk-data-plane)
>>>>>>
>>>>>> commit 7872075c24fa01c925d4f41faa9d04ce69bf5328
>>>>>> Author: Stefan Hajnoczi <stefa...@redhat.com>
>>>>>> Date:   Wed Nov 14 15:45:38 2012 +0100
>>>>>>
>>>>>>     virtio-blk: add x-data-plane=on|off performance feature
>>>>>>
>>>>>>
>>>>>> With this commit on a ramdisk based box, I am seeing about 10K IOPS with
>>>>>> x-data-plane on and 90K IOPS with x-data-plane off.
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> Command line I used:
>>>>>>
>>>>>> IMG=/dev/ram0
>>>>>> x86_64-softmmu/qemu-system-x86_64 \
>>>>>> -drive file=/root/img/sid.img,if=ide \
>>>>>> -drive file=${IMG},if=none,cache=none,aio=native,id=disk1 \
>>>>>> -device virtio-blk-pci,x-data-plane=off,drive=disk1,scsi=off \
>>>>>> -kernel $KERNEL -append "root=/dev/sdb1 console=tty0" \
>>>>>> -L /tmp/qemu-dataplane/share/qemu/ -nographic -vnc :0 -enable-kvm \
>>>>>> -m 2048 -smp 4 -cpu qemu64,+x2apic -M pc
>>>>>
>>>>> Was just about to send out the latest patch series which addresses
>>>>> review comments, so I have tested the latest code
>>>>> (61b70fef489ce51ecd18d69afb9622c110b9315c).
>>>>
>>>> Rebased onto qemu.git/master before sending out.  The commit ID is now:
>>>> cf6ed6406543ecc43895012a9ac9665e3753d5e8
>>>>
>>>> https://github.com/stefanha/qemu/commits/virtio-blk-data-plane
>>>>
>>>> Stefan
>>>
>>> Ok, thanks. /me trying
>>
>> Hi Stefan,
>>
>> If I enable the merge in guest the IOPS for seq read/write goes up to
>> ~400K/300K. If I disable the merge in guest the IOPS drops to ~17K/24K
>> for seq read/write (which is similar to the result I posted yesterday,
>> with merge disabled). Could you please also share the numbers for rand
>> read and write in your setup?
> 
> Thanks for running the test.  Please send your rand read/write fio
> jobfile so I can run the exact same test.
> 
> BTW I was running the default F18 (host) and RHEL 6.3 (guest) I/O
> schedulers in my test yesterday.

Sure, this is the script I used to run the test in guest.

# cat run.sh
#!/bin/sh

cat > all.fio <<EOF
[global]
exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
group_reporting
ioscheduler=noop
thread
bs=4k
size=512MB
direct=1
filename=/dev/vda
numjobs=16
ioengine=libaio
iodepth=64
loops=3
ramp_time=0
[seq-read]
stonewall
rw=read
[seq-write]
stonewall
rw=write
[rnd-read]
stonewall
rw=randread
[rnd-write]
stonewall
rw=randwrite
EOF

echo noop > /sys/block/vda/queue/scheduler
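# nomerges=2 disables all request merging, nomerges=0 enables it.
# As written the second write wins (merges enabled); drop one line to pick a setting.
echo 2 > /sys/block/vda/queue/nomerges
echo 0 > /sys/block/vda/queue/nomerges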
fio all.fio --output=f.log
echo
echo -------------------------------------
cat f.log|grep iops
cat f.log|grep clat |grep avg
cat f.log|grep cpu
cat f.log|grep ios |grep in_queue
cat /proc/interrupts |grep requests |grep virtio |grep 41\:
cat /proc/stat |grep ^intr | awk '{print " 41: interrupt in total: "$44}'
fio all.fio --showcmd
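
To compare the merge-enabled and merge-disabled cases in a single run, something like the following might work. This is only a sketch: the helper names extract_iops and run_both and the per-setting log file names are made up for illustration, and the sed pattern assumes the fio log reports throughput as fields of the form "iops=12345" (other fio versions may format this differently). run_both is defined but not called, since it needs root in the guest and rewrites /sys/block/vda/queue/nomerges.

```shell
#!/bin/sh

# Hypothetical helper: print the IOPS figure from every line of a fio log
# that contains a field like "iops=12345". The field format is an
# assumption; adjust the pattern if your fio prints it differently.
extract_iops() {
    sed -n 's/.*iops=\([0-9][0-9]*\).*/\1/p' "$1"
}

# Hypothetical wrapper: run all.fio once with merges enabled (nomerges=0)
# and once with all merging disabled (nomerges=2), keeping one log per
# setting. Run as root inside the guest; it is not invoked automatically.
run_both() {
    for nomerges in 0 2; do
        echo "$nomerges" > /sys/block/vda/queue/nomerges
        fio all.fio --output="f-nomerges-$nomerges.log"
        echo "nomerges=$nomerges:"
        extract_iops "f-nomerges-$nomerges.log"
    done
}
```

After sourcing this, calling run_both in the guest would print one block of IOPS values per nomerges setting, in the order the jobs appear in all.fio.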



-- 
Asias
