Hi,

I have run all IO patterns on another host.
Notes:
q1 means fio iodepth = 1
j1 means fio numjobs = 1

vCPU = 4, vMEM = 2GiB, fio using direct I/O.
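
For reference, one job from the table below could look roughly like this. This is a sketch only: the actual fio job files were not posted, so the target device (/dev/vdb), ioengine, and runtime are assumptions.

    # Hypothetical reconstruction of the 4k-randread-q128-j1 job.
    # Only bs, iodepth, numjobs, and direct I/O are stated above;
    # everything else here is an assumption.
    fio --name=4k-randread-q128-j1 \
        --filename=/dev/vdb \
        --ioengine=libaio --direct=1 \
        --rw=randread --bs=4k \
        --iodepth=128 --numjobs=1 \
        --runtime=60 --time_based --group_reporting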

Most jobs perform better with none than with mq-deadline, and some perform
worse.


pattern                 | iops (mq-deadline) | iops (none) | diff
:-                      | -:                 | -:          | -:
4k-randread-q1-j1       | 12325              | 13356       | 8.37%
256k-randread-q1-j1     | 1865               | 1883        | 0.97%
4k-randread-q128-j1     | 204739             | 319066      | 55.84%
256k-randread-q128-j1   | 24257              | 22851       | -5.80%
4k-randwrite-q1-j1      | 9923               | 10163       | 2.42%
256k-randwrite-q1-j1    | 2762               | 2833        | 2.57%
4k-randwrite-q128-j1    | 137400             | 152081      | 10.68%
256k-randwrite-q128-j1  | 9353               | 9233        | -1.28%
4k-read-q1-j1           | 21499              | 22223       | 3.37%
256k-read-q1-j1         | 1919               | 1951        | 1.67%
4k-read-q128-j1         | 158806             | 345269      | 117.42%
256k-read-q128-j1       | 18918              | 23710       | 25.33%
4k-write-q1-j1          | 10120              | 10262       | 1.40%
256k-write-q1-j1        | 2779               | 2744        | -1.26%
4k-write-q128-j1        | 47576              | 209236      | 339.79%
256k-write-q128-j1      | 9199               | 9337        | 1.50%
4k-randread-q1-j2       | 24238              | 25478       | 5.12%
256k-randread-q1-j2     | 3656               | 3649        | -0.19%
4k-randread-q128-j2     | 390090             | 577300      | 47.99%
256k-randread-q128-j2   | 21992              | 23437       | 6.57%
4k-randwrite-q1-j2      | 17096              | 18112       | 5.94%
256k-randwrite-q1-j2    | 5188               | 4914        | -5.28%
4k-randwrite-q128-j2    | 143373             | 140560      | -1.96%
256k-randwrite-q128-j2  | 9423               | 9314        | -1.16%
4k-read-q1-j2           | 36890              | 31768       | -13.88%
256k-read-q1-j2         | 3708               | 4028        | 8.63%
4k-read-q128-j2         | 399500             | 409857      | 2.59%
256k-read-q128-j2       | 19360              | 21467       | 10.88%
4k-write-q1-j2          | 17786              | 18519       | 4.12%
256k-write-q1-j2        | 4756               | 5035        | 5.87%
4k-write-q128-j2        | 175756             | 159109      | -9.47%
256k-write-q128-j2      | 9292               | 9293        | 0.01%
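
For anyone reproducing these numbers: the scheduler is switched per device
through sysfs. A minimal sketch, assuming the guest disk is /dev/vda:

    # list available schedulers; the active one is shown in brackets
    cat /sys/block/vda/queue/scheduler
    # switch between the two configurations under test
    echo mq-deadline > /sys/block/vda/queue/scheduler
    echo none > /sys/block/vda/queue/scheduler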

> On Dec 8, 2023, at 11:54, Ming Lei <[email protected]> wrote:
> 
> On Thu, Dec 07, 2023 at 07:44:37PM -0700, Keith Busch wrote:
>> On Fri, Dec 08, 2023 at 10:00:36AM +0800, Ming Lei wrote:
>>> On Thu, Dec 07, 2023 at 12:31:05PM +0800, Li Feng wrote:
>>>> virtio-blk is generally used in cloud computing scenarios, where the
>>>> performance of virtual disks is very important. The mq-deadline scheduler
>>>> has a big performance drop compared to none with a single queue. In my tests,
>>>> mq-deadline 4k randread iops were 270k compared to 450k for none. So here
>>>> the default scheduler of virtio-blk is set to "none".
>>> 
>>> The test result shows you may not have tested HDD backing of virtio-blk.
>>> 
>>> none can lose IO merge capability more or less, so sequential IO performance
>>> probably drops in the case of an HDD backing.
>> 
>> More of a curiosity, as I don't immediately even have an HDD to test
>> with! Isn't it more useful for the host providing the backing HDD to use an
>> appropriate IO scheduler? virtio-blk has similarities with a stacking
>> block driver, and we usually don't need to stack IO schedulers.
> 
> dm-rq actually uses an IO scheduler at a higher layer, and early merge has
> some benefits:
> 
> 1) virtio-blk inflight requests are reduced, so there is less chance of
> throttling inside the VM; meanwhile fewer (bigger) IOs are handled by QEMU
> and submitted to the host side queue.
> 
> 2) early merge in the VM is cheaper than on the host side, since there can be
> more block IOs originated from different virtio-blk/scsi devices at the same
> time, and all images can be stored on a single disk; these IOs then become
> interleaved in the host side queue, so sequential IO may become random or
> hard to merge.
> 
> As Jens mentioned, it needs actual testing.
> 
> 
> Thanks,
> Ming
> 
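
(On the merge point above: one way to check whether early merging actually
happens in the guest is to watch iostat's merge counters while a sequential
fio job runs; nonzero rrqm/s / wrqm/s means requests are being merged before
dispatch. The device name vda is again an assumption.)

    # inside the guest, while a sequential fio job is running
    iostat -x 1 vda
    # the rrqm/s and wrqm/s columns count read/write requests
    # merged per second in the guest block layer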

