On Wed, Oct 17, 2018 at 02:22:23PM -0700, Mike Larkin wrote:
> On Fri, Oct 05, 2018 at 12:07:13PM +0200, Sergio Lopez wrote:
> > Hi,
> > 
> > This patch implements a mechanism to allow users register an I/O port
> > with a special file descriptor (kickfd) which can be monitored for
> > events using kevent. The kernel will note an event each time the Guest
> > writes to an I/O port registered with a kickfd.
> 
> > 
> > This is mainly intended for kicking virtio virtqueues, and allows the
> > VCPU to return immediately to the Guest, instead of having to exit all
> > the way to vmd to process the request. With kickfd, the actual work
> > (i.e. an I/O operation) requested by the Guest can be done in a
> > parallel thread that eventually asserts the corresponding IRQ.
> 
> The reason our virtio is so slow today is, as you point out, we process
> the I/O sequentially with the queue notify back in vmd. To this end, there
> are three "performance overhead components" involved:
> 
> 1. the amount of time it takes to return from the vmm run ioctl back to
>    vmd
> 2. the time it takes to process the events in the queue
> 3. the time it takes to re-enter the vm back to the point after the OUT
>    instruction that did the queue notify.
> 
> I've benchmarked this before; the biggest performance penalty here is in
> #2. The exit and re-entry overheads are miniscule compared to the overhead
> of the I/O itself.

I think that, in addition to the absolute number of cycles spent on each
approach, we should also consider where are being spent.

The kickfd approach is probably more expensive (ignoring TLB flushes,
which may end making it cheaper) than having the vCPU do the actual I/O,
but it allows the latter to return much earlier to the Guest, resulting
in a significantly lower submission latency and more CPU time available
for the VM.

> Optimizing #2 is therefore the most bang for the buck. We (dlg@ and I)
> had a taskq-based diff to do this previously (I'm not sure what state it
> was left in). In that diff, the I/O was processed in a set of worker threads,
> effectively eliminating the overhead for #2.

I had to partially rewrite the vioblk_notifyq function, as it didn't
play nice with concurrent access (Guest + iothread), but I haven't
implemented any optimization for #2.

There's the option of merging requests and using working threads to
avoid having the iothread actively waiting for the I/O, but this
would significantly increase the complexity of vmd's virtio code, so
must be done with care (it's the problem behind *many* qemu issues).

> Your diff (if I read it correctly), optimizes all 3 cases, and that's good.
> I'll admit it's not a way I thought of solving the performance issue, and
> I'm not sure if this is the right way or if the taskq way is better. The
> taskq way is a bit cleaner and easier to understand (there is no separate
> backchannel from vmm->vmd, but it doesn't look too crazy from what I
> read below).
> 
> Do you have some performance measurements with this diff applied vs without?

Yes, I ran some random read and random write tests with fio. The Guest
is Linux, mainly because its virtio implementation is more mature. Next
week I'll try with OpenBSD.

In a general sense, the kickfd approach is similar or slightly better
than the current one when generating I/O from just one process, and
significantly better (up to 2x the throughput) with 2 and 4 processes.

The test machine is an AMD Phenom II X4 (yeah, not the best thing for
this kind of tests, but the only non-virtual environment I have here)
with a Samsung SSD Evo 850. OpenBSD's kernel is running with HZ=1000 and
Linux's with HZ=100 (otherwise I couldn't get reliable numbers due to
missing ticks). The target device is backed by a raw image.

I'm pasting the raw output for fio below.

Sergio (slp).


==================
| Linux + kickfd |
==================

 - randread

[root@localhost ~]# fio --clocksource=clock_gettime --name=randread 
--rw=randread --bs=4k --filename=/dev/vdb --size=1g --numjobs=1 --ioengine=sync 
--iodepth=1 --direct=1 --group_reporting 
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
fio-3.3
Starting 1 process
Jobs: 1 (f=1): [r(1)][95.7%][r=49.3MiB/s,w=0KiB/s][r=12.6k,w=0 IOPS][eta 
00m:01s]
randread: (groupid=0, jobs=1): err= 0: pid=318: Fri Oct 19 18:36:02 2018
   read: IOPS=11.5k, BW=44.8MiB/s (46.0MB/s)(1024MiB/22850msec)
    clat (nsec): min=0, max=20000k, avg=85372.92, stdev=924160.90
     lat (nsec): min=0, max=20000k, avg=85525.51, stdev=924971.98
    clat percentiles (nsec):
     |  1.00th=[       0],  5.00th=[       0], 10.00th=[       0],
     | 20.00th=[       0], 30.00th=[       0], 40.00th=[       0],
     | 50.00th=[       0], 60.00th=[       0], 70.00th=[       0],
     | 80.00th=[       0], 90.00th=[       0], 95.00th=[       0],
     | 99.00th=[       0], 99.50th=[10027008], 99.90th=[10027008],
     | 99.95th=[10027008], 99.99th=[10027008]
   bw (  KiB/s): min=39992, max=54880, per=99.79%, avg=45794.67, stdev=3242.81, 
samples=45
   iops        : min= 9998, max=13720, avg=11448.67, stdev=810.70, samples=45
  lat (nsec)   : 2=99.15%
  lat (msec)   : 20=0.85%, 50=0.01%
  cpu          : usr=2.94%, sys=6.35%, ctx=262317, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=44.8MiB/s (46.0MB/s), 44.8MiB/s-44.8MiB/s (46.0MB/s-46.0MB/s), 
io=1024MiB (1074MB), run=22850-22850msec

Disk stats (read/write):
  vdb: ios=259648/0, merge=0/0, ticks=15960/0, in_queue=15920, util=69.98%
[root@localhost ~]# fio --clocksource=clock_gettime --name=randread 
--rw=randread --bs=4k --filename=/dev/vdb --size=1g --numjobs=2 --ioengine=sync 
--iodepth=1 --direct=1 --group_reporting 
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
...
fio-3.3
Starting 2 processes
Jobs: 2 (f=2): [r(2)][100.0%][r=87.2MiB/s,w=0KiB/s][r=22.3k,w=0 IOPS][eta 
00m:00s]
randread: (groupid=0, jobs=2): err= 0: pid=321: Fri Oct 19 18:36:33 2018
   read: IOPS=19.8k, BW=77.4MiB/s (81.1MB/s)(2048MiB/26470msec)
    clat (nsec): min=0, max=20000k, avg=86078.64, stdev=923991.34
     lat (nsec): min=0, max=20000k, avg=89321.14, stdev=941071.87
    clat percentiles (nsec):
     |  1.00th=[       0],  5.00th=[       0], 10.00th=[       0],
     | 20.00th=[       0], 30.00th=[       0], 40.00th=[       0],
     | 50.00th=[       0], 60.00th=[       0], 70.00th=[       0],
     | 80.00th=[       0], 90.00th=[       0], 95.00th=[       0],
     | 99.00th=[       0], 99.50th=[10027008], 99.90th=[10027008],
     | 99.95th=[10027008], 99.99th=[10027008]
   bw (  KiB/s): min=29968, max=51960, per=49.93%, avg=39555.69, stdev=4891.27, 
samples=104
   iops        : min= 7492, max=12990, avg=9888.92, stdev=1222.82, samples=104
  lat (nsec)   : 2=99.14%
  lat (msec)   : 20=0.86%, 50=0.01%
  cpu          : usr=6.64%, sys=20.04%, ctx=651861, majf=0, minf=18
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=524288,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=77.4MiB/s (81.1MB/s), 77.4MiB/s-77.4MiB/s (81.1MB/s-81.1MB/s), 
io=2048MiB (2147MB), run=26470-26470msec

Disk stats (read/write):
  vdb: ios=522878/0, merge=0/0, ticks=22900/0, in_queue=20600, util=57.40%
[root@localhost ~]# fio --clocksource=clock_gettime --name=randread 
--rw=randread --bs=4k --filename=/dev/vdb --size=1g --numjobs=4 --ioengine=sync 
--iodepth=1 --direct=1 --group_reporting 
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
...
fio-3.3
Starting 4 processes
Jobs: 3 (f=3): [r(1),_(1),r(2)][100.0%][r=98.1MiB/s,w=0KiB/s][r=25.1k,w=0 
IOPS][eta 00m:00s]
randread: (groupid=0, jobs=4): err= 0: pid=325: Fri Oct 19 18:37:27 2018
   read: IOPS=21.8k, BW=85.2MiB/s (89.4MB/s)(4096MiB/48060msec)
    clat (nsec): min=0, max=90000k, avg=139427.19, stdev=1469205.06
     lat (nsec): min=0, max=90000k, avg=147285.46, stdev=1541501.85
    clat percentiles (nsec):
     |  1.00th=[       0],  5.00th=[       0], 10.00th=[       0],
     | 20.00th=[       0], 30.00th=[       0], 40.00th=[       0],
     | 50.00th=[       0], 60.00th=[       0], 70.00th=[       0],
     | 80.00th=[       0], 90.00th=[       0], 95.00th=[       0],
     | 99.00th=[10027008], 99.50th=[10027008], 99.90th=[20054016],
     | 99.95th=[30015488], 99.99th=[40108032]
   bw (  KiB/s): min=12472, max=33808, per=25.05%, avg=21861.49, stdev=4136.28, 
samples=382
   iops        : min= 3118, max= 8452, avg=5465.37, stdev=1034.07, samples=382
  lat (nsec)   : 2=98.85%
  lat (msec)   : 20=0.99%, 50=0.14%, 100=0.01%
  cpu          : usr=5.89%, sys=13.63%, ctx=1145850, majf=0, minf=32
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1048576,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=85.2MiB/s (89.4MB/s), 85.2MiB/s-85.2MiB/s (89.4MB/s-89.4MB/s), 
io=4096MiB (4295MB), run=48060-48060msec

Disk stats (read/write):
  vdb: ios=1045423/0, merge=0/0, ticks=38510/0, in_queue=33730, util=55.81%


 - randwrite

[root@localhost ~]# fio --clocksource=clock_gettime --name=randwrite 
--rw=randwrite --bs=4k --filename=/dev/vdb --size=1g --numjobs=1 
--ioengine=sync --iodepth=1 --direct=1 --group_reporting 
randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
fio-3.3
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=48.4MiB/s][r=0,w=12.4k IOPS][eta 
00m:00s]
randwrite: (groupid=0, jobs=1): err= 0: pid=333: Fri Oct 19 18:38:58 2018
  write: IOPS=8711, BW=34.0MiB/s (35.7MB/s)(1024MiB/30090msec)
    clat (nsec): min=0, max=480000k, avg=113029.48, stdev=1545011.16
     lat (nsec): min=0, max=480000k, avg=113296.51, stdev=1545855.54
    clat percentiles (nsec):
     |  1.00th=[       0],  5.00th=[       0], 10.00th=[       0],
     | 20.00th=[       0], 30.00th=[       0], 40.00th=[       0],
     | 50.00th=[       0], 60.00th=[       0], 70.00th=[       0],
     | 80.00th=[       0], 90.00th=[       0], 95.00th=[       0],
     | 99.00th=[10027008], 99.50th=[10027008], 99.90th=[10027008],
     | 99.95th=[10027008], 99.99th=[10027008]
   bw (  KiB/s): min=12168, max=62920, per=99.83%, avg=34786.27, 
stdev=10151.04, samples=60
   iops        : min= 3042, max=15730, avg=8696.57, stdev=2537.76, samples=60
  lat (nsec)   : 2=98.93%
  lat (msec)   : 20=1.07%, 50=0.01%, 100=0.01%, 250=0.01%, 500=0.01%
  cpu          : usr=2.83%, sys=5.57%, ctx=262341, majf=0, minf=7
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,262144,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=34.0MiB/s (35.7MB/s), 34.0MiB/s-34.0MiB/s (35.7MB/s-35.7MB/s), 
io=1024MiB (1074MB), run=30090-30090msec

Disk stats (read/write):
  vdb: ios=0/259681, merge=0/0, ticks=0/22530, in_queue=22510, util=75.06%
[root@localhost ~]# fio --clocksource=clock_gettime --name=randwrite 
--rw=randwrite --bs=4k --filename=/dev/vdb --size=1g --numjobs=2 
--ioengine=sync --iodepth=1 --direct=1 --group_reporting 
randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
...
fio-3.3
Starting 2 processes
Jobs: 2 (f=2): [w(2)][100.0%][r=0KiB/s,w=55.7MiB/s][r=0,w=14.3k IOPS][eta 
00m:00s]
randwrite: (groupid=0, jobs=2): err= 0: pid=349: Fri Oct 19 18:53:20 2018
  write: IOPS=8701, BW=33.0MiB/s (35.6MB/s)(2048MiB/60250msec)
    clat (nsec): min=0, max=510000k, avg=227813.72, stdev=2009344.19
     lat (nsec): min=0, max=510000k, avg=228118.90, stdev=2010068.83
    clat percentiles (nsec):
     |  1.00th=[       0],  5.00th=[       0], 10.00th=[       0],
     | 20.00th=[       0], 30.00th=[       0], 40.00th=[       0],
     | 50.00th=[       0], 60.00th=[       0], 70.00th=[       0],
     | 80.00th=[       0], 90.00th=[       0], 95.00th=[       0],
     | 99.00th=[10027008], 99.50th=[10027008], 99.90th=[10027008],
     | 99.95th=[20054016], 99.99th=[20054016]
   bw (  KiB/s): min= 1192, max=37008, per=49.79%, avg=17331.20, stdev=5263.35, 
samples=240
   iops        : min=  298, max= 9252, avg=4332.80, stdev=1315.84, samples=240
  lat (nsec)   : 2=97.86%
  lat (msec)   : 20=2.05%, 50=0.08%, 100=0.01%, 250=0.01%, 500=0.01%
  lat (msec)   : 750=0.01%
  cpu          : usr=1.10%, sys=5.88%, ctx=534510, majf=0, minf=17
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,524288,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=33.0MiB/s (35.6MB/s), 33.0MiB/s-33.0MiB/s (35.6MB/s-35.6MB/s), 
io=2048MiB (2147MB), run=60250-60250msec

Disk stats (read/write):
  vdb: ios=42/522494, merge=0/0, ticks=0/76720, in_queue=74240, util=67.04%
[root@localhost ~]# fio --clocksource=clock_gettime --name=randwrite 
--rw=randwrite --bs=4k --filename=/dev/vdb --size=1g --numjobs=4 
--ioengine=sync --iodepth=1 --direct=1 --group_reporting 
randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
...
fio-3.3
Starting 4 processes
Jobs: 2 (f=2): [_(1),w(1),_(1),w(1)][100.0%][r=0KiB/s,w=50.0MiB/s][r=0,w=13.1k 
IOPS][eta 00m:00s]
randwrite: (groupid=0, jobs=4): err= 0: pid=356: Fri Oct 19 19:01:49 2018
  write: IOPS=8327, BW=32.5MiB/s (34.1MB/s)(4096MiB/125920msec)
    clat (nsec): min=0, max=480000k, avg=474901.20, stdev=2864013.52
     lat (nsec): min=0, max=480000k, avg=475769.04, stdev=2865517.37
    clat percentiles (nsec):
     |  1.00th=[       0],  5.00th=[       0], 10.00th=[       0],
     | 20.00th=[       0], 30.00th=[       0], 40.00th=[       0],
     | 50.00th=[       0], 60.00th=[       0], 70.00th=[       0],
     | 80.00th=[       0], 90.00th=[       0], 95.00th=[       0],
     | 99.00th=[10027008], 99.50th=[10027008], 99.90th=[20054016],
     | 99.95th=[20054016], 99.99th=[50069504]
   bw (  KiB/s): min=  304, max=21072, per=24.96%, avg=8314.76, stdev=1952.91, 
samples=1001
   iops        : min=   76, max= 5268, avg=2078.69, stdev=488.23, samples=1001
  lat (nsec)   : 2=95.74%
  lat (msec)   : 20=3.93%, 50=0.32%, 100=0.01%, 250=0.01%, 500=0.01%
  cpu          : usr=1.21%, sys=3.35%, ctx=1065530, majf=0, minf=37
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1048576,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=32.5MiB/s (34.1MB/s), 32.5MiB/s-32.5MiB/s (34.1MB/s-34.1MB/s), 
io=4096MiB (4295MB), run=125920-125920msec

Disk stats (read/write):
  vdb: ios=192/1048135, merge=0/0, ticks=0/228530, in_queue=224790, util=52.58%


=====================
| Linux + NO kickfd |
=====================

 - randread

[root@localhost ~]# fio --clocksource=clock_gettime --name=randread 
--rw=randread --bs=4k --filename=/dev/vdb --size=1g --numjobs=1 --ioengine=sync 
--iodepth=1 --direct=1 --group_reporting 
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
fio-3.3
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=52.7MiB/s,w=0KiB/s][r=13.5k,w=0 IOPS][eta 
00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=317: Fri Oct 19 19:09:34 2018
   read: IOPS=10.2k, BW=40.0MiB/s (41.0MB/s)(1024MiB/25590msec)
    clat (nsec): min=0, max=30000k, avg=95672.61, stdev=974610.32
     lat (nsec): min=0, max=30000k, avg=95939.64, stdev=975953.07
    clat percentiles (nsec):
     |  1.00th=[       0],  5.00th=[       0], 10.00th=[       0],
     | 20.00th=[       0], 30.00th=[       0], 40.00th=[       0],
     | 50.00th=[       0], 60.00th=[       0], 70.00th=[       0],
     | 80.00th=[       0], 90.00th=[       0], 95.00th=[       0],
     | 99.00th=[       0], 99.50th=[10027008], 99.90th=[10027008],
     | 99.95th=[10027008], 99.99th=[10027008]
   bw (  KiB/s): min=32944, max=55792, per=99.84%, avg=40909.80, stdev=7471.07, 
samples=51
   iops        : min= 8236, max=13948, avg=10227.45, stdev=1867.77, samples=51
  lat (nsec)   : 2=99.04%
  lat (msec)   : 20=0.96%, 50=0.01%
  cpu          : usr=2.14%, sys=7.66%, ctx=262334, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=40.0MiB/s (41.0MB/s), 40.0MiB/s-40.0MiB/s (41.0MB/s-41.0MB/s), 
io=1024MiB (1074MB), run=25590-25590msec

Disk stats (read/write):
  vdb: ios=259365/0, merge=0/0, ticks=18070/0, in_queue=17990, util=70.55%
[root@localhost ~]# fio --clocksource=clock_gettime --name=randread 
--rw=randread --bs=4k --filename=/dev/vdb --size=1g --numjobs=2 --ioengine=sync 
--iodepth=1 --direct=1 --group_reporting 
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
...
fio-3.3
Starting 2 processes
Jobs: 1 (f=1): [_(1),r(1)][100.0%][r=49.8MiB/s,w=0KiB/s][r=12.7k,w=0 IOPS][eta 
00m:00s]
randread: (groupid=0, jobs=2): err= 0: pid=320: Fri Oct 19 19:10:45 2018
   read: IOPS=10.3k, BW=40.2MiB/s (42.1MB/s)(2048MiB/50960msec)
    clat (nsec): min=0, max=420000k, avg=170440.67, stdev=3269072.72
     lat (nsec): min=0, max=420000k, avg=171890.26, stdev=3297000.00
    clat percentiles (nsec):
     |  1.00th=[        0],  5.00th=[        0], 10.00th=[        0],
     | 20.00th=[        0], 30.00th=[        0], 40.00th=[        0],
     | 50.00th=[        0], 60.00th=[        0], 70.00th=[        0],
     | 80.00th=[        0], 90.00th=[        0], 95.00th=[        0],
     | 99.00th=[ 10027008], 99.50th=[ 10027008], 99.90th=[ 10027008],
     | 99.95th=[ 40108032], 99.99th=[170917888]
   bw (  KiB/s): min= 4288, max=54864, per=50.10%, avg=20618.21, stdev=7533.43, 
samples=199
   iops        : min= 1072, max=13716, avg=5154.55, stdev=1883.36, samples=199
  lat (nsec)   : 2=98.87%
  lat (msec)   : 20=1.05%, 50=0.02%, 100=0.02%, 250=0.02%, 500=0.01%
  cpu          : usr=1.25%, sys=3.89%, ctx=524669, majf=0, minf=19
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=524288,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=40.2MiB/s (42.1MB/s), 40.2MiB/s-40.2MiB/s (42.1MB/s-42.1MB/s), 
io=2048MiB (2147MB), run=50960-50960msec

Disk stats (read/write):
  vdb: ios=522156/0, merge=0/0, ticks=35980/0, in_queue=35910, util=70.41%
[root@localhost ~]# fio --clocksource=clock_gettime --name=randread 
--rw=randread --bs=4k --filename=/dev/vdb --size=1g --numjobs=4 --ioengine=sync 
--iodepth=1 --direct=1 --group_reporting 
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
...
fio-3.3
Starting 4 processes
Jobs: 1 (f=1): [r(1),_(3)][96.4%][r=42.2MiB/s,w=0KiB/s][r=10.8k,w=0 IOPS][eta 
00m:04s]          
randread: (groupid=0, jobs=4): err= 0: pid=324: Fri Oct 19 19:13:10 2018
   read: IOPS=9739, BW=38.0MiB/s (39.9MB/s)(4096MiB/107660msec)
    clat (nsec): min=0, max=720000k, avg=337715.15, stdev=7599605.74
     lat (nsec): min=0, max=720000k, avg=342531.20, stdev=7692655.76
    clat percentiles (nsec):
     |  1.00th=[        0],  5.00th=[        0], 10.00th=[        0],
     | 20.00th=[        0], 30.00th=[        0], 40.00th=[        0],
     | 50.00th=[        0], 60.00th=[        0], 70.00th=[        0],
     | 80.00th=[        0], 90.00th=[        0], 95.00th=[        0],
     | 99.00th=[ 10027008], 99.50th=[ 10027008], 99.90th=[ 10027008],
     | 99.95th=[189792256], 99.99th=[371195904]
   bw (  KiB/s): min=   24, max=36572, per=25.32%, avg=9865.00, stdev=6204.61, 
samples=844
   iops        : min=    6, max= 9143, avg=2466.25, stdev=1551.15, samples=844
  lat (nsec)   : 2=98.49%
  lat (msec)   : 20=1.43%, 50=0.01%, 100=0.01%, 250=0.04%, 500=0.03%
  lat (msec)   : 750=0.01%
  cpu          : usr=0.78%, sys=2.01%, ctx=1049502, majf=0, minf=28
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1048576,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=38.0MiB/s (39.9MB/s), 38.0MiB/s-38.0MiB/s (39.9MB/s-39.9MB/s), 
io=4096MiB (4295MB), run=107660-107660msec

Disk stats (read/write):
  vdb: ios=1048372/0, merge=0/0, ticks=75690/0, in_queue=75410, util=69.99%


 - randwrite

[root@localhost ~]# fio --clocksource=clock_gettime --name=randwrite 
--rw=randwrite --bs=4k --filename=/dev/vdb --size=1g --numjobs=1 
--ioengine=sync --iodepth=1 --direct=1 --group_reporting 
randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
fio-3.3
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=45.3MiB/s][r=0,w=11.6k IOPS][eta 
00m:00s]
randwrite: (groupid=0, jobs=1): err= 0: pid=332: Fri Oct 19 19:15:24 2018
  write: IOPS=8351, BW=32.6MiB/s (34.2MB/s)(1024MiB/31390msec)
    clat (nsec): min=0, max=10000k, avg=117263.79, stdev=1076518.26
     lat (nsec): min=0, max=10000k, avg=117492.68, stdev=1077555.87
    clat percentiles (nsec):
     |  1.00th=[       0],  5.00th=[       0], 10.00th=[       0],
     | 20.00th=[       0], 30.00th=[       0], 40.00th=[       0],
     | 50.00th=[       0], 60.00th=[       0], 70.00th=[       0],
     | 80.00th=[       0], 90.00th=[       0], 95.00th=[       0],
     | 99.00th=[10027008], 99.50th=[10027008], 99.90th=[10027008],
     | 99.95th=[10027008], 99.99th=[10027008]
   bw (  KiB/s): min=21184, max=57256, per=100.00%, avg=33411.10, 
stdev=9754.72, samples=62
   iops        : min= 5296, max=14314, avg=8352.77, stdev=2438.68, samples=62
  lat (nsec)   : 2=98.83%
  lat (msec)   : 20=1.17%
  cpu          : usr=2.73%, sys=6.29%, ctx=262343, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,262144,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=32.6MiB/s (34.2MB/s), 32.6MiB/s-32.6MiB/s (34.2MB/s-34.2MB/s), 
io=1024MiB (1074MB), run=31390-31390msec

Disk stats (read/write):
  vdb: ios=41/262139, merge=0/0, ticks=0/22850, in_queue=22810, util=72.44%
[root@localhost ~]# fio --clocksource=clock_gettime --name=randwrite 
--rw=randwrite --bs=4k --filename=/dev/vdb --size=1g --numjobs=2 
--ioengine=sync --iodepth=1 --direct=1 --group_reporting 
randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
...
fio-3.3
Starting 2 processes
Jobs: 1 (f=1): [_(1),w(1)][98.5%][r=0KiB/s,w=43.1MiB/s][r=0,w=11.0k IOPS][eta 
00m:01s]
randwrite: (groupid=0, jobs=2): err= 0: pid=338: Fri Oct 19 19:17:46 2018
  write: IOPS=8026, BW=31.4MiB/s (32.9MB/s)(2048MiB/65320msec)
    clat (nsec): min=0, max=550000k, avg=217380.52, stdev=4334253.25
     lat (nsec): min=0, max=550000k, avg=221443.18, stdev=4422072.98
    clat percentiles (nsec):
     |  1.00th=[        0],  5.00th=[        0], 10.00th=[        0],
     | 20.00th=[        0], 30.00th=[        0], 40.00th=[        0],
     | 50.00th=[        0], 60.00th=[        0], 70.00th=[        0],
     | 80.00th=[        0], 90.00th=[        0], 95.00th=[        0],
     | 99.00th=[ 10027008], 99.50th=[ 10027008], 99.90th=[ 10027008],
     | 99.95th=[ 40108032], 99.99th=[250609664]
   bw (  KiB/s): min=  904, max=55968, per=50.25%, avg=16133.32, stdev=7388.58, 
samples=257
   iops        : min=  226, max=13992, avg=4033.33, stdev=1847.14, samples=257
  lat (nsec)   : 2=98.57%
  lat (msec)   : 20=1.35%, 50=0.04%, 100=0.02%, 250=0.02%, 500=0.01%
  lat (msec)   : 750=0.01%
  cpu          : usr=1.46%, sys=2.80%, ctx=524717, majf=0, minf=19
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,524288,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=31.4MiB/s (32.9MB/s), 31.4MiB/s-31.4MiB/s (32.9MB/s-32.9MB/s), 
io=2048MiB (2147MB), run=65320-65320msec

Disk stats (read/write):
  vdb: ios=156/521506, merge=0/0, ticks=0/48460, in_queue=48360, util=74.13%
[root@localhost ~]# fio --clocksource=clock_gettime --name=randwrite 
--rw=randwrite --bs=4k --filename=/dev/vdb --size=1g --numjobs=4 
--ioengine=sync --iodepth=1 --direct=1 --group_reporting 
randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=sync, iodepth=1
...
fio-3.3
Starting 4 processes
Jobs: 1 (f=1): [_(2),w(1),_(1)][98.7%][r=0KiB/s,w=34.7MiB/s][r=0,w=8871 
IOPS][eta 00m:02s]
randwrite: (groupid=0, jobs=4): err= 0: pid=346: Fri Oct 19 19:20:28 2018
  write: IOPS=7088, BW=27.7MiB/s (29.0MB/s)(4096MiB/147920msec)
    clat (nsec): min=0, max=980000k, avg=467662.81, stdev=10642801.11
     lat (nsec): min=0, max=980000k, avg=479707.72, stdev=10920279.83
    clat percentiles (nsec):
     |  1.00th=[        0],  5.00th=[        0], 10.00th=[        0],
     | 20.00th=[        0], 30.00th=[        0], 40.00th=[        0],
     | 50.00th=[        0], 60.00th=[        0], 70.00th=[        0],
     | 80.00th=[        0], 90.00th=[        0], 95.00th=[        0],
     | 99.00th=[ 10027008], 99.50th=[ 10027008], 99.90th=[ 20054016],
     | 99.95th=[229638144], 99.99th=[557842432]
   bw (  KiB/s): min=   16, max=45592, per=26.28%, avg=7450.62, stdev=5314.15, 
samples=1118
   iops        : min=    4, max=11398, avg=1862.65, stdev=1328.54, samples=1118
  lat (nsec)   : 2=97.88%
  lat (msec)   : 20=1.99%, 50=0.04%, 100=0.01%, 250=0.03%, 500=0.03%
  lat (msec)   : 750=0.01%, 1000=0.01%
  cpu          : usr=0.63%, sys=1.31%, ctx=1049437, majf=0, minf=37
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1048576,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=27.7MiB/s (29.0MB/s), 27.7MiB/s-27.7MiB/s (29.0MB/s-29.0MB/s), 
io=4096MiB (4295MB), run=147920-147920msec

Disk stats (read/write):
  vdb: ios=595/1043793, merge=0/0, ticks=50/116060, in_queue=111220, util=75.35%

Reply via email to