Re: [Qemu-devel] [PATCH 0/7] qcow2: async handling of fragmented io

Vladimir Sementsov-Ogievskiy Mon, 17 Sep 2018 08:35:56 -0700

ping. Finally, what about this?

07.08.2018 20:43, Vladimir Sementsov-Ogievskiy wrote:

Hi all!


Here is an asynchronous scheme for handling fragmented qcow2
reads and writes. Both the qcow2 read and write functions loop through
sequential portions of data. The series aims to parallelize the
iterations of these loops.
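
To illustrate the idea only (this is not the qcow2 code, which uses
QEMU coroutines in C), here is a minimal Python asyncio sketch. The
name read_chunk and the fixed delay are hypothetical stand-ins for one
request to a contiguous region of the image file:

```python
import asyncio

# Hypothetical stand-in for one I/O request to a contiguous host region.
async def read_chunk(offset, length, delay=0.01):
    await asyncio.sleep(delay)  # simulate device latency
    return (offset, length)

async def read_sequential(chunks):
    # One request at a time: total latency is the sum over all chunks.
    return [await read_chunk(off, n) for off, n in chunks]

async def read_parallel(chunks):
    # All requests in flight at once; gather() keeps the input order.
    return await asyncio.gather(*(read_chunk(off, n) for off, n in chunks))

# 16 pieces of 64k each, i.e. one fully fragmented 1m request.
chunks = [(i * 65536, 65536) for i in range(16)]
seq_result = asyncio.run(read_sequential(chunks))
par_result = asyncio.run(read_parallel(chunks))
```

Both variants return the same data in the same order; only the number
of requests in flight at a time differs.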

It improves performance for fragmented qcow2 images. I've tested it
as follows:

I have four 4G qcow2 images (with the default 64k cluster size) on my SSD:
t-seq.qcow2 - sequentially written qcow2 image
t-reverse.qcow2 - filled by writing 64k portions from the end to the start
t-rand.qcow2 - filled by writing 64k portions (aligned) in random order
t-part-rand.qcow2 - filled by shuffling the order of 64k writes within each 1m chunk
(see the source code for image generation at the end for details)

and the test (sequential I/O in 1m chunks):

test write:
     for t in /ssd/t-*; \
         do sync; echo 1 > /proc/sys/vm/drop_caches; echo ===  $t  ===; \
         ./qemu-img bench -c 4096 -d 1 -f qcow2 -n -s 1m -t none -w $t; \
     done

test read (same, just drop -w parameter):
     for t in /ssd/t-*; \
         do sync; echo 1 > /proc/sys/vm/drop_caches; echo ===  $t  ===; \
         ./qemu-img bench -c 4096 -d 1 -f qcow2 -n -s 1m -t none $t; \
     done

short info about parameters:
   -w - do writes (otherwise do reads)
   -c - number of requests
   -s - size of each request
   -t none - cache mode "none" (bypass the host page cache)
   -n - native aio
   -d 1 - queue depth 1, i.e. don't use parallel requests provided by
          qemu-img bench itself
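
As a quick sanity check of these parameters, the arithmetic below shows
that 4096 requests of 1m add up to the full 4G image, and that each
request spans 16 units of 64k. It assumes the requests are laid out back
to back from offset 0, which the command lines above do not state
explicitly; the variable names are illustrative only:

```python
KiB = 1024
MiB = 1024 * KiB
GiB = 1024 * MiB

image_size = 4 * GiB      # the 4G test images
cluster_size = 64 * KiB   # the 64k size mentioned above
request_size = 1 * MiB    # -s 1m
request_count = 4096      # -c 4096

# Assumption: requests are back to back, so the total is count * size.
total_bytes = request_count * request_size
clusters_per_request = request_size // cluster_size

print(total_bytes == image_size)  # True
print(clusters_per_request)       # 16
```

So with -d 1, every single request can touch up to 16 separately
placed 64k pieces on a fragmented image.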

results:
     +-----------+-----------+----------+-----------+----------+
     |   file    | wr before | wr after | rd before | rd after |
     +-----------+-----------+----------+-----------+----------+
     | seq       |     8.605 |    8.636 |     9.043 |    9.010 |
     | reverse   |     9.934 |    8.654 |    17.162 |    8.662 |
     | rand      |     9.983 |    8.687 |    19.775 |    9.010 |
     | part-rand |     9.871 |    8.650 |    14.241 |    8.669 |
     +-----------+-----------+----------+-----------+----------+

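The performance gain is clear, especially for reads: on the rand image
the read result drops from 19.775 to 9.010, while the seq image is
essentially unchanged.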
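
For convenience, the before/after ratios can be derived from the table
with a few lines of Python. The numbers are copied from the table
above; the dictionary layout and names are only for illustration:

```python
# (wr before, wr after, rd before, rd after), copied from the results table.
results = {
    "seq":       (8.605, 8.636, 9.043, 9.010),
    "reverse":   (9.934, 8.654, 17.162, 8.662),
    "rand":      (9.983, 8.687, 19.775, 9.010),
    "part-rand": (9.871, 8.650, 14.241, 8.669),
}

# Ratio > 1 means "after" is better than "before".
speedup = {
    name: (wr_b / wr_a, rd_b / rd_a)
    for name, (wr_b, wr_a, rd_b, rd_a) in results.items()
}

for name, (wr, rd) in speedup.items():
    print("{:10s} write x{:.2f}  read x{:.2f}".format(name, wr, rd))
```

The read ratio for the fragmented images is roughly between 1.6 and
2.2, and the write ratio is about 1.15.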

how images are generated:

  === gen-writes file ===
     #!/usr/bin/env python
     import random
     import sys

     size = 4 * 1024 * 1024 * 1024
     block = 64 * 1024
     block2 = 1024 * 1024

     arg = sys.argv[1]

     if arg in ('rand', 'reverse', 'seq'):
         writes = list(range(0, size, block))

     if arg == 'rand':
         random.shuffle(writes)
     elif arg == 'reverse':
         writes.reverse()
     elif arg == 'part-rand':
         writes = []
         for off in range(0, size, block2):
             wr = list(range(off, off + block2, block))
             random.shuffle(wr)
             writes.extend(wr)
     elif arg != 'seq':
         sys.exit(1)

     for w in writes:
         # print() with a single argument works on both Python 2 and 3
         print('write -P 0xff {} {}'.format(w, block))

     print('q')


  === gen-test-images.sh file ===
     #!/bin/bash

     IMG_PATH=/ssd

     for name in seq reverse rand part-rand; do
         IMG=$IMG_PATH/t-$name.qcow2
         echo creating $IMG ...
         rm -f $IMG
         qemu-img create -f qcow2 $IMG 4G
         gen-writes $name | qemu-io $IMG
     done

Denis V. Lunev (1):
   qcow2: move qemu_co_mutex_lock below decryption procedure

Vladimir Sementsov-Ogievskiy (6):
   qcow2: bdrv_co_pwritev: move encryption code out of lock
   qcow2: split out reading normal clusters from qcow2_co_preadv
   qcow2: async scheme for qcow2_co_preadv
   qcow2: refactor qcow2_co_pwritev: split out qcow2_co_do_pwritev
   qcow2: refactor qcow2_co_pwritev locals scope
   qcow2: async scheme for qcow2_co_pwritev

  block/qcow2.c                      | 506 +++++++++++++++++++++++++++++--------
  tests/qemu-iotests/026.out         |  18 +-
  tests/qemu-iotests/026.out.nocache |  20 +-
  3 files changed, 415 insertions(+), 129 deletions(-)



--
Best regards,
Vladimir
