simplebench: compare write request performance

Vladimir Sementsov-Ogievskiy Mon, 13 Jul 2020 05:34:09 -0700

12.07.2020 20:49, Andrey Shinkevich wrote:

The script 'bench_write_req.py' allows comparing performances of write
request for two qemu-img binary files.
An example with (qemu-img binary 1) and without (qemu-img binary 2) the
applied patch "qcow2: skip writing zero buffers to empty COW areas"
(git commit ID: c8bb23cbdbe32f5) has the following results:


SSD:
-----------------  -------------------  -------------------
                   <qemu-img binary 1>  <qemu-img binary 2>
<simple case>      0.34 +- 0.01         10.57 +- 0.96
<general case>     0.33 +- 0.01         9.15 +- 0.85
<cluster middle>   0.33 +- 0.00         8.72 +- 0.05
<cluster overlap>  7.43 +- 1.19         14.35 +- 1.00
-----------------  -------------------  -------------------
HDD:
-----------------  -------------------  -------------------
                   <qemu-img binary 1>  <qemu-img binary 2>
<simple case>      32.61 +- 1.17        55.11 +- 1.15
<general case>     54.28 +- 8.82        60.11 +- 2.76
<cluster middle>   57.93 +- 0.47        58.53 +- 0.51
<cluster overlap>  11.47 +- 0.94        17.29 +- 4.40
-----------------  -------------------  -------------------

Suggested-by: Denis V. Lunev <d...@openvz.org>
Suggested-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkev...@virtuozzo.com>


Andrey wants to drop patches 02 and 03 in v5, so this patch is a candidate for
v5. My notes are below.

---
  scripts/simplebench/bench_write_req.py | 173 +++++++++++++++++++++++++++++++++
  1 file changed, 173 insertions(+)
  create mode 100755 scripts/simplebench/bench_write_req.py

diff --git a/scripts/simplebench/bench_write_req.py b/scripts/simplebench/bench_write_req.py
new file mode 100755
index 0000000..a285ef1
--- /dev/null
+++ b/scripts/simplebench/bench_write_req.py
@@ -0,0 +1,173 @@
+#!/usr/bin/env python3
+#
+# Test to compare performance of write requests for two qemu-img binary files.


Let's note here that the script is intended to measure the benefit of commit
c8bb23cbdbe "qcow2: skip writing zero buffers to empty COW areas".

+#
+# Copyright (c) 2020 Virtuozzo International GmbH.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+
+import sys
+import os
+import subprocess
+import simplebench
+
+
+def bench_func(env, case):
+    """ Handle one "cell" of benchmarking table. """
+    return bench_write_req(env['qemu_img'], env['image_name'],
+                           case['block_size'], case['block_offset'],
+                           case['requests'])
+
+
+def qemu_img_pipe(*args):
+    '''Run qemu-img and return its output'''
+    subp = subprocess.Popen(list(args),
+                            stdout=subprocess.PIPE,
+                            stderr=subprocess.STDOUT,
+                            universal_newlines=True)
+    exitcode = subp.wait()
+    if exitcode < 0:
+        sys.stderr.write('qemu-img received signal %i: %s\n'
+                         % (-exitcode, ' '.join(list(args))))
+    return subp.communicate()[0]
+
+
+def bench_write_req(qemu_img, image_name, block_size, block_offset, requests):
+    """Benchmark write requests
+
+    The function creates a QCOW2 image with the given path/name and fills it
+    with random data optionally.

No, it doesn't fill the image with anything.

+    Then it runs the 'qemu-img bench' command and
+    makes series of write requests on the image clusters. Finally, it returns
+    the total time of the write operations on the disk.
+
+    qemu_img     -- path to qemu_img executable file
+    image_name   -- QCOW2 image name to create
+    block_size   -- size of a block to write to clusters
+    block_offset -- offset of the block in clusters
+    requests     -- number of write requests per cluster
+
+    Returns {'seconds': int} on success and {'error': str} on failure.
+    Return value is compatible with simplebench lib.
+    """
+
+    if not os.path.isfile(qemu_img):
+        print(f'File not found: {qemu_img}')
+        sys.exit(1)
+
+    image_dir = os.path.dirname(os.path.abspath(image_name))
+    if not os.path.isdir(image_dir):
+        print(f'Path not found: {image_name}')
+        sys.exit(1)
+
+    cluster_size = 1024 * 1024
+    image_size = 1024 * cluster_size
+    seek = 4
+    dd_count = int(image_size / cluster_size) - seek


seek and dd_count are unused

+
+    args_create = [qemu_img, 'create', '-f', 'qcow2', '-o',
+                   f'cluster_size={cluster_size}',
+                   image_name, str(image_size)]
+
+    count = requests * int(image_size / cluster_size)


requests is the number of requests per cluster...

+    step = str(cluster_size)


but step is one cluster. So we issue several requests per cluster, and yet the
step is still a whole cluster?

That looks strange to me. Assume requests = 5 and an image of 5 clusters:
count would be 5 * 5 = 25. Making 25 iterations with step = cluster_size goes
far beyond the image size.
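
To make the arithmetic concrete, here is a minimal sketch. It assumes a plain
linear walk (request i starts at block_offset + i * step) and deliberately
ignores whatever qemu-img bench itself does once an offset passes the end of
the image. The helper request_offsets() and the 5-cluster image are
hypothetical, chosen only to match the numbers above.

```python
def request_offsets(block_offset, step, count):
    """Start offset of each request, assuming a plain linear walk."""
    return [block_offset + i * step for i in range(count)]


cluster_size = 1024 * 1024
clusters = 5                       # hypothetical small image
image_size = clusters * cluster_size
requests = 5                       # "requests per cluster" in the patch

count = requests * clusters        # same formula as in the patch
offsets = request_offsets(0, cluster_size, count)

# Requests whose start offset lies at or beyond the end of the image.
past_end = [off for off in offsets if off >= image_size]
print(f'{len(past_end)} of {count} requests start past the image end')
print(f'last request starts at cluster {offsets[-1] // cluster_size}')
```

Under these assumptions only the first 5 of the 25 requests start inside the
image, which is the mismatch between "requests per cluster" and a
cluster-sized step.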

+    offset = str(block_offset)
+    cnt = str(count)


These are extra variables. Let's just use str() in the args list below.

+    size = []
+    if block_size:
+        size = ['-s', f'{block_size}']
+
+    args_bench = [qemu_img, 'bench', '-w', '-n', '-t', 'none', '-c', cnt,
+                  '-S', step, '-o', offset, '-f', 'qcow2', image_name]
+    if block_size:
+        args_bench.extend(size)


1. You may just write here

if block_size:
    args_bench += ['-s', f'{block_size}']

There is no reason to create the extra "size" variable to be used only here.

2. Why do you need this logic? If the user passes block_size = 0, we rely on
the default bufsize of img_bench(), which is 4096. In two test cases you use
an explicit 4096 constant, and in one you use 0, which implicitly becomes the
same 4096. Let's specify block_size explicitly instead.
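
Something along these lines is what I have in mind; treat it as a sketch, not
as the final code. build_bench_args() is a hypothetical helper, the paths in
the usage lines are placeholders, and placing -s before the image name is my
choice. Only flags already used in the patch appear here.

```python
def build_bench_args(qemu_img, image_name, block_size, block_offset,
                     count, step):
    """Build the 'qemu-img bench' command line with an explicit -s value."""
    if block_size <= 0:
        # No silent fallback to the tool's default buffer size.
        raise ValueError('block_size must be given explicitly')
    return [qemu_img, 'bench', '-w', '-n', '-t', 'none',
            '-c', str(count), '-S', str(step), '-o', str(block_offset),
            '-s', str(block_size), '-f', 'qcow2', image_name]


# Placeholder paths, for illustration only.
args = build_bench_args('./qemu-img', '/tmp/test.qcow2', block_size=4096,
                        block_offset=0, count=10, step=1024 * 1024)
print(' '.join(args))
```

With this shape the intermediate "size", "step", "offset" and "cnt" variables
all go away, and every test case has to state its block size.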

+
+    try:
+        qemu_img_pipe(*args_create)
+    except OSError as e:
+        os.remove(image_name)
+        return {'error': 'qemu_img create failed: ' + str(e)}
+
+    try:
+        ret = qemu_img_pipe(*args_bench)
+    finally:
+        os.remove(image_name)
+        if not ret:


ret may be undefined if an exception is raised before it is assigned, may it
not?

I suggest not bothering with "finally": just add an "except" like in the
previous case, and then parse ret at the end of the function.
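
A rough sketch of that structure follows. The names run_bench(),
parse_bench_output() and remove_if_exists() are hypothetical, "run" stands in
for qemu_img_pipe so the sketch can be exercised without a qemu-img binary,
and the sample output string is only an assumed illustration of output that
contains the 'seconds.' token the patch looks for.

```python
import os
import tempfile


def remove_if_exists(path):
    """Remove the image file if it was actually created."""
    if os.path.exists(path):
        os.remove(path)


def parse_bench_output(ret):
    """Turn bench output into a simplebench-style result dict."""
    if not ret:
        return {'error': 'qemu_img bench failed'}
    words = ret.split()
    if 'seconds.' not in words:
        return {'error': 'qemu_img bench failed: ' + ret}
    return {'seconds': float(words[words.index('seconds.') - 1])}


def run_bench(run, args_bench, image_name):
    """Run the benchmark, clean up the image, then parse the output."""
    try:
        ret = run(*args_bench)
    except OSError as e:
        remove_if_exists(image_name)
        return {'error': 'qemu_img bench failed: ' + str(e)}
    remove_if_exists(image_name)
    # ret is always bound here, because the except branch returned.
    return parse_bench_output(ret)


def fake_ok(*args):
    return 'Run completed in 0.340 seconds.\n'   # assumed sample output


def fake_fail(*args):
    raise OSError('cannot execute qemu-img')


def make_image():
    """Create an empty stand-in file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        return f.name


ok_image = make_image()
ok_result = run_bench(fake_ok, [], ok_image)
fail_image = make_image()
fail_result = run_bench(fake_fail, [], fail_image)
print(ok_result)
print(fail_result)
```

The point is only that ret is never read on a path where it was not assigned,
and that the image is removed on both the success and the failure path.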

+            return {'error': 'qemu_img bench failed'}
+        if 'seconds' in ret:
+            ret_list = ret.split()
+            index = ret_list.index('seconds.')
+            return {'seconds': float(ret_list[index-1])}
+        else:
+            return {'error': 'qemu_img bench failed: ' + ret}
+
+
+if __name__ == '__main__':
+
+    if len(sys.argv) < 4:
+        program = os.path.basename(sys.argv[0])
+        print(f'USAGE: {program} <path to qemu-img binary file> '
+              '<path to another qemu-img to compare performance with> '
+              '<full or relative name for QCOW2 image to create>')
+        exit(1)
+
+    # Test-cases are "rows" in benchmark resulting table, 'id' is a caption
+    # for the row, other fields are handled by bench_func.
+    test_cases = [
+        {
+            'id': '<simple case>',
+            'block_size': 0,
+            'block_offset': 0,
+            'requests': 10
+        },
+        {
+            'id': '<general case>',
+            'block_size': 4096,
+            'block_offset': 0,
+            'requests': 10
+        },


Hmm, I don't understand why the simple case and the general case differ. As I
already noted, if you don't specify -s for bench, which is what happens when
block_size is 0, the default value in img_bench() is 4096 anyway. So how can
the results in the commit message be so different? Or what am I missing?

+        {
+            'id': '<cluster middle>',
+            'block_size': 4096,
+            'block_offset': 524288,
+            'requests': 10
+        },
+        {
+            'id': '<cluster overlap>',


What is overlapping here? You just write half a cluster at a small offset from
the start of the cluster. I'm OK with the case itself, I just don't understand
the naming.
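
As a quick check of that claim, here is a small sketch that computes which
clusters a single write touches. clusters_touched() is a hypothetical helper;
the first call uses the values from this test case and the 1 MiB cluster size
from the patch, while the second offset is made up to show what a write that
really crosses a cluster boundary would look like.

```python
def clusters_touched(offset, size, cluster_size):
    """First and last cluster index covered by [offset, offset + size)."""
    first = offset // cluster_size
    last = (offset + size - 1) // cluster_size
    return first, last


cluster_size = 1024 * 1024

# Values from the '<cluster overlap>' test case.
first, last = clusters_touched(4096, 524288, cluster_size)
print(first, last)

# Hypothetical offset close to the end of a cluster, for comparison.
crossing = clusters_touched(cluster_size - 4096, 524288, cluster_size)
print(crossing)
```

With the values from the test case both indices are 0, so the write stays
inside one cluster; only the made-up second offset spans clusters 0 and 1.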

+            'block_size': 524288,
+            'block_offset': 4096,
+            'requests': 2
+        },
+    ]
+
+    # Test-envs are "columns" in benchmark resulting table, 'id is a caption
+    # for the column, other fields are handled by bench_func.
+    # Set the paths below to desired values
+    test_envs = [
+        {
+            'id': '<qemu-img binary 1>',
+            'qemu_img': f'{sys.argv[1]}',
+            'image_name': f'{sys.argv[3]}'
+        },
+        {
+            'id': '<qemu-img binary 2>',
+            'qemu_img': f'{sys.argv[2]}',
+            'image_name': f'{sys.argv[3]}'
+        },
+    ]
+
+    result = simplebench.bench(bench_func, test_envs, test_cases, count=3,
+                               initial_run=False)
+    print(simplebench.ascii(result))



--
Best regards,
Vladimir

Re: [PATCH v4 1/3] scripts/simplebench: compare write request performance
