On Tue, May 12, 2015 at 05:30:54PM +0300, Denis V. Lunev wrote:
> I have used the following program to test
> #define _GNU_SOURCE
>
> #include <stdio.h>
> #include <unistd.h>
> #include <fcntl.h>
> #include <sys/types.h>
> #include <malloc.h>
> #include <string.h>
>
> int main(int argc, char *argv[])
> {
>     int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
>     void *buf;
>     int i = 0, align = atoi(argv[2]);
>
>     do {
>         buf = memalign(align, 4096);
>         if (align >= 4096)
>             break;
>         if ((unsigned long)buf & 4095)
>             break;
>         i++;
>     } while (1);
>     printf("%d %p\n", i, buf);
>
>     memset(buf, 0x11, 4096);
>
>     for (i = 0; i < 100000; i++) {
>         lseek(fd, SEEK_CUR, 4096);
>         write(fd, buf, 4096);
>     }
>
>     close(fd);
>     return 0;
> }
> for i in `seq 1 30` ; do a.out aa ; done
>
> The file was placed into 8 GB partition on HDD below to avoid speed
> change due to different offset on disk. Results are reliable:
> - 189 vs 180 seconds on Linux 3.16
>
> The following setups have been tested:
> 1) ext4 with block size equal to 1024 over 512/512 physical/logical
>    sector size SSD disk
> 2) ext4 with block size equal to 4096 over 512/512 physical/logical
>    sector size SSD disk
> 3) ext4 with block size equal to 4096 over 512/4096 physical/logical
>    sector size rotational disk (WDC WD20EZRX)
> 4) xfs with block size equal to 4096 over 512/512 physical/logical
>    sector size SSD disk
>
> The difference is quite reliable and the same 5%.
>   qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
> for image in qcow2 format is 1% faster.
>
> qemu-img is also affected. The difference between
>   qemu-img create -f qcow2 1.img 64G
>   qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
>   time for i in `seq 1 30` ; do qemu-img convert 1.img -t none -O raw 2.img ;
>       rm -rf 2.img ; done
> is around 126 vs 119 seconds.
>
> The justification of the performance improvement is quite interesting.
> From the kernel point of view each request to the disk was split
> in two. This could be seen by blktrace like this:
>   9,0 11 1 0.000000000 11151 Q WS 312737792 + 1023 [qemu-img]
>   9,0 11 2 0.000007938 11151 Q WS 312738815 + 8 [qemu-img]
>   9,0 11 3 0.000030735 11151 Q WS 312738823 + 1016 [qemu-img]
>   9,0 11 4 0.000032482 11151 Q WS 312739839 + 8 [qemu-img]
>   9,0 11 5 0.000041379 11151 Q WS 312739847 + 1016 [qemu-img]
>   9,0 11 6 0.000042818 11151 Q WS 312740863 + 8 [qemu-img]
>   9,0 11 7 0.000051236 11151 Q WS 312740871 + 1017 [qemu-img]
>   9,0 5 1 0.169071519 11151 Q WS 312741888 + 1023 [qemu-img]
> After the patch the pattern becomes normal:
>   9,0 6 1 0.000000000 12422 Q WS 314834944 + 1024 [qemu-img]
>   9,0 6 2 0.000038527 12422 Q WS 314835968 + 1024 [qemu-img]
>   9,0 6 3 0.000072849 12422 Q WS 314836992 + 1024 [qemu-img]
>   9,0 6 4 0.000106276 12422 Q WS 314838016 + 1024 [qemu-img]
> and the number of requests sent to disk (could be calculated by counting
> the number of lines in the output of blktrace) is reduced about 2 times.
>
> Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
> does its job well and real requests come properly aligned (to page).
>
> Changes from v7:
> - make assignment from v6 unconditional (Kevin)
>
> Changes from v6:
> - explicitly assign opt_mem_alignment in raw-posix.c with
>   MAX(s->buf_align, getpagesize()) (Kevin)
>
> Changes from v5:
> - found justification from kernel point of view
> - fixed checkpatch warnings in the patch 2
>
> Changes from v4:
> - patches reordered
> - dropped conversion from 512 to BDRV_SECTOR_SIZE
> - getpagesize() is replaced with MAX(4096, getpagesize()) as suggested by
>   Kevin
>
> Changes from v3:
> - portable way to calculate system page size used
> - 512/4096 values are replaced with proper macros/values
>
> Changes from v2:
> - opt_mem_alignment is split to opt_mem_alignment for bounce buffering
>   and min_mem_alignment to check buffers coming from guest.
>
> Changes from v1:
> - enforces 4096 alignment in qemu_(try_)blockalign, avoid touching of
>   bdrv_qiov_is_aligned path not to enforce additional bounce buffering
>   as suggested by Paolo
> - reduces 10% to 5% in patch description to better fit 180 vs 189
>   difference
>
> Signed-off-by: Denis V. Lunev <d...@openvz.org>
> CC: Paolo Bonzini <pbonz...@redhat.com>
> CC: Kevin Wolf <kw...@redhat.com>
> CC: Stefan Hajnoczi <stefa...@redhat.com>
>
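For readers following the changelog, the alignment rule it mentions
(MAX(s->buf_align, getpagesize()) for bounce buffers, with a smaller
minimum alignment only used to check incoming buffers) can be sketched
roughly as below. This is a minimal illustration under assumptions, not
the code from the series: opt_mem_alignment(), is_aligned() and
alloc_io_buffer() are hypothetical names, posix_memalign() stands in
for whatever allocator the block layer really uses, and buf_align is
assumed to be a power of two.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: alignment to use when allocating bounce buffers,
 * i.e. the larger of the device's required alignment and the page size. */
static size_t opt_mem_alignment(size_t buf_align)
{
    long page = sysconf(_SC_PAGESIZE);
    /* Assumption: fall back to 4096 if the page size cannot be queried. */
    size_t page_size = page > 0 ? (size_t)page : 4096;

    return buf_align > page_size ? buf_align : page_size;
}

/* Returns non-zero if ptr is a multiple of align (align: power of two). */
static int is_aligned(const void *ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)ptr & (align - 1)) == 0;
}

/* Hypothetical allocator: returns a buffer aligned to opt_mem_alignment(),
 * or NULL on failure. The caller releases it with free(). */
static void *alloc_io_buffer(size_t buf_align, size_t size)
{
    void *buf = NULL;

    if (posix_memalign(&buf, opt_mem_alignment(buf_align), size) != 0) {
        return NULL;
    }
    return buf;
}
```

With buf_align = 512, alloc_io_buffer(512, 4096) returns a page-aligned
buffer, whereas a buffer that is only 512-byte aligned would still pass
is_aligned(buf, 512) but could fail the page-size check.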
Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan