On Fri, Sep 09, 2011 at 04:04:07PM +0200, Kevin Wolf wrote:
> On 09.09.2011 15:54, Stefan Hajnoczi wrote:
> > On Fri, Sep 9, 2011 at 2:48 PM, Zhi Yong Wu <zwu.ker...@gmail.com> wrote:
> >> On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
> >> <stefa...@linux.vnet.ibm.com> wrote:
> >>> On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
> >>>> Today, I did some basic I/O testing, and suddenly found that qemu
> >>>> write and rw speed is so low now. My qemu binary is built on commit
> >>>> 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
> >>>>
> >>>> Does qemu have a regression?
> >>>>
> >>>> The testing data is shown below:
> >>>>
> >>>> 1.) write
> >>>>
> >>>> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
> >>>
> >>> Please post your QEMU command-line.  If your -drive is using
> >>> cache=writethrough then small writes are slow because they require the
> >>> physical disk to write and then synchronize its write cache.  Typically
> >>> cache=none is a good setting to use for local disks.
> >> Now I cannot access my workstation in the office.
> >> -drive if=virtio,cache=none,file=xxxx
> >>
> >>>
> >>> The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
> >>> so I think a 512 byte write from the guest could cause a 4 KB
> >>> read-modify-write operation on the host filesystem.
> >> You mean RCU? What is its work procedure? Can you explain in more
> >> detail if you are available?
> >
> > If the host file system manages space in 4 KB blocks, then a 512 byte
> > write to an unallocated part of the file causes the file system to find
> > 4 KB of free space for this data.  Since the write is only 512 bytes and
> > does not cover the entire 4 KB region, the file system initializes the
> > remaining 3.5 KB with zeros and writes out the full 4 KB block.
> >
> > Now if a 512 byte write comes in for an allocated 4 KB block, then we
> > need to read in the existing 4 KB, modify the 512 bytes in place, and
> > write out the 4 KB block again.  This is read-modify-write.  In this
> > worst-case scenario a 512 byte write turns into a 4 KB read followed
> > by a 4 KB write.
>
> But that should only happen with a 4k sector size, otherwise there's no
> reason for RMW.

You're right.  For cache=none (O_DIRECT), the host file system should not
need to do read-modify-write because it can write the single sector
without caring what is in the surrounding 3.5 KB.

Stefan
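
To make the arithmetic in this thread concrete, here is a minimal sketch
of the worst-case cost of a write when the storage layer can only write
whole units.  The function name rmw_cost and the "unit" parameter are
made up for illustration; this is a simplified model, not how QEMU or
any particular file system actually accounts for I/O.

```python
def rmw_cost(offset, length, unit):
    """Return (bytes_read, bytes_written) for a write of `length` bytes at
    `offset`, assuming the storage layer can only write whole `unit`-sized
    chunks (hypothetical worst-case model, no caching)."""
    if length <= 0:
        return (0, 0)
    start = (offset // unit) * unit                # round start down to a unit
    end = -(-(offset + length) // unit) * unit     # round end up to a unit
    written = end - start
    # A unit must be read first only if the write covers it partially.
    first_partial = offset != start
    last_partial = (offset + length) != end
    if written == unit:
        partial_units = 1 if (first_partial or last_partial) else 0
    else:
        partial_units = int(first_partial) + int(last_partial)
    return (partial_units * unit, written)


# 512 byte write into an allocated 4 KB block: 4 KB read + 4 KB write.
print(rmw_cost(0, 512, 4096))
# Same write when the unit is a 512 byte sector: no read needed.
print(rmw_cost(0, 512, 512))
```

With unit=4096 the 512 byte write costs a 4 KB read and a 4 KB write,
which is the worst case described above.  With unit=512, which is the
assumption for cache=none on a disk with 512 byte sectors, the write
goes out as a single sector with no read.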