On 01/11/2013 03:52 PM, MORITA Kazutaka wrote:
> At Thu, 10 Jan 2013 13:38:16 +0800,
> Liu Yuan wrote:
>>
>> On 01/09/2013 11:10 PM, Paolo Bonzini wrote:
>>> Il 09/01/2013 14:04, Liu Yuan ha scritto:
>>>>>> 2 The upper layer software which relies on the 'cache=xxx' to choose
>>>>>> cache mode will fail its assumption against new QEMU.
>>>>>
>>>>> Which assumptions do you mean? As far as I can say the behaviour hasn't
>>>>> changed, except possibly for the performance.
>>>>
>>>> When users set 'cache=writethrough' to export only a writethrough cache
>>>> to Guest, but with new QEMU, it will actually get a writeback cache as
>>>> default.
>>>
>>> They get a writeback cache implementation-wise, but they get a
>>> writethrough cache safety-wise. How the cache is implemented doesn't
>>> matter, as long as it "looks like" a writethrough cache.
>>>
>>
>>> In fact, consider a local disk that doesn't support FUA. In old QEMU,
>>> images used to be opened with O_DSYNC and that splits each write into
>>> WRITE+FLUSH, just like new QEMU. All that changes is _where_ the
>>> flushes are created. Old QEMU changes it in the kernel, new QEMU
>>> changes it in userspace.
>>>
>>>> We don't need to communicate to the guest. I think 'cache=xxx' means
>>>> what kind of cache the users *expect* to export to Guest OS. So if
>>>> cache=writethrough set, Guest OS couldn't turn it to writeback cache
>>>> magically. This is like I bought a disk with 'writethrough' cache
>>>> built-in, I didn't expect that it turned to be a disk with writeback
>>>> cache under the hood which could possible lose data when power outage
>>>> happened.
>>>
>>> It's not by magic. It's by explicitly requesting the disk to do this.
>>>
>>> Perhaps it's a bug that the cache mode is not reset when the machine is
>>> reset. I haven't checked that, but it would be a valid complaint.
>>>
>>
>> Ah I didn't get the current implementation right. I tried the 3.7 kernel
>> and it works as expected (cache=writethrough result in a 'writethrough'
>> cache in the guest).
>>
>> It looks fine to me to emulate writethrough as writeback + flush, since
>> the profermance drop isn't big, though sheepdog itself support true
>> writethrough cache (no flush).
>
> Can we drop the SD_FLAG_CMD_CACHE flag from sheepdog write requests
> when bdrv_enable_write_cache() is false? Then the requests behave
> like FUA writes and we can safely omit succeeding flush requests.
>
Let's first implement an emulated writethrough cache and then look at
whether we can play tricks to implement a true writethrough cache. That
would bring complexity: for example, before we drop SD_FLAG_CMD_CACHE we
need to flush first. What's more, bdrv_enable_write_cache() is always
true for now, so I guess this needs a generic change in the block layer
to pass down the writethrough/writeback hints.

Thanks,
Yuan
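
For readers following the thread, here is a minimal sketch of the two ideas discussed above: emulating writethrough as "write + flush", and flushing before switching a device from writeback to writethrough. It is not QEMU or sheepdog code. The kernel page cache stands in for the sheepdog cache, fsync() stands in for the flush request, and `struct disk`, `disk_pwrite()` and `disk_set_writethrough()` are hypothetical names made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical model of a disk backed by a plain file descriptor. */
struct disk {
    int fd;
    bool writeback; /* true: written data may stay cached until a flush */
};

/*
 * Write at the given offset. In writethrough mode every write is followed
 * by an explicit flush, i.e. writethrough is emulated as WRITE+FLUSH in
 * userspace. Returns the byte count reported by pwrite() (which may be a
 * short write), or -1 on error.
 */
static ssize_t disk_pwrite(struct disk *d, const void *buf, size_t len,
                           off_t off)
{
    assert(d != NULL && buf != NULL);

    ssize_t n = pwrite(d->fd, buf, len, off);
    if (n < 0) {
        return -1;
    }
    if (!d->writeback && fsync(d->fd) < 0) {
        return -1;
    }
    return n;
}

/*
 * Switch to writethrough. Data written earlier in writeback mode may still
 * be cached, so it is flushed before the mode changes; this mirrors the
 * "flush before dropping the cache flag" ordering mentioned above.
 */
static int disk_set_writethrough(struct disk *d)
{
    assert(d != NULL);

    if (d->writeback && fsync(d->fd) < 0) {
        return -1;
    }
    d->writeback = false;
    return 0;
}
```

A caller would open a file, fill in `struct disk` with `writeback = true`, and call `disk_set_writethrough()` when the guest asks for writethrough. Opening the file with O_DSYNC instead would move the same per-write flush into the kernel, which is the old-QEMU behaviour Paolo describes.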