> >> Moreover, using a host file system not only adds overhead, but
> >> also introduces data integrity issues. Specifically, if I/Os uses O_DSYNC,
> >> it may be too slow. If I/Os use O_DIRECT, it cannot guarantee data
> >> integrity in the event of a host crash. See
> >> http://lwn.net/Articles/348739/ .
> >
> > You have the same issue with O_DIRECT when using a raw disk device
> > too. That is, O_DIRECT on a raw device does not guarantee integrity
> > in the event of a host crash either, for mostly the same reasons.
>
> QEMU has semantics that use O_DIRECT safely; there is no issue here.
> When a drive is added with cache=none, QEMU not only uses O_DIRECT but
> also advertises an enabled write cache to the guest.
>
> The guest *must* flush the cache when it wants to ensure data is
> stable. In the event of a host crash, all, some, or none of the I/O
> since the last flush may have made it to disk. Each of these
> possibilities is fair game since the guest may only depend on writes
> being on disk if they completed and a successful flush was issued
> afterwards.

Thank you both for the explanation, which is very helpful to me.

Since FVD can eliminate the host file system and store the image
directly on a logical volume, perhaps we can always use O_DSYNC: there
is little (or no?) LVM metadata that needs to be flushed on every write,
so O_DSYNC would not add overhead? I am not certain about this and would
appreciate confirmation. If it is true, the guest would not need to
flush the cache.
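
To make the comparison concrete, here is a minimal sketch of the two models in C. The function names are hypothetical, and an ordinary file path stands in for the logical volume. O_DIRECT is left out because it requires aligned buffers, so this is not how QEMU itself issues I/O. The first function relies on O_DSYNC so that each write completes with synchronized data integrity. The second writes without it and then flushes explicitly, which mirrors the guest's write-then-flush obligation described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Model 1 (hypothetical helper): open with O_DSYNC, so write() returns
 * only once the data meets POSIX synchronized I/O data integrity
 * completion. No separate flush call is issued.
 * Returns 0 on success, -1 on any error or short write. */
int write_dsync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_DSYNC, 0600);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    int rc = (n == (ssize_t)len) ? 0 : -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Model 2 (hypothetical helper): write without O_DSYNC, then flush
 * explicitly. The data may be relied upon only after both the write
 * and the following fdatasync() have succeeded.
 * Returns 0 on success, -1 on any error or short write. */
int write_then_flush(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    int rc = 0;
    ssize_t n = write(fd, buf, len);
    if (n != (ssize_t)len)
        rc = -1;
    else if (fdatasync(fd) != 0)   /* the explicit "flush" step */
        rc = -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

In the first model every write pays the cost of reaching stable storage. In the second, several writes can share one fdatasync(), which is why the flush-based model can be cheaper when the caller batches its writes.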