At Fri, 22 Oct 2010 10:47:44 +0200,
Kevin Wolf wrote:
>
> On 22.10.2010 07:43, MORITA Kazutaka wrote:
> > At Thu, 21 Oct 2010 16:07:28 +0200,
> > Kevin Wolf wrote:
> >>
> >> Hi all,
> >>
> >> I'm currently looking into adding a return value to qemu's bdrv_flush
> >> function and I noticed that your block drivers (nbd, rbd and sheepdog)
> >> don't implement bdrv_flush at all. bdrv_flush is going to return
> >> -ENOTSUP for any block driver not implementing this, effectively
> >> breaking these three drivers for anything but cache=unsafe.
> >>
> >> Is there a specific reason why your drivers don't implement this? I
> >> think I remember that one of the drivers always provides
> >> cache=writethrough semantics. It would be okay to silently "upgrade" to
> >> cache=writethrough, so in this case I'd just need to add an empty
> >> bdrv_flush implementation.
> >>
> >> Otherwise, we really cannot allow any option except cache=unsafe because
> >> that's the semantics provided by the driver.
> >>
> >> In any case, I think it would be a good idea to implement a real
> >> bdrv_flush function to allow the write-back cache modes cache=off and
> >> cache=writeback in order to improve performance over writethrough.
> >>
> >> Is this possible with your protocols, or can the protocol be changed to
> >> consider this? Any hints on how to proceed?
> >>
> >
> > It is a bit difficult to implement an effective bdrv_flush in the
> > sheepdog block driver. Sheepdog virtual disks are split and
> > distributed to all cluster servers, so the block driver needs to send
> > flush requests to all of them. I'm not sure this could improve
> > performance more than writethrough semantics.
>
> It could probably be optimized so that you only send flush requests to
> servers that have actually received write requests since the last flush.
>
> But yes, that's probably a valid point. I guess there's only one way to
> find out how it performs: Trying it out.

Agreed, I'll try it out.

>
> > So I think it is better to support only writethrough semantics
> > currently (I'll modify the sheepdog server code to open stored objects
> > with O_SYNC or O_DIRECT) and leave write-back semantics as future
> > work.
>
> I agree, that makes sense.
>
> Note that O_DIRECT does not provide write-through semantics. It bypasses
> the page cache, but it doesn't flush other caches like a volatile disk
> write cache. If you want to use it, you still need explicit flushes or
> O_DIRECT | O_SYNC.

Thanks for your comment. I've modified the server code to use O_SYNC, so
sheepdog now always provides cache=writethrough semantics.

Kazutaka
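
P.S. For readers following the O_SYNC point above, here is a rough sketch
of what opening an object with write-through semantics can look like on
POSIX. It is only an illustration under stated assumptions, not the actual
sheepdog server code: the helper names open_object_sync and write_all are
hypothetical, and the file mode is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the sheepdog sources): open an object
 * file with O_SYNC, so that each write() returns only after the data
 * has been handed to stable storage, as far as the OS can tell. */
static int open_object_sync(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);
}

/* Write the whole buffer, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error (errno is set by write()). */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With O_DIRECT alone the open() call would bypass the page cache but, as
Kevin notes, would still need explicit flushes (or O_DIRECT | O_SYNC) to
cover a volatile disk write cache. O_DIRECT also imposes buffer alignment
requirements that this sketch does not deal with.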