On Fri, 22 Oct 2010, Kevin Wolf wrote:
> [ Adding qemu-devel to CC again ]
>
> Am 21.10.2010 20:59, schrieb Sage Weil:
> > On Thu, 21 Oct 2010, Christian Brunner wrote:
> >> Hi,
> >>
> >> is there a flush operation in librados? - I guess the only way to
> >> handle this would be waiting until all aio requests are finished?
>
> That's not the semantics of bdrv_flush; you don't need to wait for
> running requests. You just need to make sure that all completed requests
> are safe on disk, so that they would persist even in case of a
> crash/power failure.
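In file-backed terms, that contract is roughly: a completed write() may still sit in the host page cache, and it is the flush that makes all *completed* writes durable; in-flight requests are not waited for. A rough sketch with made-up names (img_write/img_flush are illustrative stand-ins, not qemu API):

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical file-backed driver state, for illustration only. */
struct img {
    int fd;
};

/* A completed write: the data has left the driver, but may still sit
 * in the host page cache, so it is not yet crash-safe. */
static ssize_t img_write(struct img *im, const void *buf, size_t len,
                         off_t off)
{
    return pwrite(im->fd, buf, len, off);
}

/* bdrv_flush analogue: make every previously completed write durable.
 * It does not wait for in-flight requests; callers only rely on
 * completed ones persisting across a crash or power failure. */
static int img_flush(struct img *im)
{
    return fsync(im->fd) == 0 ? 0 : -1;
}
```

A driver like librados that only signals completion once the data is stable never has anything left to do here, which is why an empty flush would be correct for it.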
Okay, in that case we're fine. librados doesn't declare a write committed
until it is safely on disk on multiple backend nodes. There is a mechanism
to get an ack sooner, but the qemu storage driver does not use it.

> > There is no flush currently. But librados does no caching, so in this
> > case at least silently upgrading to cache=writethrough should work.
>
> You're making sure that the data can't be cached in the server's page
> cache or volatile disk cache either, e.g. by using O_SYNC for the image
> file? If so, upgrading would be safe.

Right.

> > If that's a problem, we can implement a flush. Just let us know.
>
> Presumably providing a writeback mode with explicit flushes could
> improve performance. Upgrading to writethrough is not a correctness
> problem, though, so it's your decision if you want to implement it.

So is a bdrv_flush generated when e.g. the guest filesystem issues a
barrier, or would otherwise normally ask a SATA disk to flush its cache?

sage

> Kevin
>
> >> ---------- Forwarded message ----------
> >> From: Kevin Wolf <kw...@redhat.com>
> >> Date: 2010/10/21
> >> Subject: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and
> >> sheepdog
> >> To: Christian Brunner <c...@muc.de>, Laurent Vivier
> >> <laur...@vivier.eu>, MORITA Kazutaka <morita.kazut...@lab.ntt.co.jp>
> >> Cc: Qemu-devel@nongnu.org
> >>
> >>
> >> Hi all,
> >>
> >> I'm currently looking into adding a return value to qemu's bdrv_flush
> >> function and I noticed that your block drivers (nbd, rbd and sheepdog)
> >> don't implement bdrv_flush at all. bdrv_flush is going to return
> >> -ENOTSUP for any block driver not implementing it, effectively
> >> breaking these three drivers for anything but cache=unsafe.
> >>
> >> Is there a specific reason why your drivers don't implement this? I
> >> think I remember that one of the drivers always provides
> >> cache=writethrough semantics.
> >> It would be okay to silently "upgrade" to
> >> cache=writethrough, so in this case I'd just need to add an empty
> >> bdrv_flush implementation.
> >>
> >> Otherwise, we really cannot allow any option except cache=unsafe,
> >> because that's the semantics provided by the driver.
> >>
> >> In any case, I think it would be a good idea to implement a real
> >> bdrv_flush function to allow the write-back cache modes cache=off and
> >> cache=writeback in order to improve performance over writethrough.
> >>
> >> Is this possible with your protocols, or can the protocol be changed
> >> to consider this? Any hints on how to proceed?
> >>
> >> Kevin
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> the body of a message to majord...@vger.kernel.org
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
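For reference, a sketch of the two cases Kevin describes: a driver whose protocol already provides writethrough semantics registers an empty flush callback, while a driver that leaves the callback unset would hit the new -ENOTSUP path. The types and names below are stand-ins for illustration, not the real qemu headers:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for qemu's BlockDriverState/BlockDriver, illustration only. */
typedef struct BDRVState BDRVState;

typedef struct BlockDriver {
    const char *format_name;
    int (*bdrv_flush)(BDRVState *bs);
} BlockDriver;

/* For a protocol that only completes writes once they are stable on
 * disk (e.g. librados commit semantics), flush is a no-op: there is
 * never completed-but-volatile data to push out. */
static int drv_flush_writethrough(BDRVState *bs)
{
    (void)bs;
    return 0;
}

static const BlockDriver bdrv_stable = {
    .format_name = "stable-proto",
    .bdrv_flush  = drv_flush_writethrough,
};

/* The generic layer returns -ENOTSUP when no callback is set, which is
 * what would break nbd, rbd and sheepdog for anything but cache=unsafe. */
static int generic_flush(const BlockDriver *drv, BDRVState *bs)
{
    if (!drv->bdrv_flush)
        return -ENOTSUP;
    return drv->bdrv_flush(bs);
}
```

A real write-back implementation would replace the no-op with a protocol round trip that waits until all previously completed writes are stable on the server.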