On 08.09.2016 at 22:46, ashish mittal wrote:
> Hi Kevin,
>
> By design, our writeback cache is on a non-volatile SSD device. We do
> async writes to this cache and also maintain a persistent index map of
> the data written. This gives us the capability to recover the
> write-back cache if needed.
So your server application uses something like O_SYNC to write data to
the cache SSD, in order to ensure that the data is not sitting only in
the kernel page cache or in a volatile write cache of the SSD?

Kevin

> On Thu, Sep 8, 2016 at 7:20 AM, Kevin Wolf <kw...@redhat.com> wrote:
> > On 08.09.2016 at 16:00, Jeff Cody wrote:
> >> > >> +/*
> >> > >> + * This is called by QEMU when a flush gets triggered from within
> >> > >> + * a guest at the block layer, either for IDE or SCSI disks.
> >> > >> + */
> >> > >> +int vxhs_co_flush(BlockDriverState *bs)
> >> > >> +{
> >> > >> +    BDRVVXHSState *s = bs->opaque;
> >> > >> +    uint64_t size = 0;
> >> > >> +    int ret = 0;
> >> > >> +
> >> > >> +    ret = qemu_iio_ioctl(s->qnio_ctx,
> >> > >> +            s->vdisk_hostinfo[s->vdisk_cur_host_idx].vdisk_rfd,
> >> > >> +            VDISK_AIO_FLUSH, &size, NULL, IIO_FLAG_SYNC);
> >> > >> +
> >> > >> +    if (ret < 0) {
> >> > >> +        /*
> >> > >> +         * Currently not handling the flush ioctl
> >> > >> +         * failure because of network connection
> >> > >> +         * disconnect. Since all the writes are
> >> > >> +         * committed into persistent storage, this
> >> > >> +         * flush call is a noop and we can safely
> >> > >> +         * return success status to the caller.
> >> > >
> >> > > I'm not sure I understand here. Are you saying the qemu_iio_ioctl()
> >> > > call above is a noop?
> >> >
> >> > Yes, qemu_iio_ioctl(VDISK_AIO_FLUSH) is only a place-holder at
> >> > present, in case we later want to add some functionality to it. I
> >> > have now added a comment to this effect to avoid any confusion.
> >>
> >> The problem is that you don't know which version of the qnio library
> >> any given QEMU binary will be using, since it is a shared library.
> >> Future versions may implement the flush ioctl as expressed above, in
> >> which case we may hide a valid error.
> >>
> >> Am I correct in assuming that this call suppresses errors because an
> >> error is returned for an unknown ioctl operation of VDISK_AIO_FLUSH?
> >> If so, and you want a placeholder here for flushing, you should go
> >> all the way and stub out the underlying ioctl call to return success.
> >> Then QEMU can at least rely on the error return from the flush
> >> operation.
> >
> > So what's the story behind the missing flush command?
> >
> > Does the server always use something like O_SYNC, i.e. all potential
> > write caches in the stack operate in a writethrough mode? So each
> > write request is only completed successfully if it is ensured that the
> > data is safe on disk rather than in a volatile writeback cache?
> >
> > As soon as any writeback cache can be involved (e.g. the kernel page
> > cache or a volatile disk cache) and there is no flush command (a real
> > one, not just a stubbed one), the driver is not operating correctly
> > and therefore not ready for inclusion.
> >
> > So Ashish, can you tell us something about the caching behaviour
> > across the storage stack when vxhs is involved?
> >
> > Kevin