Mark McLoughlin wrote:
> i.e. with write-back caching enabled, the IDE protocol makes no
> guarantees about when data is committed to disk.
>
> So, from a protocol correctness POV, qemu is behaving correctly with
> cache=on and write-back caching enabled on the disk.

Yes, the host page cache is basically a big on-disk cache.

>> For SCSI, an unordered queue is advertised. Again, everything depends
>> on whether or not write-back caching is enabled. Again, perfectly
>> happy to take patches here.

> Queue ordering and write-back caching sound like very different things.
> Are they two distinct SCSI options, or ...?

Yes.

> Surely an ordered queue doesn't do much to help prevent fs corruption
> if the host crashes, right? You would still need write-back caching
> disabled?

You need both. In theory, a guest would use queue ordering to guarantee that certain writes make it to disk before other writes. Enabling write-through guarantees that the data is actually on disk. Since we advertise an unordered queue, we're okay from a safety point of view, but for performance reasons we'll want to do ordered queuing.
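
To make the "you need both" point concrete, here's a userspace analogy (just a sketch of the general idea, nothing to do with qemu's code; the file name and record contents are made up). Ordering only determines which write is issued first; an explicit flush is what guarantees the data has actually reached stable storage:

/*
 * Userspace analogy only -- not guest or qemu code.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("journal.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    const char record[] = "journal record\n";
    const char commit[] = "commit record\n";

    /* Ordering: the journal record is written before the commit record... */
    if (write(fd, record, strlen(record)) < 0)
        perror("write record");

    /*
     * ...but only a flush guarantees it is on stable storage before the
     * commit is issued.  Without it, both writes may sit in a volatile
     * cache and be lost (or become durable out of order) on a crash.
     */
    fdatasync(fd);

    if (write(fd, commit, strlen(commit)) < 0)
        perror("write commit");
    fdatasync(fd);   /* the commit record itself must also reach the disk */

    close(fd);
    return 0;
}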

>> More importantly, the most common journaled filesystem, ext3, does not
>> enable write barriers by default (even for journal updates). This is
>> how it ships in Red Hat distros.

> i.e. implementing barriers for virtio won't help most ext3 deployments?

Yes, ext3 doesn't use barriers by default. See http://kerneltrap.org/mailarchive/linux-kernel/2008/5/19/1865314

> And again, if barriers are just about ordering, don't you need to
> disable caching anyway?

Well, virtio doesn't have a notion of write-caching. My thinking is that we ought to implement barriers via fdatasync, because posix-aio already has an op for it. This would effectively use a barrier as a point at which to force data onto disk. I think this would take care of most of the data corruption issues, since the cases where the guest actually cares about corruption would be handled (barriers should be used for journal writes and any O_DIRECT write, for instance, although, yeah, that's not the case today with ext3).
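
Roughly, I'm thinking of something along these lines (just a sketch, not the real qemu code; submit_barrier() and "disk.img" are made-up names, and you'd link with -lrt). A barrier from the guest becomes an aio_fsync() on the backing image, and the barrier is only completed back to the guest once the flush finishes:

/*
 * Sketch only: mapping a guest barrier onto posix-aio's fsync op.
 */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int submit_barrier(int image_fd, struct aiocb *cb)
{
    memset(cb, 0, sizeof(*cb));
    cb->aio_fildes = image_fd;
    /* O_DSYNC asks for an fdatasync()-style flush of the image file */
    return aio_fsync(O_DSYNC, cb);
}

int main(void)
{
    int fd = open("disk.img", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct aiocb cb;
    if (submit_barrier(fd, &cb) < 0) {
        perror("aio_fsync");
        return 1;
    }

    /* Only complete the barrier back to the guest once the flush is done */
    const struct aiocb *const list[1] = { &cb };
    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);

    printf("barrier flush returned %d\n", (int)aio_return(&cb));
    close(fd);
    return 0;
}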

>> So there is no greater risk of corrupting a journal in QEMU than there
>> is on bare metal.

> This is the bit I really don't buy - we're equating qemu caching to IDE
> write-back caching and saying the risk of corruption is the same in both
> cases.

Yes.

> But doesn't qemu cache data for far, far longer than a typical IDE disk
> with write-back caching would do? Doesn't that mean you're far, far more
> likely to see fs corruption with qemu caching?

It caches more data; I don't know how much longer it caches it than a typical IDE disk does. The guest can crash and that won't cause data loss. The only thing that will really cause data loss is the host crashing, so it's slightly better than write-back caching in that regard.

> Or put it another way, if we fix it by implementing the disabling of
> write-back caching ... users running a virtual machine will need to run
> "hdparm -W 0 /dev/sda" where they would never have run it on bare metal?

I don't see it as something needing to be fixed because I don't see that the exposure is significantly greater for a VM than for a real machine.

And let's take a step back too. If people are really concerned about this point, let's introduce a sync=on option that opens the image with O_SYNC. This will effectively make the cache write-through without the baggage associated with O_DIRECT.

While I object to libvirt always setting cache=off, I think sync=on for IDE and SCSI may be reasonable (you don't want it for virtio-blk once we implement proper barriers with fdatasync, I think).
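
To spell out the difference (again, only a sketch; open_image() is a made-up helper, not qemu's block layer code): cache=off means O_DIRECT, which bypasses the host page cache entirely, while sync=on would keep the page cache for reads but make writes write-through with O_SYNC:

#define _GNU_SOURCE          /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0           /* O_DIRECT is Linux-specific */
#endif

static int open_image(const char *path, int cache_off, int sync_on)
{
    int flags = O_RDWR;

    if (cache_off)
        flags |= O_DIRECT;   /* cache=off: bypass the host page cache; alignment rules apply */
    if (sync_on)
        flags |= O_SYNC;     /* sync=on: cached reads, but writes hit disk before returning */

    return open(path, flags);
}

int main(void)
{
    /* "disk.img" is just a placeholder image file */
    int fd = open_image("disk.img", 0, 1);   /* sync=on behaviour */
    if (fd < 0) {
        perror("open_image");
        return 1;
    }
    close(fd);
    return 0;
}

The point is that sync=on gives the write-through guarantee without O_DIRECT's alignment and performance baggage.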

Regards,

Anthony Liguori

> Cheers,
> Mark.


