On Fri, Jun 1, 2012 at 4:06 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> On Fri, Jun 1, 2012 at 6:22 AM, Zhi Yong Wu <zwu.ker...@gmail.com> wrote:
>> On Thu, May 31, 2012 at 5:26 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>>> On Wed, May 30, 2012 at 9:31 AM, Zhi Yong Wu <zwu.ker...@gmail.com> wrote:
>>>> On Sat, May 12, 2012 at 12:48 AM, Kevin Wolf <kw...@redhat.com> wrote:
>>>>> A prerequisite for a "QED mode" in qcow2, which doesn't update the
>>>>> refcount
>>>> Recently some new concepts such as "QED mode" in qcow2 are seen
>>>> frequently, can anyone explain what it means? thanks.
>>>
>>> qcow2 has more metadata than qed.  More metadata means more write
>>> operations when allocating new clusters.
>>>
>>> In order to overcome this performance issue qcow2 has a metadata
>>> cache.  But when QEMU is launched with -drive ...,cache=writethrough
>>> (the default) the metadata cache *must* be in writethrough mode
>> Why must it be? If the option -drive ..,cache=writethrough is
>> specified, it means that host page cache is on while guest disk cache
>> is off. Since the metadata cache exists in host page cache, not guest,
>> I think that it is in writeback mode.
>
> Since the emulated disk write cache is off, we must ensure that guest
> writes are on disk before completing them.  Therefore we cannot cache
> metadata updates in host RAM - it would be lost on power failure but
But host page cache is *on* in this mode, which means that metadata
should be cached in host RAM. How do you explain this?
> we promised the guest its writes reached the disk!
>
>>> instead of writeback mode.  In other words, every metadata update
>>> needs to be written to the image file before we complete the guest's
>> What does it mean for a guest's write request to be completed?
>
> For example, virtio-blk fills in the success status code and raises an
> interrupt.  This notifies the guest that the write is done.
Great, thanks.
>
>>> write request.  This means the metadata cache only hides the metadata
>>> performance issue when -drive ...,cache=direct|writeback are used
>>> because there we can keep metadata changes buffered in memory until
>>> the guest flushes the emulated disk write cache.
>>>
>>> "QED mode" is a solution for -drive ...,cache=writethrough|directsync.
>>> It simply doesn't update refcount metadata in the qcow2 image file
Does the l1/l2 info still need to be updated in the qcow2 image file?
>>> immediately in exchange for a refcount fixup step that is introduced
>> Can you explain this in more detail? Why is this step needed only when
>> the image file is opened? After the image file is opened, and some guest
>> write requests are completed, maybe the refcount fixup step needs to be
>> done once.
>
> If we don't update refcounts on disk then they become outdated and no
> longer reflect the true allocation information.  It's not safe to rely
> on outdated refcount information since we could allocate the same
> cluster multiple times - this means data corruption.  By running a
> consistency check when opening a dirty image file we guarantee that we
> have accurate refcount information again.
Ah, I got it now.
>
> As an optimization we will commit refcount information to disk when
> closing the image file and mark it clean.  This means a clean QEMU
> shutdown does not require a consistency check on startup - but in the
> worst case (power failure or crash) we will have a dirty image file.
Yeah, a consistency check on startup is good, I think. Thanks.
>
> Stefan

-- 
Regards,

Zhi Yong Wu
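
To check my understanding of the dirty flag and the refcount fixup step
described above, here is a rough Python sketch. It is only a toy model, not
the qcow2 on-disk format or QEMU code: the names ToyImage and
LazyRefcountDriver are made up, refcounts cover data clusters only, and it
assumes that the guest-to-host mappings (standing in for the l2 tables) are
still written to the image on every allocation.

```python
class ToyImage:
    """Stands in for the on-disk state of an image (hypothetical layout)."""

    def __init__(self, num_clusters):
        self.num_clusters = num_clusters
        self.dirty = False                   # dirty flag in the header
        self.l2 = {}                         # guest cluster -> host cluster
        self.refcounts = [0] * num_clusters  # refcounts as stored on disk


class LazyRefcountDriver:
    """Keeps refcount updates in memory only while the image is open."""

    def __init__(self, image):
        self.image = image
        if image.dirty:
            # Unclean shutdown: on-disk refcounts cannot be trusted.
            self.repair()
        self.mem_refcounts = list(image.refcounts)

    def repair(self):
        # Consistency check: rebuild refcounts from the mappings on disk.
        rebuilt = [0] * self.image.num_clusters
        for host in self.image.l2.values():
            rebuilt[host] += 1
        self.image.refcounts = rebuilt
        self.image.dirty = False

    def write(self, guest_cluster):
        if guest_cluster in self.image.l2:
            return self.image.l2[guest_cluster]
        # Mark the image dirty before allocating anything.
        self.image.dirty = True
        # First free cluster; raises ValueError if the toy image is full.
        host = self.mem_refcounts.index(0)
        self.mem_refcounts[host] = 1         # refcount updated in memory only
        self.image.l2[guest_cluster] = host  # mapping written to the image
        return host

    def close(self):
        # Clean shutdown: commit refcounts and mark the image clean.
        self.image.refcounts = list(self.mem_refcounts)
        self.image.dirty = False
```

If the driver object is dropped without calling close(), that models a crash:
the image stays dirty with stale refcounts, and the next open rebuilds them
before any new cluster is allocated, so the same cluster is not handed out
twice.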