The QEMU Enhanced Disk (QED) format is a disk image format that forgoes features found in qcow2 in favor of better performance and data integrity. Due to its simpler on-disk layout, it is possible to safely perform metadata updates more efficiently.

Installations, suspend-to-disk, and other allocation-heavy I/O workloads will see increased performance due to fewer I/Os and syncs. Workloads that do not cause new clusters to be allocated will perform similarly to raw images due to in-memory metadata caching.

The format supports sparse disk images. It does not rely on the host filesystem holes feature, making it a good choice for sparse disk images that need to be transferred over channels where holes are not supported.

Backing files are supported so only deltas against a base image can be stored. The base image may be smaller than the image file.

The file format is extensible so that additional features can be added later with graceful compatibility handling. A specification for the file format is included in this patchset.

Internal snapshots are not supported. This eliminates the need for additional metadata to track copy-on-write clusters.

Compression and encryption are not supported. They add complexity and can be implemented at other layers in the stack (i.e. inside the guest or on the host). Encryption has been identified as a potential future extension and the file format allows for this.
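
To illustrate the backing file behaviour described above (the base image may be smaller than the image file, so reads that run past the end of the base image see zeroes), here is a minimal standalone sketch. It is not code from this patch: the helper name read_backing_zero_eof() and the in-memory "backing image" are assumptions made purely for illustration, and the real driver goes through the block layer rather than memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): read len bytes at offset
 * from an in-memory backing image of backing_size bytes. Bytes that lie
 * beyond the end of the backing image are returned as zeroes.
 */
static void read_backing_zero_eof(const uint8_t *backing,
                                  uint64_t backing_size,
                                  uint64_t offset,
                                  uint8_t *buf, size_t len)
{
    size_t avail = 0;

    if (len == 0) {
        return;
    }
    assert(buf != NULL);

    /* Copy the part of the request that overlaps the backing image */
    if (offset < backing_size) {
        uint64_t remaining = backing_size - offset;

        avail = remaining < len ? (size_t)remaining : len;
        memcpy(buf, backing + offset, avail);
    }

    /* Zero-fill the part of the request beyond the backing image's EOF */
    memset(buf + avail, 0, len - avail);
}
```

A request that straddles the end of the base image gets real data for the overlapping bytes and zeroes for the rest; a request entirely beyond the end gets only zeroes.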
Signed-off-by: Anthony Liguori <aligu...@us.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
---
This code is also available from git:

http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/qed-v4

I have preserved distinct commits against v3 for easier reviewing here:

http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/qed-v4-presquash

v4:
 * Use bdrv_*() instead of POSIX APIs to create an image file
 * Lift the non-zero image size restriction
 * Fix qed.c/qed.h style comments from Kevin

v3:
 * Flush before L2 update when a backing file is used
 * Use QED_F_BACKING_FORMAT_NO_PROBE instead of backing_fmt header field
 * Allow non-cluster sized images
 * Implement autoclear feature bits
 * Implement backing image smaller size - reads from backing image should
   zero beyond EOF
 * Preserve errno in qed_find_cluster_cb() - don't dumb down to
   QED_CLUSTER_ERROR
 * Use ffs() instead of get_bits_from_size()
 * Remove l2_cache argument to qed_unref_l2_cache_entry
 * Eliminate L2TableAllocFunc function pointer
 * Split qed_aio_write in-place and allocating code path to make code clearer
 * Document how L2 cache is used
 * Document qed_find_cluster()
 * Update QED specification
 * Fix COPYING.LIB LGPL license file references
 * Add copyright header to qed-check.c
 * Avoid the bytes_to_str()/cvtstr()/sztostr() dependency until Jes'
   strtosz() goes in

v2:
 * Add QED format specification to documentation
 * Use __builtin_ctzl() for get_bits_from_size()
 * Fine-grained table locking to allow concurrent allocating write requests
 * Fix qemu_free() instead of qemu_vfree() in qed_unref_l2_cache_entry()
 * Comment clean-ups

 Makefile.objs           |    2 +
 block/qed-check.c       |  210 +++++++
 block/qed-cluster.c     |  154 ++++++
 block/qed-gencb.c       |   32 ++
 block/qed-l2-cache.c    |  173 ++++++
 block/qed-lock.c        |  124 +++++
 block/qed-table.c       |  317 +++++++++++
 block/qed.c             | 1382 +++++++++++++++++++++++++++++++++++++++++++++++
 block/qed.h             |  315 +++++++++++
 block_int.h             |    1 +
 docs/specs/qed_spec.txt |  128 +++++
 trace-events            |   21 +
 12 files changed, 2859 insertions(+), 0 deletions(-)