On 10/18/2012 03:51 AM, Dong Xu Wang wrote:
> Document for add-cow format, the usage and spec of add-cow are introduced.
> 
> Signed-off-by: Dong Xu Wang <wdon...@linux.vnet.ibm.com>
> ---
>  docs/specs/add-cow.txt |  139 
> ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 139 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
> 
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..dc1e107
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,139 @@
> +== General ==
> +
> +The raw file format does not support backing files or copy on write feature.
> +The add-cow image format makes it possible to use backing files with raw

s/with raw/with a raw/

> +image by keeping a separate .add-cow metadata file. Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.
> +
> +An example usage of add-cow would look like::
> +(ubuntu.img is a disk image which has been installed OS.)

s/has been installed/has an installed/

> +    1)  Create a raw image with the same size of ubuntu.img
> +            qemu-img create -f raw test.raw 8G
> +    2)  Create an add-cow image which will store dirty bitmap
> +            qemu-img create -f add-cow test.add-cow \
> +                -o backing_file=ubuntu.img,image_file=test.raw
> +    3)  Run qemu with add-cow image
> +            qemu -drive if=virtio,file=test.add-cow
> +
> +test.raw may be larger than ubuntu.img, in that case, the size of 
> test.add-cow
> +will be calculated from the size of test.raw.
> +
> +=Specification=
> +
> +The file format looks like this:
> +
> + +---------------+-------------+-----------------+
> + |     Header    |   Reserved  |    COW bitmap   |
> + +---------------+-------------+-----------------+

Since you call out what all 4096 bytes must be in the Header section
(all bytes not occupied by a backing file name or format must be NUL),
do your really need Reserved in this section, or can you just claim that
the 4096-byte header is directly followed by the COW bitmap?

> +
> +All numbers in add-cow are stored in Little Endian byte order.
> +
> +== Header ==
> +
> +The Header is included in the first bytes:
> +(#define HEADER_SIZE (4096 * header_size))
> +    Byte    0 -  7:     magic
> +                        add-cow magic string ("ADD_COW\xff").
> +
> +            8 -  11:    version
> +                        Version number (only valid value is 1 now).
> +
> +            12 - 15:    backing file name offset
> +                        Offset in the add-cow file at which the backing file
> +                        name is stored (NB: The string is not 
> nul-terminated).
> +                        If backing file name does NOT exist, this field will 
> be
> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file 
> name
> +                        must be at least 1 byte).
> +
> +            16 - 19:    backing file name size
> +                        Length of the backing file name in bytes. It will be > 0
> +                        if the backing file name offset is 0. If backing file
> +                        name offset is non-zero, then it must be non-zero. 
> Must
> +                        be less than [HEADER_SIZE - 80] to fit in the 
> reserved
> +                        part of the header.

More specifically, it must be small enough so that offset+size <=
HEADER_SIZE.

> +
> +            20 - 23:    image file name offset
> +                        Offset in the add-cow file at which the image file 
> name
> +                        is stored (NB: The string is not null terminated). It
> +                        must be between 80 and [HEADER_SIZE - 2].
> +
> +            24 - 27:    image file name size
> +                        Length of the image file name in bytes.
> +                        Must be less than [HEADER_SIZE - 80] to fit in the 
> reserved
> +                        part of the header.

More specifically, it must be small enough so that offset+size <=
HEADER_SIZE.

> +
> +            28 - 31:    cluster bits
> +                        Number of bits that are used for addressing an offset
> +                        within a cluster (1 << cluster_bits is the cluster 
> size).
> +                        Must not be less than 9 (i.e. 512 byte clusters).
> +
> +                        Note: qemu as of today has an implementation limit 
> of 2 MB
> +                        as the maximum cluster size and won't be able to 
> open images
> +                        with larger cluster sizes.
> +
> +            32 - 39:    features
> +                        Bitmask of features. An implementation can safely 
> ignore
> +                        any unknown bits that are set.

Really?  That sounds more like optional features, if an implementation
can ignore a set bit.  You really want to require that implementations
reject operations on a file with a feature bit set that they don't
recognize.

> +
> +                        Bit 0:      All allocated bit.  If this bit is set 
> then
> +                                    backing file and COW bitmap will not be 
> used,
> +                                    and can read from or write to image file 
> directly.

And this particular bit sounds like an optional feature - setting the
bit is an optimization in speed, but leaving the bit clear or ignoring
the bit when it is set does not change correctness.

> +
> +                        Bits 1-63:  Reserved (set to 0)
> +
> +            40 - 47:    optional features
> +                        Not used now. Reserved for future use. It must be 
> set to 0.
> +                        And must be ignored while reading.

s/. And must be ignored/, and ignored/

> +
> +            48 - 51:    header size
> +                        The header field is variable-sized. This field 
> indicates
> +                        how many 4096 bytes will be used to store add-cow 
> header.
> +                        In add-cow v1, it is fixed to 1, so the header size 
> will
> +                        be 4096 * 1 = 4096 bytes.

So is the value '1' or '4096' in this field?  The wording isn't quite
clear.  But reading elsewhere, it looks like this should always be '1'
in version one add-cow.

> +
> +            52 - 67:    backing file format
> +                        Format of backing file. It will be filled with 0 if
> +                        backing file name offset is 0. If backing file name
> +                        offset is non-empty, it must be non-empty.

Are you going to enforce this?  Normally, if backing file format is
omitted, then qemu knows how to probe backing file format (with the
caveat that it is a security risk if the probe returns non-raw but the
backing file really was raw).

> It is coded
> +                        in free-form ASCII, and is not NUL-terminated. Zero
> +                        padded on the right.
> +
> +            68 - 83:    image file format
> +                        Format of image file. It must be non-empty. It is 
> coded
> +                        in free-form ASCII, and is not NUL-terminated. Zero
> +                        padded on the right.

Why do we need this field?  Isn't the image file ALWAYS raw?

> +
> +            84 - [HEADER_SIZE - 1]:

Elsewhere in your spec, you use 80 rather than 84 for the starting point
of valid offsets.  Which is it?

> +                        It is used to make sure COW bitmap field starts at 
> the
> +                        HEADER_SIZE byte, backing file name and image file 
> name
> +                        will be stored here. The bytes that is not pointing 
> to
> +                        backing file and image file names must be set to 0.

Not just file names, but also backing file format.

> +
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related 
> to

Shouldn't that be HEADER_SIZE * number of header pages, since you
dedicated a field in the header for that purpose?

> +backing file and image file. The bitmap will track whether the sector in
> +backing file is dirty or not.
> +
> +Each bit in the bitmap tracks one cluster's status. For example, if cluster
> +bit is 16, then each bit tracks one cluster, (1 >> 16) = 65536 bytes. The 
> size

s/>>/<</

> +of bitmap is calculated according to virtual size of image file, and it must
> +be multiple of 65536, the bits not used will be set to 0. Within each byte,

What must be a multiple of 65536, the image file, or the size of the
bitmap in the add-cow file?  I think what you want is:

The image size is rounded up to cluster size (where any bytes in the
last cluster that do not fit in the image are ignored), then if the
number of clusters is not a multiple of 8, then remaining bits in the
bitmap will be set to 0.

Or do you really want to require that the bitmap is a multiple of 64k
bytes (at 8 bits per byte, that means the bitmap covers a multiple of
512k clusters, and at 512 bytes as the minimum cluster size, that the
add-cow file format manages a minimum of 256M)?  That is, are you
requiring that the bitmap end on an aligned boundary, to make the bitmap
easier to use without having to special case a short-read on the last
page of the bitmap?

> +the least significant bit covers the first cluster. Bit orders in one
> +byte look like:
> + +----+----+----+----+----+----+----+----+
> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
> + +----+----+----+----+----+----+----+----+
> +
> +If the bit is 0, indicates the sector has not been allocated in image file, 
> data
> +should be loaded from backing file while reading; if the bit is 1, indicates 
> the

s/indicates/it indicates/ (twice)

If there is no backing file, or if the image file is larger than the
backing file and the offset is beyond the end of the backing file, then
the data should be read as all zero bytes instead.

> +related sector has been dirty, should be loaded from image file while 
> reading.
> +Writing to a sector causes the corresponding bit to be set to 1.
> +
> +If raw image is not an even multiple of cluster bytes, bits that correspond 
> to
> +bytes beyond the raw file size in add-cow must be written as 0 and must be
> +ignored when reading.
> +
> +Image file name and backing file name must NOT be the same, we prevent this
> +while creating add-cow files.

You prevent it when creating add-cow files via qemu-img, but that
doesn't stop malicious users from creating a file with those properties
and then trying to get you to parse it as add-cow.  I think this needs
to instead be a requirement on the consumer of a potentially bad file,
and not a requirement on the producer to avoid bad files, since you
can't control all producers, as in:
If image file name and backing file resolve to the same file, the
add-cow image must be treated as invalid.

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to