On 12/22/2015 09:46 AM, Kevin Wolf wrote:
> People have been keeping destroying their qcow2 images by using

Any of these sound better:

People have kept on destroying
People have been destroying
People keep on destroying

> 'qemu-img snapshot' on images that were in use by a VM. This patch adds
> some image locking that protects them against this mistake.
> 
> In order to achieve this, a new compatible header flag is introduced
> that tells that the image is currently in use. It is (almost) always set
> when qcow2 considers the image to be active and in a read-write mode.
> During live migration, the source considers the image active until the
> VM stops on migration completion. The destination considers it active as
> soon as it starts running the VM.
> 
> In cases where qemu wasn't shut down cleanly, images may incorrectly
> refuse to open. An option override-lock=on is provided to force opening
> the image (this is the option that qemu-img uses for 'force-unlock' and
> 'check --force').
> 
> A few test cases have to be adjusted, either to update error messages,
> to use read-only mode to avoid the check, or to override the lock where
> necessary.
> 
> Signed-off-by: Kevin Wolf <kw...@redhat.com>
> ---

> diff --git a/block/qcow2.c b/block/qcow2.c
> index 544c124..c07a078 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -307,6 +307,8 @@ int qcow2_mark_corrupt(BlockDriverState *bs)
>      BDRVQcow2State *s = bs->opaque;
>  
>      s->incompatible_features |= QCOW2_INCOMPAT_CORRUPT;
> +    s->compatible_features &= ~QCOW2_COMPAT_IN_USE;
> +

So the moment we detect something is wrong, we (attempt to) write the
corrupt bit but promise to do no further writes, so it makes sense that
we can claim we are no longer using the image.

>  
> @@ -472,6 +474,11 @@ static QemuOptsList qcow2_runtime_opts = {
>              .type = QEMU_OPT_NUMBER,
>              .help = "Clean unused cache entries after this time (in 
> seconds)",
>          },
> +        {
> +            .name = BDRV_OPT_OVERRIDE_LOCK,
> +            .type = QEMU_OPT_BOOL,
> +            .help = "Open the image read-write even if it is locked",
> +        },

Missing counterpart documentation in qapi/block-core.json
BlockdevOptionsQcow2.

> +    /* Protect against opening the image r/w twice at the same time */
> +    if (!bs->read_only && (s->compatible_features & QCOW2_COMPAT_IN_USE)) {
> +        /* Shared storage is expected during migration */
> +        bool migrating = (flags & BDRV_O_INCOMING);
> +
> +        if (!migrating && !s->override_lock) {
> +            error_set(errp, ERROR_CLASS_IMAGE_FILE_LOCKED,
> +                      "Image is already in use");
> +            error_append_hint(errp, "This check can be disabled "
> +                              "with override-lock=on. Caution: Opening an "
> +                              "image twice can cause corruption!");

Here's where I wondered in 6/10 if it is worth providing additional
information about the current lock owner; and that information would
come from [1]...

> @@ -1164,6 +1193,17 @@ static int qcow2_open(BlockDriverState *bs, QDict 
> *options, int flags,
>          }
>      }
>  
> +    /* Set advisory lock in the header (do this as the final step so that
> +     * failure doesn't leave a locked image around) */
> +    if (!bs->read_only && !(flags & BDRV_O_INCOMING) && s->qcow_version >= 
> 3) {
> +        s->compatible_features |= QCOW2_COMPAT_IN_USE;
> +        ret = qcow2_update_header(bs);

This says that we set the advisory bit even when override-lock; the only
purpose of override lock is to allow us to write to the image even if
the bit was already set.

I suppose the other choice would be that override-lock on means that we
don't bother to set the bit at all, leaving us with the qemu 2.5
behavior of not claiming the lock and making it easier to stomp on the
image - perhaps useful for regression testing, but probably not as safe
as a default.  So I can agree with how you implemented the override.

> @@ -1272,12 +1321,32 @@ fail:
>  
>  static void qcow2_reopen_commit(BDRVReopenState *state)
>  {
> +    /* We can't fail the commit, so if the header update fails, we may end up
> +     * not protecting the image even though it is writable now. This is okay,
> +     * the lock is a best-effort service to protect the user from shooting
> +     * themselves into the foot. */

s/into/in/

> +    if (state->bs->read_only && (state->flags & BDRV_O_RDWR)) {
> +        BDRVQcow2State *s = state->bs->opaque;
> +        s->compatible_features |= QCOW2_COMPAT_IN_USE;
> +        (void) qcow2_update_header(state->bs);
> +    }
> +
>      qcow2_update_options_commit(state->bs, state->opaque);
>      g_free(state->opaque);
>  }
>  
>  static void qcow2_reopen_abort(BDRVReopenState *state)
>  {
> +    /* We can't fail the abort, so if the header update fails, we may end up
> +     * not protecting the image any more. This is okay, the lock is a
> +     * best-effort service to protect the user from shooting themselves into

s/into/in/

> @@ -1708,6 +1777,16 @@ static int qcow2_inactivate(BlockDriverState *bs)
>          qcow2_mark_clean(bs);
>      }
>  
> +    if (!bs->read_only) {
> +        s->flags |= BDRV_O_INCOMING;
> +        s->compatible_features &= ~QCOW2_COMPAT_IN_USE;
> +        ret = qcow2_update_header(bs);
> +        if (ret < 0) {
> +            result = ret;
> +            error_report("Could not update qcow2 header: %s", 
> strerror(-ret));
> +        }
> +    }
> +
>      return result;
>  }
>  

> +++ b/docs/specs/qcow2.txt
> @@ -96,7 +96,12 @@ in the description of a field.
>                                  marking the image file dirty and postponing
>                                  refcount metadata updates.
>  
> -                    Bits 1-63:  Reserved (set to 0)
> +                    Bit 1:      Locking bit. If this bit is set, then the
> +                                image is supposedly in use by some process 
> and
> +                                shouldn't be opened read-write by another
> +                                process.

...[1] I'm wondering if we should add a new optional extension header
here, which records the hostname+pid (and maybe also the argv[0]) of the
process that set this bit.

If this bit is clear, the extension header can be ignored/deleted as
useless (or more likely, rewritten the moment we open the file
read-write because we'll be setting the bit again); if this bit is set
but the extension header is not present, we have no further information
to report beyond "file is locked".  But if this bit is set and the
extension header is present, then we can attempt to tell the user more
details about who last claimed to be writing to the file (which may help
the user decide if it really still is in use, or if the lock is left
over in error due to an abrupt exit).  That also implies that the act of
setting this bit should also default to adding the new extension header,
populated with useful information.

Overall, I like the direction this series is headed in!

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to