On 3/7/19 12:47 AM, Eric Blake wrote:
> Upcoming patches will add support for incremental backups via
> a new API; but first, we need a landing page that gives an
> overview of capturing various pieces of guest state, and which
> APIs are best suited to which tasks.
>
> Signed-off-by: Eric Blake <ebl...@redhat.com>
>
> ---
> v2: wording improvements based on review
> ---
> docs/docs.html.in | 5 +
> docs/domainstatecapture.html.in | 314 ++++++++++++++++++++++++++++++++
> docs/formatsnapshot.html.in | 2 +
> 3 files changed, 321 insertions(+)
> create mode 100644 docs/domainstatecapture.html.in
>
> diff --git a/docs/docs.html.in b/docs/docs.html.in
> index d0ff844d0c..3afd13080a 100644
> --- a/docs/docs.html.in
> +++ b/docs/docs.html.in
> @@ -121,6 +121,11 @@
>
> <dt><a href="secureusage.html">Secure usage</a></dt>
> <dd>Secure usage of the libvirt APIs</dd>
> +
> + <dt><a href="domainstatecapture.html">Domain state
> + capture</a></dt>
> + <dd>Comparison between different methods of capturing domain
> + state</dd>
> </dl>
> </div>
>
> diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in
> new file mode 100644
> index 0000000000..f7f2fe0b98
> --- /dev/null
> +++ b/docs/domainstatecapture.html.in
> @@ -0,0 +1,314 @@
> +<?xml version="1.0" encoding="UTF-8"?>
> +<!DOCTYPE html>
> +<html xmlns="http://www.w3.org/1999/xhtml">
> + <body>
> +
> + <h1>Domain state capture using Libvirt</h1>
> +
> + <ul id="toc"></ul>
> +
> + <p>
> + To help application developers choose which
> + operations best suit their needs, this page compares the
> + different means for capturing state related to a domain managed
> + by libvirt.
> + </p>
> +
> + <p>
> + The information here is primarily geared towards capturing the
> + state of an active domain. Capturing the state of an inactive
> + domain essentially amounts to copying the contents of guest
> + disks, followed by a fresh boot with disks restored to that
> + state. Some of the topics presented below may relate to inactive
> + state collection, but it is not the primary focus of this page.
Perhaps the last sentence is redundant or unnecessary; I don't mind either way.
> + </p>
> +
> + <h2><a id="definitions">State capture trade-offs</a></h2>
> +
> + <p>One of the features made possible with virtual machines is live
> + migration -- transferring all state related to the guest from
> + one host to another with minimal interruption to the guest's
> + activity. In this case, state includes domain memory (including
> + register and device contents), and domain storage (whether the
> + guest's view of the disks is backed by local storage on the
> + host, or by the hypervisor accessing shared storage over a
> + network). A clever observer will then note that if all state is
> + available for live migration, then there is nothing stopping a
> + user from saving some or all of that state at a given point of
> + time in order to be able to later rewind guest execution back to
> + the state it previously had. The astute reader will also realize
> + that state capture at any level requires that the data must be
> + stored and managed by some mechanism. This processing might fit
> + in a single file, or more likely require a chain of related
> + files, and may require synchronization with third-party tools
> + built around managing the amount of data resulting from
> + capturing the state of multiple guests that each use multiple
> + disks.
> + </p>
> +
> + <p>
> + There are several libvirt APIs associated with capturing the
> + state of a guest, which can later be used to rewind that guest
> + to the conditions it was in earlier. The following is a list of
> + trade-offs and differences between the various facets that
> + affect capturing domain state for active domains:
> + </p>
> +
> + <dl>
> + <dt>Duration</dt>
> + <dd>Capturing state can be a lengthy process, so while the
> + captured state ideally represents an atomic point in time
> + correpsonding to something the guest was actually executing,
s/correpsonding/corresponding/
> + capturing state tends to focus on minimizing guest downtime
> + while performing the rest of the state capture in parallel
> + with guest execution. Some interfaces require up-front
> + preparation (the state captured is not complete until the API
> + ends, which may be some time after the command was first
> + started), while other interfaces track the state when the
> + command was first issued, regardless of the time spent in
> + capturing the rest of the state. Also, time spent in state
> + capture may be longer than the time required for live
> + migration, when state must be duplicated rather than shared.
> + </dd>
> +
> + <dt>Amount of state</dt>
> + <dd>For an online guest, there is a choice between capturing the
> + guest's memory (all that is needed during live migration when
> + the storage is already shared between source and destination),
> + the guest's disk state (all that is needed if there are no
> + pending guest I/O transactions that would be lost without the
> + corresponding memory state), or both together. Reverting to
> + partial state may still be viable, but typically, booting from
> + captured disk state without corresponding memory is comparable
> + to rebooting a machine that had power cut before I/O could be
> + flushed. Guests may need to use proper journaling methods to
> + avoid problems when booting from partial state.
> + </dd>
> +
> + <dt>Quiescing of data</dt>
> + <dd>Even if a guest has no pending I/O, capturing disk state may
> + catch the guest at a time when the contents of the disk are
> + inconsistent. Cooperating with the guest to perform data
> + quiescing is an optional step to ensure that captured disk
> + state is fully consistent without requiring additional memory
> + state, rather than just crash-consistent. But guest
> + cooperation may also have time constraints, where the guest
> + can rightfully panic if there is too much downtime while I/O
> + is frozen.
> + </dd>
> +
> + <dt>Quantity of files</dt>
> + <dd>When capturing state, some approaches store all state within
> + the same file (internal), while others expand a chain of
> + related files that must be used together (external), for more
> + files that a management application must track.
> + </dd>
> +
> + <dt>Impact to guest definition</dt>
> + <dd>Capturing state may require temporary changes to the guest
> + definition, such as associating new files into the domain
> + definition. While state capture should never impact the
> + running guest, a change to the domain's active XML may have
> + impact on other host operations being performed on the domain.
> + </dd>
> +
> + <dt>Third-party integration</dt>
> + <dd>When capturing state, there are trade-offs in how much of the
> + process must be done directly by the hypervisor, and how much
> + can be off-loaded to third-party software. Since capturing
> + state is not instantaneous, it is essential that any
> + third-party integration see consistent data even if the
> + running guest continues to modify that data after the point in
> + time of the capture.</dd>
> +
> + <dt>Full vs. incremental</dt>
> + <dd>When periodically repeating the action of state capture, it
> + is useful to minimize the amount of state that must be
> + captured by exploiting the relation to a previous capture,
> + such as focusing only on the portions of the disk that the
> + guest has modified in the meantime. Some approaches are able
> + to take advantage of checkpoints to provide an incremental
> + backup, while others are only capable of a full backup even if
> + that means re-capturing unchanged portions of the disk.</dd>
> +
> + <dt>Local vs. remote</dt>
> + <dd>Domains that completely use remote storage may only need
> + some mechanism to keep track of guest memory state while using
> + external means to manage storage. Still, hypervisor and guest
> + cooperation to ensure points in time when no I/O is in flight
> + across the network can be important for properly capturing
> + disk state.</dd>
> +
> + <dt>Network latency</dt>
> + <dd>Whether it's domain storage or saving domain state into
> + remote storage, network latency has an impact on snapshot
> + data. Having dedicated network capacity, bandwidth, or quality
> + of service levels may play a role, as well as planning for how
> + much of the backup process needs to be local.</dd>
> + </dl>
> +
> + <p>
> + An example of the various facets in action is migration of a
> + running guest. In order for the guest to be able to resume on
> + the destination at the same place it left off at the source, the
> + hypervisor has to get to a point where execution on the source
> + is stopped, the last remaining changes occurring since the
> + migration started are then transferred, and the guest is started
> + on the target. The management software thus must keep track of
> + the starting point and any changes since the starting
> + point. These last changes are often referred to as dirty page
> + tracking or dirty disk block bitmaps. At some point in time
> + during the migration, the management software must freeze the
> + source guest, transfer the dirty data, and then start the guest
> + on the target. This period of time must be minimal. To minimize
> + overall migration time, one is advised to use a dedicated
> + network connection with a high quality of service. Alternatively,
> + saving the current state of the running guest can be a simple
> + point-in-time operation, which does not require updating the
> + "last vestiges" of state prior to writing out the saved state
> + file. The state file then reflects whatever was current at that
> + point in time, and may contain incomplete data; restarting the
> + guest from it could cause confusion or problems if some
> + operation had not yet completed at the time of the
> + capture.
> + </p>
> +
> + <h2><a id="apis">State capture APIs</a></h2>
> + <p>With those definitions, the following libvirt APIs related to
> + state capture have these properties:</p>
> + <dl>
> + <dt>virDomainManagedSave</dt>
Do you think it'd be worthwhile to modify to:
<code><a
href="html/libvirt-libvirt-domain.html#virDomainManagedSave">virDomainManagedSave</a></code>
> + <dd>This API saves guest memory, with libvirt managing all of
> + the saved state, then stops the guest. While stopped, the
> + disks can be copied by a third party. However, since any
> + subsequent restart of the guest by libvirt API will restore
> + the memory state (which typically only works if the disk state
> + is unchanged in the meantime), and since it is not possible to
> + get at the memory state that libvirt is managing, this is not
> + viable as a means for rolling back to earlier saved states,
> + but is rather more suited to situations such as suspending a
> + guest prior to rebooting the host in order to resume the guest
> + when the host is back up. This API also has a drawback of
> + potentially long guest downtime, and therefore does not lend
> + itself well to live backups.</dd>
> +
> + <dt>virDomainSave</dt>
<code><a
href="html/libvirt-libvirt-domain.html#virDomainSave">virDomainSave</a></code>
> + <dd>This API is similar to virDomainManagedSave(), but moves the
s/()// or add them above. I like without (), but don't really care.
Just make them all consistent - above and below.
> + burden on managing the stored memory state to the user. As
> + such, the user can now couple saved state with copies of the
> + disks to perform a revert to an arbitrary earlier saved state.
> + However, changing who manages the memory state does not change
> + the drawback of potentially long guest downtime when capturing
> + state.</dd>
> +
> + <dt>virDomainSnapshotCreateXML()</dt>
<code><a
href="html/libvirt-libvirt-domain-snapshot.html#virDomainSnapshotCreateXML">virDomainSnapshotCreateXML</a></code>
> + <dd>This API wraps several approaches for capturing guest state,
> + with a general premise of creating a snapshot (where the
> + current guest resources are frozen in time and a new wrapper
> + layer is opened for tracking subsequent guest changes). It
> + can operate on both offline and running guests, can choose
> + whether to capture the state of memory, disk, or both when
> + used on a running guest, and can choose between internal and
> + external storage for captured state. However, it is geared
> + towards post-event captures (when capturing both memory and
> + disk state, the disk state is not captured until all memory
> + state has been collected first). Using QEMU as the
> + hypervisor, internal snapshots currently have lengthy downtime
> + that is incompatible with freezing guest I/O, but external
> + snapshots are quick. Since creating an external snapshot
> + changes which disk image resource is in use by the guest, this
> + API can be coupled with <code>virDomainBlockCommit()</code> to
> + restore things back to the guest using its original disk
> + image, where a third-party tool can read the backing file
> + prior to the live commit. See also
> + the <a href="formatsnapshot.html">XML details</a> used with
> + this command.</dd>
> +
> + <dt>virDomainFSFreeze(), virDomainFSThaw()</dt>
<code><a
href="html/libvirt-libvirt-domain.html#virDomainFSFreeze">virDomainFSFreeze</a></code>,
<code><a
href="html/libvirt-libvirt-domain.html#virDomainFSThaw">virDomainFSThaw</a></code>
> + <dd>This pair of APIs does not directly capture guest state, but
> + can be used to coordinate with a trusted live guest that state
> + capture is about to happen, and therefore guest I/O should be
> + quiesced so that the state capture is fully consistent, rather
> + than merely crash consistent. Some APIs are able to
> + automatically perform a freeze and thaw via a flags parameter,
> + rather than having to make separate calls to these
> + functions. Also, note that freezing guest I/O is only possible
> + with trusted guests running a guest agent, and that some
> + guests place maximum time limits on how long I/O can be
> + frozen.</dd>
> +
> + <dt>virDomainBlockCopy()</dt>
<code><a
href="html/libvirt-libvirt-domain.html#virDomainBlockCopy">virDomainBlockCopy</a></code>
> + <dd>This API wraps approaches for capturing the disk state (but
> + not the accompanying memory state) of a running guest, and can
> + only operate on one block device per job. To get a consistent
> + copy of multiple disks, multiple jobs must be run in parallel,
> + then the domain
> + must be paused before ending all of the jobs. The capture is
> + consistent only at the end of the operation with a choice for
> + future guest changes to either pivot to the new file or to
> + resume to just using the original file. The resulting backup
> + file is thus the other file no longer in use by the
> + guest.</dd>
> +
> + <dt>virDomainCheckpointCreateXML()</dt>
<code><a
href="html/libvirt-libvirt-domain-checkpoint.html#virDomainCheckpointCreateXML">virDomainCheckpointCreateXML</a></code>
Since this one and the next two don't have links yet, I think that
rather than doing any sort of split, we could move this patch to after
the virDomainBackup* APIs are introduced. It's been great to help lay
the groundwork though.
> + <dd>This API does not actually capture guest state, rather it
> + makes it possible to track which portions of guest disks have
> + changed between a checkpoint and the current live execution of
> + the guest. However, while it is possible to use this API to
> + create checkpoints in isolation, it is more typical to create
> + a checkpoint as a side-effect of starting a new incremental
> + backup with <code>virDomainBackupBegin()</code>, since a
> + second incremental backup is most useful when using the
> + checkpoint created during the first. <!--See also
> + the <a href="formatcheckpoint.html">XML details</a> used with
> + this command.--></dd>
Moving this patch later in the series would also remove the need to comment this out.
> +
> + <dt>virDomainBackupBegin(), virDomainBackupEnd()</dt>
<code><a
href="html/libvirt-libvirt-domain.html#virDomainBackupBegin">virDomainBackupBegin</a></code>,
<code><a
href="html/libvirt-libvirt-domain.html#virDomainBackupEnd">virDomainBackupEnd</a></code>
> + <dd>This API wraps approaches for capturing the state of disks
> + of a running guest, but does not track accompanying guest
> + memory state. The capture is consistent to the start of the
> + operation, where the captured state is stored independently
> + from the disk image in use with the guest and where it can be
> + easily integrated with a third-party for capturing the disk
> + state. Since the backup operation is stored externally from
> + the guest resources, there is no need to commit data back in
> + at the completion of the operation. When coupled with
> + checkpoints, this can be used to capture incremental backups
> + instead of full.</dd>
> + </dl>
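As an aside on the virDomainFSFreeze/virDomainFSThaw entry above: since
some guests limit how long I/O may stay frozen, it seems worth making sure
the thaw always happens, even when the capture step fails. A minimal
sketch of that pairing follows, assuming the Python bindings (where the
calls appear as the fsFreeze() and fsThaw() methods of a domain object).
The helper name quiesced() is my own invention for illustration and is
not part of libvirt.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(dom):
    """Keep guest filesystems frozen only for the duration of the with-block.

    `dom` is assumed to behave like a libvirt-python virDomain object;
    freezing requires a guest agent running inside a trusted guest.
    """
    dom.fsFreeze()
    try:
        yield dom
    finally:
        # Always thaw, even if the capture step inside the block raised,
        # so the guest is not left with frozen I/O.
        dom.fsThaw()
```

A caller would wrap only the short capture step in "with quiesced(dom):"
and do the lengthy copying afterwards, outside the block.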
> +
> + <h2><a id="examples">Examples</a></h2>
> + <p>The following two sequences both accomplish the task of
> + capturing the disk state of a running guest, then wrapping
> + things up so that the guest is still running with the same file
> + as its disk image as before the sequence of operations began.
> + The difference between the two sequences boils down to the
> + impact of an unexpected interruption made at any point in the
> + middle of the sequence: with such an interruption, the first
> + example leaves the guest tied to a temporary wrapper file rather
> + than the original disk, and requires manual clean up of the
> + domain definition; while the second example has no impact to the
> + domain definition.</p>
> +
> + <p>1. Backup via temporary snapshot
> + <pre>
> +virDomainFSFreeze()
> +virDomainSnapshotCreateXML(VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
> +virDomainFSThaw()
> +third-party copy the backing file to backup storage # most time spent here
> +virDomainBlockCommit(VIR_DOMAIN_BLOCK_COMMIT_ACTIVE) per disk
> +wait for commit ready event per disk
> +virDomainBlockJobAbort() per disk
> + </pre></p>
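For readers who want something closer to real code, sequence 1 could be
sketched with the Python bindings roughly as below. This is only an
illustration under several assumptions: the three flag constants are
defined locally to mirror the libvirt values (real code should take them
from the libvirt module), copy_backing_file and wait_until_ready are
hypothetical callbacks supplied by the application (the latter standing in
for the per-disk "commit ready" event), and I pass the pivot flag to
blockJobAbort() on the assumption that the goal is to switch the guest
back to its original image once the active commit is ready.

```python
# Local mirrors of libvirt flag values; prefer the libvirt module's constants.
SNAPSHOT_CREATE_DISK_ONLY = 1 << 4  # VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY
BLOCK_COMMIT_ACTIVE = 1 << 2        # VIR_DOMAIN_BLOCK_COMMIT_ACTIVE
BLOCK_JOB_ABORT_PIVOT = 1 << 1      # VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT


def backup_via_temporary_snapshot(dom, snapshot_xml, disks,
                                  copy_backing_file, wait_until_ready):
    """Sketch of 'Backup via temporary snapshot' for a running guest.

    dom               -- object behaving like a libvirt-python virDomain
    snapshot_xml      -- snapshot XML describing the external overlays
    disks             -- disk target names, e.g. ["vda"]
    copy_backing_file -- hypothetical callback: third-party copy of one disk
    wait_until_ready  -- hypothetical callback: block until commit is ready
    """
    # Keep the freeze window as short as possible: only the snapshot itself.
    dom.fsFreeze()
    try:
        dom.snapshotCreateXML(snapshot_xml, SNAPSHOT_CREATE_DISK_ONLY)
    finally:
        dom.fsThaw()

    # The original images are now read-only backing files; most time is
    # spent here while the guest keeps writing to the temporary overlays.
    for disk in disks:
        copy_backing_file(disk)

    # Start an active commit of each overlay back into its original image.
    for disk in disks:
        dom.blockCommit(disk, None, None, 0, BLOCK_COMMIT_ACTIVE)

    # Once a job is ready, end it so the guest uses the original image again.
    for disk in disks:
        wait_until_ready(disk)
        dom.blockJobAbort(disk, BLOCK_JOB_ABORT_PIVOT)
```

An interruption anywhere after the snapshot call leaves the guest running
on the overlay, which is exactly the cleanup burden the paragraph above
warns about.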
> +
> + <p>2. Direct backup
> + <pre>
> +virDomainFSFreeze()
> +virDomainBackupBegin()
> +virDomainFSThaw()
> +wait for push mode event, or pull data over NBD # most time spent here
> +virDomainBackeupEnd()
s/virDomainBackeupEnd/virDomainBackupEnd/
Reviewed-by: John Ferlan <jfer...@redhat.com>
John
> + </pre></p>
> +
> + </body>
> +</html>
> diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in
> index c60b4fb7c9..9ee355198f 100644
> --- a/docs/formatsnapshot.html.in
> +++ b/docs/formatsnapshot.html.in
> @@ -9,6 +9,8 @@
> <h2><a id="SnapshotAttributes">Snapshot XML</a></h2>
>
> <p>
> + Snapshots are one form
> + of <a href="domainstatecapture.html">domain state capture</a>.
> There are several types of snapshots:
> </p>
> <dl>
>
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list