Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview

2015-01-13 Thread Chun Yan Liu


 On 1/12/2015 at 09:54 PM, in message 1421070890.26317.69.ca...@citrix.com,
Ian Campbell ian.campb...@citrix.com wrote: 
 On Mon, 2015-01-12 at 00:01 -0700, Chun Yan Liu wrote: 
   
   On 1/8/2015 at 08:26 PM, in message
   1420719995.19787.62.ca...@citrix.com,
   Ian Campbell ian.campb...@citrix.com wrote:
   On Mon, 2014-12-22 at 20:42 -0700, Chun Yan Liu wrote:  
  
 On 12/19/2014 at 06:25 PM, in message   
   1418984720.20028.15.ca...@citrix.com,  
Ian Campbell ian.campb...@citrix.com wrote:   
 On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote:   
 
   On 12/18/2014 at 11:10 PM, in message
 1418915443.11882.86.ca...@citrix.com,   
  Ian Campbell ian.campb...@citrix.com wrote:
   On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:
Changes to V8:
  * add an overview document, so that one can have an overall look
    at the whole domain snapshot work, limits, requirements,
    how to do it, etc.

=
  

Domain snapshot overview
   
   I don't see a similar section for disk snapshots, are you not
   considering those here except as a part of a domain snapshot or is
   this an oversight?
   
   There are three main use cases (that I know of at least) for
   snapshotting like behaviour.
   
   One is as you've mentioned below for backup, i.e. to preserve the VM
   at a certain point in time in order to be able to roll back to it. Is
   this the only use case you are considering?

  Yes. I didn't take disk snapshots into scope.
 
   
   A second use case is to support gold image type deployments, i.e.
   where you create one baseline single disk image and then clone it
   multiple times to deploy lots of guests. I think this is usually a
   disk snapshot type thing, but maybe it can be implemented as
   restoring a gold domain snapshot multiple times (e.g. for start of
   day performance reasons).

  As we initially discussed, disk snapshots can be done directly by
  existing tools like qemu-img and vhd-util.

 I was reading this section as a more generic overview of snapshotting,
 without reference to where/how things might ultimately be implemented.

 From a design point of view it would be useful to cover the various use
 cases, even if the solution is that the user implements them using CLI
 tools by hand (xl) or the toolstack does it for them internally
 (libvirt).

 This way we can more clearly see the full picture, which allows us to
 validate that we are making the right choices about what goes where.
  
  
OK. I see. I think this use case is more about how to use the snapshot,
rather than how to implement the snapshot. Right?

   Correct, what the user is actually trying to achieve with the
   functionality.

For 'Gold image' or 'Gold domain', the needed work is more like cloning
disks.

   Yes, or resuming multiple times.

  I see. But IMO it doesn't need a change in snapshot design and
  implementation. Even when resuming multiple times, they couldn't use
  the same image; they would have to duplicate the image multiple times.

 Perhaps, but the use case should be included so that this rationale for
 not worrying about it can be written down (so that people like me don't
 keep asking...)

Got it. Thanks!

  
   
 
   The third case (which is similar to the first) is taking a disk
   snapshot in order to be able to run your usual backup software on
   the snapshot (which is now unchanging, which is handy) and then
   deleting the disk snapshot (this differs from the first case in which
   disk is active after the snapshot, and due to the lack of the memory
   part).

  Sorry, I'm still not quite clear about what this use case wants to do.
   

 The user has an active domain which they want to back up, but backup
 software often does not cope well if the data is changing under its
 feet.

 So the user wants to take a snapshot of the domain's disks while
 leaving the domain running, so they can back up that static version of
 the disk out of band from the VM itself (e.g. by attaching it to a
 separate backup VM).

Got it. So that's simply a disk-only snapshot while the domain is active. As you
 

Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview

2015-01-12 Thread Ian Campbell
On Mon, 2015-01-12 at 00:01 -0700, Chun Yan Liu wrote:
 
  On 1/8/2015 at 08:26 PM, in message 
  1420719995.19787.62.ca...@citrix.com, Ian
 Campbell ian.campb...@citrix.com wrote: 
  On Mon, 2014-12-22 at 20:42 -0700, Chun Yan Liu wrote: 

On 12/19/2014 at 06:25 PM, in message  
  1418984720.20028.15.ca...@citrix.com, 
   Ian Campbell ian.campb...@citrix.com wrote:  
On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote:  
   
  On 12/18/2014 at 11:10 PM, in message   
1418915443.11882.86.ca...@citrix.com,  
 Ian Campbell ian.campb...@citrix.com wrote:   
  On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:   
   Changes to V8:
     * add an overview document, so that one can have an overall look
       at the whole domain snapshot work, limits, requirements,
       how to do it, etc.
  
   =
  
   Domain snapshot overview   
 
  I don't see a similar section for disk snapshots, are you not
  considering those here except as a part of a domain snapshot or is
  this an oversight?
 
  There are three main use cases (that I know of at least) for   
  snapshotting like behaviour.   
 
  One is as you've mentioned below for backup, i.e. to preserve the
  VM at a certain point in time in order to be able to roll back to it.
  Is this the only use case you are considering?

 Yes. I didn't take disk snapshots into scope.
   
 
  A second use case is to support gold image type deployments, i.e.
  where you create one baseline single disk image and then clone it
  multiple times to deploy lots of guests. I think this is usually a
  disk snapshot type thing, but maybe it can be implemented as
  restoring a gold domain snapshot multiple times (e.g. for start of
  day performance reasons).

 As we initially discussed, disk snapshots can be done directly by
 existing tools like qemu-img and vhd-util.
  
I was reading this section as a more generic overview of snapshotting,  
without reference to where/how things might ultimately be implemented.  
  
From a design point of view it would be useful to cover the various use
cases, even if the solution is that the user implements them using CLI
tools by hand (xl) or the toolstack does it for them internally
(libvirt).
  
This way we can more clearly see the full picture, which allows us to  
validate that we are making the right choices about what goes where.  

   OK. I see. I think this use case is more about how to use the snapshot,
   rather than how to implement the snapshot. Right?

  Correct, what the user is actually trying to achieve with the
  functionality.

   For 'Gold image' or 'Gold domain', the needed work is more like cloning
   disks.

  Yes, or resuming multiple times.

 I see. But IMO it doesn't need a change in snapshot design and implementation.
 Even when resuming multiple times, they couldn't use the same image; they
 would have to duplicate the image multiple times.

Perhaps, but the use case should be included so that this rationale for
not worrying about it can be written down (so that people like me don't
keep asking...) 

 
   
  The third case (which is similar to the first) is taking a disk
  snapshot in order to be able to run your usual backup software on
  the snapshot (which is now unchanging, which is handy) and then
  deleting the disk snapshot (this differs from the first case in which
  disk is active after the snapshot, and due to the lack of the memory
  part).

 Sorry, I'm still not quite clear about what this use case wants to do.
  
The user has an active domain which they want to back up, but backup
software often does not cope well if the data is changing under its
feet.

So the user wants to take a snapshot of the domain's disks while
leaving the domain running, so they can back up that static version of
the disk out of band from the VM itself (e.g. by attaching it to a
separate backup VM).

   Got it. So that's simply a disk-only snapshot while the domain is active.
   As you mentioned below, that needs a guest agent to quiesce the disks.
   But currently the Xen hypervisor can't support that, right?
   
  I don't think that's relevant right now, let me explain: 
   
  I think it's important to consider all the use cases for snapshotting, 
  not because I think they need to be implemented now but to make sure 
  that we don't make any design decisions now which would make it 
  *impossible* to implement it in the future (at least 

Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview

2014-12-22 Thread Chun Yan Liu


 On 12/19/2014 at 06:25 PM, in message 
 1418984720.20028.15.ca...@citrix.com,
Ian Campbell ian.campb...@citrix.com wrote: 
 On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote: 
   
   On 12/18/2014 at 11:10 PM, in message  
 1418915443.11882.86.ca...@citrix.com, 
  Ian Campbell ian.campb...@citrix.com wrote:  
   On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:  
Changes to V8:
  * add an overview document, so that one can have an overall look
    at the whole domain snapshot work, limits, requirements,
    how to do it, etc.
  
=  
Domain snapshot overview  
 
   I don't see a similar section for disk snapshots, are you not  
   considering those here except as a part of a domain snapshot or is this  
   an oversight?  
 
   There are three main use cases (that I know of at least) for  
   snapshotting like behaviour.  
 
   One is as you've mentioned below for backup, i.e. to preserve the VM  
   at a certain point in time in order to be able to roll back to it. Is  
   this the only use case you are considering?

  Yes. I didn't take disk snapshots into scope.
   
 
   A second use case is to support gold image type deployments, i.e.  
   where you create one baseline single disk image and then clone it  
   multiple times to deploy lots of guests. I think this is usually a disk  
   snapshot type thing, but maybe it can be implemented as restoring a  
   gold domain snapshot multiple times (e.g. for start of day performance  
   reasons).  
   
  As we initially discussed, disk snapshots can be done directly by
  existing tools like qemu-img and vhd-util.
  
 I was reading this section as a more generic overview of snapshotting, 
 without reference to where/how things might ultimately be implemented. 
  
 From a design point of view it would be useful to cover the various use 
 cases, even if the solution is that the user implements them using CLI 
 tools by hand (xl) or the toolstack does it for them internally 
 (libvirt). 
  
 This way we can more clearly see the full picture, which allows us to 
 validate that we are making the right choices about what goes where. 

OK. I see. I think this use case is more about how to use the snapshot, rather
than how to implement the snapshot. Right?
For 'Gold image' or 'Gold domain', the needed work is more like cloning disks.

  
   The third case (which is similar to the first) is taking a disk
   snapshot in order to be able to run your usual backup software on the
   snapshot (which is now unchanging, which is handy) and then deleting the
   disk snapshot (this differs from the first case in which disk is active
   after the snapshot, and due to the lack of the memory part).

  Sorry, I'm still not quite clear about what this use case wants to do.
  
 The user has an active domain which they want to back up, but backup
 software often does not cope well if the data is changing under its
 feet.

 So the user wants to take a snapshot of the domain's disks while leaving
 the domain running, so they can back up that static version of the disk
 out of band from the VM itself (e.g. by attaching it to a separate
 backup VM).

Got it. So that's simply a disk-only snapshot while the domain is active. As you
mentioned below, that needs a guest agent to quiesce the disks. But currently
the Xen hypervisor can't support that, right?

  
 This may require a guest agent to quiesce the disks. 
  
 
* ability to parse user config file  
  
  [2] Disk snapshot requirements:  
  - external tools: qemu-img, lvcreate, vhd-util, etc.  
  - for basic goal, we support 'raw' and 'qcow2' backend types  
only. Then it requires:  
libxl qmp command or qemu-img (when qemu process does not  
exist)  
  
  
3. Interaction with other operations:  
  
No.  
 
   What about shutdown/dying as you noted above? What about migration or  
   regular save/restore?  
   
  Since xl now has no idea of the existence of snapshot, 
  
 what about libvirt? This section is an overview, so making toolstack 
 specific assumptions is confusing.

Understood. I think most questions here are about a general overview vs. an
xl-specific view. What I provided is xl-specific; what you suggested is a
general overview. I'll update.


   so when writing this
  document I chose to depend on users to delete snapshots before or after
  deleting a domain (like shutdown, destroy, save, migrate away). The user should
  know where memory is saved, and the disk snapshot related info.

 What I meant was what happens if you try to snapshot a domain while it
 is being shut down or being migrated?

Ah, I see. I should add words here. As described above, snapshot is not supported
when the domain is being shut down or dying.

Thanks very much for your precious time before holiday.
Merry Christmas! 


Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview

2014-12-19 Thread Ian Campbell
On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote:
 
  On 12/18/2014 at 11:10 PM, in message 
  1418915443.11882.86.ca...@citrix.com,
 Ian Campbell ian.campb...@citrix.com wrote: 
  On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote: 
   Changes to V8:
     * add an overview document, so that one can have an overall look
       at the whole domain snapshot work, limits, requirements,
       how to do it, etc.

   = 
   Domain snapshot overview 
   
  I don't see a similar section for disk snapshots, are you not 
  considering those here except as a part of a domain snapshot or is this 
  an oversight? 
   
  There are three main use cases (that I know of at least) for 
  snapshotting like behaviour. 
   
  One is as you've mentioned below for backup, i.e. to preserve the VM 
  at a certain point in time in order to be able to roll back to it. Is 
  this the only use case you are considering?

 Yes. I didn't take disk snapshots into scope.
 
   
  A second use case is to support gold image type deployments, i.e. 
  where you create one baseline single disk image and then clone it 
  multiple times to deploy lots of guests. I think this is usually a disk 
  snapshot type thing, but maybe it can be implemented as restoring a 
  gold domain snapshot multiple times (e.g. for start of day performance 
  reasons). 
 
 As we initially discussed, disk snapshots can be done directly by
 existing tools like qemu-img and vhd-util.

I was reading this section as a more generic overview of snapshotting,
without reference to where/how things might ultimately be implemented.

From a design point of view it would be useful to cover the various use
cases, even if the solution is that the user implements them using CLI
tools by hand (xl) or the toolstack does it for them internally
(libvirt).

This way we can more clearly see the full picture, which allows us to
validate that we are making the right choices about what goes where.

  The third case (which is similar to the first) is taking a disk
  snapshot in order to be able to run your usual backup software on the
  snapshot (which is now unchanging, which is handy) and then deleting the
  disk snapshot (this differs from the first case in which disk is active
  after the snapshot, and due to the lack of the memory part).

 Sorry, I'm still not quite clear about what this use case wants to do.

The user has an active domain which they want to back up, but backup
software often does not cope well if the data is changing under its
feet.

So the user wants to take a snapshot of the domain's disks while leaving
the domain running, so they can back up that static version of the disk
out of band from the VM itself (e.g. by attaching it to a separate
backup VM).

This may require a guest agent to quiesce the disks.

   
   * ability to parse user config file 

 [2] Disk snapshot requirements: 
 - external tools: qemu-img, lvcreate, vhd-util, etc. 
 - for basic goal, we support 'raw' and 'qcow2' backend types 
   only. Then it requires: 
   libxl qmp command or qemu-img (when qemu process does not 
   exist) 


   3. Interaction with other operations: 

   No. 
   
  What about shutdown/dying as you noted above? What about migration or 
  regular save/restore? 
 
 Since xl now has no idea of the existence of snapshot,

what about libvirt? This section is an overview, so making toolstack
specific assumptions is confusing.

  so when writing this
 document I chose to depend on users to delete snapshots before or after
 deleting a domain (like shutdown, destroy, save, migrate away). The user should
 know where memory is saved, and the disk snapshot related info.

What I meant was what happens if you try to snapshot a domain while it
is being shut down or being migrated? There clearly has to be some sort
of interaction, even if it is just that there is a global toolstack lock
or the user is advised not to do this.
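
One way to picture the "global toolstack lock" idea is the minimal sketch below. It is an illustration only, not anything taken from xl, libxl or libvirt: the class name, the operation names and the in-process dictionary are all assumptions, and a real toolstack would need a lock that works across processes.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator


class DomainBusy(RuntimeError):
    """Raised when another operation already holds the domain."""


class DomainOperationLock:
    """Hypothetical per-domain guard: one long-running operation at a time."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._busy: Dict[str, str] = {}  # domain name -> operation in progress

    @contextmanager
    def hold(self, domain: str, operation: str) -> Iterator[None]:
        with self._guard:
            current = self._busy.get(domain)
            if current is not None:
                # e.g. refuse "snapshot" while "migrate" or "shutdown" runs
                raise DomainBusy(
                    f"{domain}: cannot {operation} while {current} is in progress"
                )
            self._busy[domain] = operation
        try:
            yield
        finally:
            with self._guard:
                del self._busy[domain]


if __name__ == "__main__":
    locks = DomainOperationLock()
    with locks.hold("guest1", "migrate"):
        try:
            with locks.hold("guest1", "snapshot"):
                pass
        except DomainBusy as err:
            print(err)
```

The alternative mentioned above, simply advising the user not to snapshot during shutdown or migration, needs no code but gives no protection either.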

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview

2014-12-18 Thread Wei Liu
On Wed, Dec 17, 2014 at 08:34:07PM -0700, Chun Yan Liu wrote:
 
 
  On 12/17/2014 at 08:17 PM, in message
 20141217121750.gf1...@zion.uk.xensource.com, Wei Liu wei.l...@citrix.com
 wrote: 
  On Tue, Dec 16, 2014 at 02:32:55PM +0800, Chunyan Liu wrote: 
   Changes to V8:
     * add an overview document, so that one can have an overall look
       at the whole domain snapshot work, limits, requirements,
       how to do it, etc.

   = 
   Domain snapshot overview 

   1. Purpose 

   Domain snapshot is a system checkpoint of a domain. Later, one can
   roll back the domain to that checkpoint. It's a very useful backup
   function. A domain snapshot contains the memory status at the
   checkpoint and the disk status (which we call a disk snapshot).

   Domain snapshot functionality usually includes:
   a) create a domain snapshot
   b) roll back (also called revert) to a domain snapshot
   c) delete a domain snapshot 
   d) list all domain snapshots 

   But following the existing xl idioms of managing storage and saved 
   VM images via existing CLI command (qemu-img, lvcreate, ls, mv, 
   cp etc), xl snapshot functionality would be kept as simple as 
   possible: 
   * xl will do a) and b), creating a snapshot and reverting a 
 domain to a snapshot. 
   * xl will NOT do c) and d): xl won't manage snapshots, as xl
     doesn't maintain saved images created by 'xl save'. So xl
     will have no idea of the existence of domain snapshots and
     the chain relationship between snapshots. It will depend on
     the user to take care of the snapshots, know the snapshot chain
     info, and delete snapshots.

   Domain Snapshot Support and Not Support: 
   
  I think this list applies to xl (last item and [1]). If so please state 
  clearly to prevent confusion with other toolstack (say, libvirt) and 
  functionalities of the library (libxl). 
   
   * support live snapshot 
   * support internal disk snapshot and external disk snapshot 
   * support different disk backend types. 
 (Basic goal is to support 'raw' and 'qcow2' only). 

   * not support snapshot when domain is shutting down or dying.
   * not support disk-only snapshot [1].

   [1] xl only concerns itself with active domains, and even when a
   domain is paused, there is no operation to flush data to disk. So
   if one takes a disk-only snapshot and then resumes, it is as if
   the guest had crashed. For this reason, a disk-only snapshot is
   meaningless to xl and should not be supported.

   
  I think I understand your reasoning, but it's a bit convoluted to me. 
   
  Domain can be in both active and inactive state (libvirt term) when 
  using xl.  When domain is active, we cannot guarantee in xl that domain 
  is quiesced so a disk-only snapshot may contain inconsistent data.
 
 That's right.
 
  When 
  domain is inactive, there's no point in taking a disk-only snapshot 
  because it would be the same as the base image.
 
 xl doesn't have inactive domains. Libvirt has. (In libvirt, one can 'define'
 a domain but not 'start' it, like old xend, which can 'new' a domain but not
 'start' it.) xl can only 'create' a domain; when the domain is shut down, it's
 not visible to the user.
 

Per the definition in the first patch, an inactive domain is a domain
created but not started, so I thought the domain created by xl create
-p dom.cfg falls into this category. I was wrong.

I think "created but not started" should be "defined but not
started" (using libvirt's terminology).

 For an inactive domain, a disk-only snapshot is useful, since later the user
 may run the VM with the base image and the base image would change. Then the
 disk-only snapshot is a usable backup.

 That's why libvirt can support disk-only snapshots, while xl won't support
 disk-only snapshots. Do I describe it clearly?


Yes. I think the libvirt terminology is "defined", not "created".

http://wiki.libvirt.org/page/VM_lifecycle

Wei.

  So the conclusion is 
  that xl doesn't need to support disk-only snapshot. 
   
  Does the above reasoning match yours? Is it clearer or more
  confusing?
   
  Wei. 
   

   2. Requirements 

   General Requirements: 
   * ability to save/restore domain memory 
   * ability to create/delete/apply disk snapshot [2] 
   * ability to parse user config file 

 [2] Disk snapshot requirements: 
 - external tools: qemu-img, lvcreate, vhd-util, etc. 
 - for basic goal, we support 'raw' and 'qcow2' backend types 
   only. Then it requires: 
   libxl qmp command or qemu-img (when qemu process does not 
   exist) 


   3. Interaction with other operations: 

   No. 


   4. General workflow 

   Create a snapshot: 
 * parse user cfg file if passed in 
 * check snapshot operation is allowed or not 
 * save domain, saving memory status to file (refer to: save_domain) 
 * take disk snapshot (e.g. call qmp command) 
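
The create steps listed above can be sketched as follows. This is only an outline of the ordering, under stated assumptions: the four callables, the `SnapshotSteps` container and the "memory" config key are hypothetical stand-ins, not libxl or xl functions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SnapshotSteps:
    """Hypothetical hooks; a real toolstack would supply its own."""
    parse_config: Callable[[str], dict]         # parse user cfg file
    is_allowed: Callable[[str], bool]           # e.g. False if shutting down
    save_memory: Callable[[str, str], None]     # domain, target file
    snapshot_disks: Callable[[str, str], None]  # domain, snapshot name


def create_snapshot(domain: str, name: str, steps: SnapshotSteps,
                    cfg_path: Optional[str] = None) -> dict:
    # 1. parse user cfg file if passed in
    cfg = steps.parse_config(cfg_path) if cfg_path else {}
    # 2. check snapshot operation is allowed or not
    if not steps.is_allowed(domain):
        raise RuntimeError(f"snapshot of {domain} is not allowed right now")
    # 3. save domain, saving memory status to file
    memory_file = cfg.get("memory", f"{name}.save")
    steps.save_memory(domain, memory_file)
    # 4. take disk snapshot
    steps.snapshot_disks(domain, name)
    return {"name": name, "memory": memory_file}


if __name__ == "__main__":
    log = []
    steps = SnapshotSteps(
        parse_config=lambda path: {},
        is_allowed=lambda dom: True,
        save_memory=lambda dom, f: log.append(("memory", dom, f)),
        snapshot_disks=lambda dom, n: log.append(("disks", dom, n)),
    )
    print(create_snapshot("guest1", "snap1", steps), log)
```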
 

Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview

2014-12-18 Thread Ian Campbell
On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:
 Changes to V8:
   * add an overview document, so that one can have an overall look
     at the whole domain snapshot work, limits, requirements,
     how to do it, etc.
 
 =
 Domain snapshot overview

I don't see a similar section for disk snapshots, are you not
considering those here except as a part of a domain snapshot or is this
an oversight?

There are three main use cases (that I know of at least) for
snapshotting like behaviour.

One is as you've mentioned below for backup, i.e. to preserve the VM
at a certain point in time in order to be able to roll back to it. Is
this the only use case you are considering?

A second use case is to support gold image type deployments, i.e.
where you create one baseline single disk image and then clone it
multiple times to deploy lots of guests. I think this is usually a disk
snapshot type thing, but maybe it can be implemented as restoring a
gold domain snapshot multiple times (e.g. for start of day performance
reasons).
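
As a rough illustration of the disk side of the gold image case, the sketch below builds `qemu-img create` command lines for copy-on-write clones of one base image. It assumes qcow2 overlays; the path and the clone naming scheme are made up for the example, and the commands are only constructed, not run.

```python
from pathlib import PurePosixPath
from typing import List


def clone_commands(base_image: str, count: int,
                   base_fmt: str = "qcow2") -> List[List[str]]:
    """Build one `qemu-img create` command per clone of base_image."""
    base = PurePosixPath(base_image)
    commands = []
    for i in range(1, count + 1):
        # Hypothetical naming scheme: gold.qcow2 -> gold-clone1.qcow2, ...
        clone = base.with_name(f"{base.stem}-clone{i}.qcow2")
        commands.append([
            "qemu-img", "create",
            "-f", "qcow2",     # format of the new overlay image
            "-b", str(base),   # backing (gold) image; must not change later
            "-F", base_fmt,    # format of the backing image
            str(clone),
        ])
    return commands


if __name__ == "__main__":
    for cmd in clone_commands("/var/lib/xen/images/gold.qcow2", 3):
        print(" ".join(cmd))
```

Each guest then gets its own overlay, so the gold image itself is never written to.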

The third case (which is similar to the first) is taking a disk
snapshot in order to be able to run your usual backup software on the
snapshot (which is now unchanging, which is handy) and then deleting the
disk snapshot (this differs from the first case in which disk is active
after the snapshot, and due to the lack of the memory part).
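
The shape of this third case (snapshot, back up, delete) can be sketched as a context manager. The `create` and `delete` callables are placeholders for whatever the toolstack or user would really call; nothing here is an existing xl or libxl interface.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def temporary_disk_snapshot(disk: str,
                            create: Callable[[str], str],
                            delete: Callable[[str], None]) -> Iterator[str]:
    """Take a disk snapshot, hand it to the caller, always delete it after."""
    snapshot = create(disk)  # the domain keeps running on the live disk
    try:
        yield snapshot       # backup software reads the unchanging snapshot
    finally:
        delete(snapshot)     # removed even if the backup fails


if __name__ == "__main__":
    events = []
    with temporary_disk_snapshot(
        "guest1-disk0",
        create=lambda d: events.append(f"create {d}") or f"{d}@backup",
        delete=lambda s: events.append(f"delete {s}"),
    ) as snap:
        events.append(f"backup {snap}")
    print(events)
```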

Are you considering all three use cases here or are you explicitly
ruling out anything but the first? I think there might be some subtle
differences in the requirements, wrt which operations need to consider
the possibility of an active domain etc, depending on which cases are
considered. It would be good to be explicit about the use cases you are
not trying to address here so we are all on the same page.

If you are ruling these other use cases out then I think it would be
useful to briefly describe them and then note that they are out of scope
for this design, so that we have an agreed understanding of what is in
or out of scope and/or can debate to what extent such use cases ought to
be considered in the design if not the implementation.

 1. Purpose
 
 Domain snapshot is a system checkpoint of a domain. Later, one can
 roll back the domain to that checkpoint. It's a very useful backup
 function. A domain snapshot contains the memory status at the
 checkpoint and the disk status (which we call a disk snapshot).


 Domain snapshot functionality usually includes:
 a) create a domain snapshot
 b) roll back (also called revert) to a domain snapshot
 c) delete a domain snapshot
 d) list all domain snapshots
 
 But following the existing xl idioms of managing storage and saved
 VM images via existing CLI command (qemu-img, lvcreate, ls, mv,
 cp etc), xl snapshot functionality would be kept as simple as
 possible:
 * xl will do a) and b), creating a snapshot and reverting a
   domain to a snapshot.
 * xl will NOT do c) and d): xl won't manage snapshots, as xl
   doesn't maintain saved images created by 'xl save'. So xl
   will have no idea of the existence of domain snapshots and
   the chain relationship between snapshots. It will depend on
   the user to take care of the snapshots, know the snapshot chain
   info, and delete snapshots.
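
To make "the user takes care of the snapshots with existing CLI commands" concrete, here is a small sketch that builds `qemu-img snapshot` command lines for internal qcow2 snapshots. The helper name and the action names are invented for the example. The commands are only constructed, and they are meant for an image that no qemu process currently has open.

```python
from typing import List, Optional

# qemu-img snapshot flags: -c create, -a apply (revert), -d delete, -l list
_FLAGS = {"create": "-c", "apply": "-a", "delete": "-d"}


def internal_snapshot_command(action: str, image: str,
                              name: Optional[str] = None) -> List[str]:
    """Build a `qemu-img snapshot` command for one internal snapshot action."""
    if action == "list":
        return ["qemu-img", "snapshot", "-l", image]
    if action not in _FLAGS:
        raise ValueError(f"unknown action: {action}")
    if not name:
        raise ValueError(f"action '{action}' needs a snapshot name")
    return ["qemu-img", "snapshot", _FLAGS[action], name, image]


if __name__ == "__main__":
    image = "/var/lib/xen/images/guest1.qcow2"  # example path only
    for action, name in [("create", "snap1"), ("list", None),
                         ("apply", "snap1"), ("delete", "snap1")]:
        print(" ".join(internal_snapshot_command(action, image, name)))
```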

This is a case where the usecases being considered might apply. If the
third case I outlined above is in scope then xl may need to somehow
support deleting a snapshot from under the feet of an active domain etc
(which need not necessarily imply knowledge of snapshot chains or
snapshot management, but might involve a notification to the backend for
example).

 Domain Snapshot Support and Not Support:
 * support live snapshot
 * support internal disk snapshot and external disk snapshot
 * support different disk backend types.
   (Basic goal is to support 'raw' and 'qcow2' only).
 
 * not support snapshot when domain is shutting down or dying.
 * not support disk-only snapshot [1].

  [1] xl only concerns itself with active domains, and even when a
  domain is paused, there is no operation to flush data to disk. So
  if one takes a disk-only snapshot and then resumes, it is as if
  the guest had crashed. For this reason, a disk-only snapshot is
  meaningless to xl and should not be supported.
 
 
 2. Requirements
 
 General Requirements:
 * ability to save/restore domain memory
 * ability to create/delete/apply disk snapshot [2]

Is apply the same as revert to? Worth adding to the terminology
section and using consistently.

 * ability to parse user config file
 
   [2] Disk snapshot requirements:
   - external tools: qemu-img, lvcreate, vhd-util, etc.
   - for basic goal, we support 'raw' and 'qcow2' backend types
 only. Then it requires:
 libxl qmp command or qemu-img (when qemu process does not
 exist)
 
 
 3. Interaction with other operations:
 
 No.

What about shutdown/dying as you noted above? What about migration or
regular save/restore?

 
 4. General workflow
 
 Create a 

Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview

2014-12-18 Thread Chun Yan Liu


 On 12/18/2014 at 11:10 PM, in message 
 1418915443.11882.86.ca...@citrix.com,
Ian Campbell ian.campb...@citrix.com wrote: 
 On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote: 
  Changes to V8:
    * add an overview document, so that one can have an overall look
      at the whole domain snapshot work, limits, requirements,
      how to do it, etc.
   
  = 
  Domain snapshot overview 
  
 I don't see a similar section for disk snapshots, are you not 
 considering those here except as a part of a domain snapshot or is this 
 an oversight? 
  
 There are three main use cases (that I know of at least) for 
 snapshotting like behaviour. 
  
 One is as you've mentioned below for backup, i.e. to preserve the VM 
 at a certain point in time in order to be able to roll back to it. Is 
 this the only use case you are considering?

Yes. I didn't take disk snapshots into scope.

  
 A second use case is to support gold image type deployments, i.e. 
 where you create one baseline single disk image and then clone it 
 multiple times to deploy lots of guests. I think this is usually a disk 
 snapshot type thing, but maybe it can be implemented as restoring a 
 gold domain snapshot multiple times (e.g. for start of day performance 
 reasons). 

As we initially discussed, disk snapshots can be done directly by
existing tools like qemu-img and vhd-util.

  
 The third case (which is similar to the first) is taking a disk
 snapshot in order to be able to run your usual backup software on the
 snapshot (which is now unchanging, which is handy) and then deleting the
 disk snapshot (this differs from the first case in which disk is active
 after the snapshot, and due to the lack of the memory part).

Sorry, I'm still not quite clear about what this use case wants to do.

  
 Are you considering all three use cases here or are you explicitly 
 ruling out anything but the first? I think there might be some subtle 
 differences in the requirements, wrt which operations need to consider 
 the possibility of an active domain etc, depending on which cases are 
 considered. It would be good to be explicit about the use cases you are 
 not trying to address here so we are all on the same page. 
  
 If you are ruling these other use cases out then I think it would be
 useful to briefly describe them and then note that they are out of scope 
 for this design, so that we have an agreed understanding of what is in 
 or out of scope and/or can debate to what extent such use cases ought to 
 be considered in the design if not the implementation.

OK. I'll add this.

  
  1. Purpose 
   
  Domain snapshot is a system checkpoint of a domain. Later, one can
  roll back the domain to that checkpoint. It's a very useful backup
  function. A domain snapshot contains the memory status at the
  checkpoint and the disk status (which we call a disk snapshot).


  Domain snapshot functionality usually includes:
  a) create a domain snapshot
  b) roll back (also called revert) to a domain snapshot
  c) delete a domain snapshot 
  d) list all domain snapshots 
   
  But following the existing xl idioms of managing storage and saved 
  VM images via existing CLI command (qemu-img, lvcreate, ls, mv, 
  cp etc), xl snapshot functionality would be kept as simple as 
  possible: 
  * xl will do a) and b), creating a snapshot and reverting a 
domain to a snapshot. 
  * xl will NOT do c) and d): xl won't manage snapshots, as xl
    doesn't maintain saved images created by 'xl save'. So xl
    will have no idea of the existence of domain snapshots and
    the chain relationship between snapshots. It will depend on
    the user to take care of the snapshots, know the snapshot chain
    info, and delete snapshots.
  
 This is a case where the usecases being considered might apply. If the 
 third case I outlined above is in scope then xl may need to somehow 
 support deleting a snapshot from under the feet of an active domain etc 
 (which need not necessarily imply knowledge of snapshot chains or 
 snapshot management, but might involve a notification to the backend for 
 example). 
  
  Domain snapshot: what is supported and what is not:
  * support live snapshot
  * support internal disk snapshot and external disk snapshot
  * support different disk backend types.
    (Basic goal is to support 'raw' and 'qcow2' only).

  * no support for snapshot when the domain is shutting down or dying.
  * no support for disk-only snapshot [1].

   [1] xl only deals with active domains, and even when a domain
   is paused, no data is flushed to disk. So if one takes
   a disk-only snapshot and then resumes, it is as if the guest
   had crashed. For this reason, a disk-only snapshot is meaningless
   to xl and should not be supported.
   
   
  2. Requirements 
   
  General Requirements: 
  * ability to save/restore domain memory 
  * ability to 

Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview

2014-12-17 Thread Wei Liu
On Tue, Dec 16, 2014 at 02:32:55PM +0800, Chunyan Liu wrote:
 Changes to V8:
   * add an overview document, so that one can get an overall look
     at the whole domain snapshot work: limits, requirements,
     how to do it, etc.
 
 =
 Domain snapshot overview
 
 1. Purpose
 
 Domain snapshot is a system checkpoint of a domain. Later, one can
 roll back the domain to that checkpoint. It's a very useful backup
 function. A domain snapshot contains the memory status at the
 checkpoint and the disk status (which we call a disk snapshot).
 
 Domain snapshot functionality usually includes:
 a) create a domain snapshot
 b) roll back (also called revert) to a domain snapshot
 c) delete a domain snapshot
 d) list all domain snapshots
 
 But following the existing xl idiom of managing storage and saved
 VM images via existing CLI commands (qemu-img, lvcreate, ls, mv,
 cp, etc.), xl snapshot functionality would be kept as simple as
 possible:
 * xl will do a) and b): creating a snapshot and reverting a
   domain to a snapshot.
 * xl will NOT do c) and d): xl won't manage snapshots, just as xl
   doesn't maintain saved images created by 'xl save'. So xl
   will have no knowledge of which domain snapshots exist or of
   the chain relationship between them. It will be up to the
   user to take care of the snapshots, know the snapshot chain
   info, and delete snapshots.
 
 Domain snapshot: what is supported and what is not:

I think this list applies to xl (last item and [1]). If so, please state
that clearly to prevent confusion with other toolstacks (say, libvirt) and
with the functionality of the library (libxl).

 * support live snapshot
 * support internal disk snapshot and external disk snapshot
 * support different disk backend types.
   (Basic goal is to support 'raw' and 'qcow2' only).
 
 * no support for snapshot when the domain is shutting down or dying.
 * no support for disk-only snapshot [1].
 
  [1] xl only deals with active domains, and even when a domain
  is paused, no data is flushed to disk. So if one takes
  a disk-only snapshot and then resumes, it is as if the guest
  had crashed. For this reason, a disk-only snapshot is meaningless
  to xl and should not be supported.
 

I think I understand your reasoning, but it's a bit convoluted to me.

A domain can be in either an active or an inactive state (libvirt terms)
when using xl. When the domain is active, we cannot guarantee in xl that
the domain is quiesced, so a disk-only snapshot may contain inconsistent
data. When the domain is inactive, there's no point in taking a disk-only
snapshot because it would be the same as the base image. So the conclusion
is that xl doesn't need to support disk-only snapshots.

Does the above reasoning match yours? Is it clearer or more
confusing?

Wei.

 
 2. Requirements
 
 General Requirements:
 * ability to save/restore domain memory
 * ability to create/delete/apply disk snapshot [2]
 * ability to parse user config file
 
   [2] Disk snapshot requirements:
   - external tools: qemu-img, lvcreate, vhd-util, etc.
   - for the basic goal, we support the 'raw' and 'qcow2' backend
     types only. This requires:
     a libxl qmp command, or qemu-img (when no qemu process
     exists)
 
 
 3. Interaction with other operations:
 
 None.
 
 
 4. General workflow
 
 Create a snapshot:
   * parse user cfg file if passed in
   * check whether the snapshot operation is allowed
   * save domain, saving memory status to file (refer to: save_domain)
   * take disk snapshot (e.g. call qmp command)
   * unpause domain
 
 Revert to snapshot:
   * parse user cfg file (xl doesn't manage snapshots, so it has no
     knowledge of which snapshots exist; the user MUST supply a
     configuration file)
   * destroy this domain
   * create a new domain from snapshot info
     - apply disk snapshot (e.g. call qemu-img)
     - a process like restore domain

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview

2014-12-17 Thread Chun Yan Liu


 On 12/17/2014 at 08:17 PM, in message
20141217121750.gf1...@zion.uk.xensource.com, Wei Liu wei.l...@citrix.com
wrote: 
 On Tue, Dec 16, 2014 at 02:32:55PM +0800, Chunyan Liu wrote: 
  Changes to V8:
    * add an overview document, so that one can get an overall look
      at the whole domain snapshot work: limits, requirements,
      how to do it, etc.
   
  = 
  Domain snapshot overview 
   
  1. Purpose 
   
  Domain snapshot is a system checkpoint of a domain. Later, one can
  roll back the domain to that checkpoint. It's a very useful backup
  function. A domain snapshot contains the memory status at the
  checkpoint and the disk status (which we call a disk snapshot).

  Domain snapshot functionality usually includes:
  a) create a domain snapshot
  b) roll back (also called revert) to a domain snapshot
  c) delete a domain snapshot
  d) list all domain snapshots
   
  But following the existing xl idiom of managing storage and saved
  VM images via existing CLI commands (qemu-img, lvcreate, ls, mv,
  cp, etc.), xl snapshot functionality would be kept as simple as
  possible:
  * xl will do a) and b): creating a snapshot and reverting a
    domain to a snapshot.
  * xl will NOT do c) and d): xl won't manage snapshots, just as xl
    doesn't maintain saved images created by 'xl save'. So xl
    will have no knowledge of which domain snapshots exist or of
    the chain relationship between them. It will be up to the
    user to take care of the snapshots, know the snapshot chain
    info, and delete snapshots.
   
  Domain snapshot: what is supported and what is not:
  
 I think this list applies to xl (last item and [1]). If so, please state
 that clearly to prevent confusion with other toolstacks (say, libvirt) and
 with the functionality of the library (libxl).
  
  * support live snapshot
  * support internal disk snapshot and external disk snapshot
  * support different disk backend types.
    (Basic goal is to support 'raw' and 'qcow2' only).

  * no support for snapshot when the domain is shutting down or dying.
  * no support for disk-only snapshot [1].

   [1] xl only deals with active domains, and even when a domain
   is paused, no data is flushed to disk. So if one takes
   a disk-only snapshot and then resumes, it is as if the guest
   had crashed. For this reason, a disk-only snapshot is meaningless
   to xl and should not be supported.
   
  
 I think I understand your reasoning, but it's a bit convoluted to me. 
  
 A domain can be in either an active or an inactive state (libvirt terms)
 when using xl. When the domain is active, we cannot guarantee in xl that
 the domain is quiesced, so a disk-only snapshot may contain inconsistent data.

That's right.

 When the
 domain is inactive, there's no point in taking a disk-only snapshot
 because it would be the same as the base image.

xl doesn't have inactive domains; libvirt does. (In libvirt, one can 'define'
a domain without starting it, like the old xend, which could 'new' a domain
without 'start'ing it.) xl can only 'create' a domain, and once the domain is
shut down it is no longer visible to the user.

For an inactive domain, a disk-only snapshot is useful: the user may later
run the VM with the base image, and the base image would then change, so the
disk-only snapshot serves as a usable backup.

That's why libvirt can support disk-only snapshots while xl won't.
Have I described it clearly?

 So the conclusion is
 that xl doesn't need to support disk-only snapshots.

 Does the above reasoning match yours? Is it clearer or more
 confusing?
  
 Wei. 
  
   
  2. Requirements 
   
  General Requirements: 
  * ability to save/restore domain memory 
  * ability to create/delete/apply disk snapshot [2] 
  * ability to parse user config file 
   
    [2] Disk snapshot requirements:
    - external tools: qemu-img, lvcreate, vhd-util, etc.
    - for the basic goal, we support the 'raw' and 'qcow2' backend
      types only. This requires:
      a libxl qmp command, or qemu-img (when no qemu process
      exists)
   
   
  3. Interaction with other operations: 
   
  None.
   
   
  4. General workflow 
   
  Create a snapshot:
    * parse user cfg file if passed in
    * check whether the snapshot operation is allowed
    * save domain, saving memory status to file (refer to: save_domain)
    * take disk snapshot (e.g. call qmp command)
    * unpause domain

  Revert to snapshot:
    * parse user cfg file (xl doesn't manage snapshots, so it has no
      knowledge of which snapshots exist; the user MUST supply a
      configuration file)
    * destroy this domain
    * create a new domain from snapshot info
      - apply disk snapshot (e.g. call qemu-img)
      - a process like restore domain
  
  




[Xen-devel] [RFC V9 2/4] domain snapshot overview

2014-12-15 Thread Chunyan Liu
Changes to V8:
  * add an overview document, so that one can get an overall look
    at the whole domain snapshot work: limits, requirements,
    how to do it, etc.

=
Domain snapshot overview

1. Purpose

Domain snapshot is a system checkpoint of a domain. Later, one can
roll back the domain to that checkpoint. It's a very useful backup
function. A domain snapshot contains the memory status at the
checkpoint and the disk status (which we call a disk snapshot).

Domain snapshot functionality usually includes:
a) create a domain snapshot
b) roll back (also called revert) to a domain snapshot
c) delete a domain snapshot
d) list all domain snapshots

But following the existing xl idiom of managing storage and saved
VM images via existing CLI commands (qemu-img, lvcreate, ls, mv,
cp, etc.), xl snapshot functionality would be kept as simple as
possible:
* xl will do a) and b): creating a snapshot and reverting a
  domain to a snapshot.
* xl will NOT do c) and d): xl won't manage snapshots, just as xl
  doesn't maintain saved images created by 'xl save'. So xl
  will have no knowledge of which domain snapshots exist or of
  the chain relationship between them. It will be up to the
  user to take care of the snapshots, know the snapshot chain
  info, and delete snapshots.

Domain snapshot: what is supported and what is not:
* support live snapshot
* support internal disk snapshot and external disk snapshot
* support different disk backend types.
  (Basic goal is to support 'raw' and 'qcow2' only).

* no support for snapshot when the domain is shutting down or dying.
* no support for disk-only snapshot [1].

 [1] xl only deals with active domains, and even when a domain
 is paused, no data is flushed to disk. So if one takes
 a disk-only snapshot and then resumes, it is as if the guest
 had crashed. For this reason, a disk-only snapshot is meaningless
 to xl and should not be supported.
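
The support list above can be read as a small admission check that runs
before a snapshot starts. A minimal Python sketch follows. The state names
('running', 'paused', 'shutting-down', 'dying') and the helper
check_snapshot_allowed() are made up for illustration and are not existing
xl or libxl names.

```python
from typing import List, Tuple

# Disk backend formats covered by the basic goal in the text above.
BASIC_FORMATS = {"raw", "qcow2"}

# Hypothetical state names, used only for this illustration.
REFUSED_STATES = {"shutting-down", "dying"}


def check_snapshot_allowed(state: str, disk_formats: List[str],
                           disk_only: bool = False) -> Tuple[bool, str]:
    """Return (allowed, reason) following the support list above."""
    if disk_only:
        # Nothing is flushed to disk, so the result would look like a crash.
        return False, "disk-only snapshot is not supported"
    if state in REFUSED_STATES:
        return False, "domain is " + state
    unsupported = sorted(set(disk_formats) - BASIC_FORMATS)
    if unsupported:
        return False, "unsupported disk format: " + ", ".join(unsupported)
    return True, "ok"
```

A caller would refuse the whole operation as soon as the first element of
the returned pair is False, before any memory is saved.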


2. Requirements

General Requirements:
* ability to save/restore domain memory
* ability to create/delete/apply disk snapshot [2]
* ability to parse user config file

  [2] Disk snapshot requirements:
  - external tools: qemu-img, lvcreate, vhd-util, etc.
  - for the basic goal, we support the 'raw' and 'qcow2' backend
    types only. This requires:
    a libxl qmp command, or qemu-img (when no qemu process
    exists)
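
For the qemu-img path in [2] (no qemu process running), the commands could
be assembled roughly as below. This is only a sketch, not the proposed libxl
code: the helper names are hypothetical, the functions build argument lists
without executing them, and they assume the standard qemu-img options
'snapshot -c/-a/-d' and 'create -f/-b/-F'. Internal snapshots are assumed to
live inside a qcow2 image.

```python
from typing import List

# qemu-img "snapshot" sub-command flags: create, apply (revert), delete.
_SNAPSHOT_FLAGS = {"create": "-c", "apply": "-a", "delete": "-d"}


def internal_snapshot_argv(action: str, image: str, name: str) -> List[str]:
    """Build the qemu-img argv for an internal snapshot of a qcow2 image."""
    if action not in _SNAPSHOT_FLAGS:
        raise ValueError("unknown snapshot action: " + action)
    return ["qemu-img", "snapshot", _SNAPSHOT_FLAGS[action], name, image]


def external_snapshot_argv(backing: str, backing_fmt: str,
                           overlay: str) -> List[str]:
    """Build the qemu-img argv for an external snapshot.

    The current image becomes the backing file of a new qcow2 overlay,
    and later writes are meant to go to the overlay.
    """
    return ["qemu-img", "create", "-f", "qcow2",
            "-b", backing, "-F", backing_fmt, overlay]
```

The resulting lists can be handed to subprocess.run() as they are. A 'raw'
disk would only fit the external form here, with the raw image as the
backing file.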


3. Interaction with other operations:

None.


4. General workflow

Create a snapshot:
  * parse user cfg file if passed in
  * check whether the snapshot operation is allowed
  * save domain, saving memory status to file (refer to: save_domain)
  * take disk snapshot (e.g. call qmp command)
  * unpause domain

Revert to snapshot:
  * parse user cfg file (xl doesn't manage snapshots, so it has no
    knowledge of which snapshots exist; the user MUST supply a
    configuration file)
  * destroy this domain
  * create a new domain from snapshot info
    - apply disk snapshot (e.g. call qemu-img)
    - a process like restore domain
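
The ordering of the two workflows above can be sketched as plain step
lists. This is an illustration only: both function names are hypothetical,
the steps are descriptive strings rather than real xl or libxl calls, and
the sketch shows just the ordering and the rule that revert needs a
configuration file.

```python
from typing import List, Optional


def create_snapshot_steps(domain: str, mem_file: str,
                          cfg_file: Optional[str] = None) -> List[str]:
    """List the create-snapshot steps in the order given above."""
    steps = []
    if cfg_file is not None:
        # The cfg file is optional when creating a snapshot.
        steps.append("parse cfg file " + cfg_file)
    steps.append("check that snapshot is allowed for " + domain)
    steps.append("save memory of " + domain + " to " + mem_file)
    steps.append("take disk snapshot")
    steps.append("unpause " + domain)
    return steps


def revert_snapshot_steps(domain: str, cfg_file: str) -> List[str]:
    """List the revert steps; xl keeps no snapshot records, so the
    configuration file is mandatory."""
    if not cfg_file:
        raise ValueError("a snapshot configuration file is required")
    return [
        "parse cfg file " + cfg_file,
        "destroy " + domain,
        "apply disk snapshot",
        "restore " + domain + " from saved memory",
    ]
```

For example, create_snapshot_steps("guest1", "/tmp/guest1.mem") yields four
steps ending with the unpause, while revert_snapshot_steps("guest1", "")
raises ValueError because no configuration file was given.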
