I can comment on the second half.

Through storage operations, storage providers can create backups much faster 
than hypervisors and over time, their snapshots are more efficient than the 
snapshot chains that hypervisors create. It is true that a VM snapshot taken at 
the storage level is slightly different as it would be psuedo-quiesced, not 
have it's memory snapshotted. This is accomplished through hypervisor snapshots:

1) VM snapshot request (lets say VM 'A'
2) Create hypervisor snapshot (optional)
  -VM 'A' is snapshotted, creating active VM 'A*'
  -All disk traffic now goes to VM 'A*' and A is a snapshot of 'A*'
3) Storage driver(s) take snapshots of each volume
4) Undo hypervisor snapshot (optional)
  -VM snapshot 'A' is rolled back into VM 'A*' so the hypervisor snapshot no 
longer exists

Now, a couple notes:
-The reason this is optional is that not all users necessarily care about the 
memory or disk consistency of their VMs and would prefer faster snapshots to 
consistency.
-Preemptively, yes, we are actually taking hypervisor snapshots which means 
there isn't actually a performance of taking storage snapshots when quiescing 
the VM. However, the performance gain will come both during restoring the VM 
and during normal operations as described above.

Although you can think of it as a poor man's VM snapshot, I would think of it 
more as a consistent multi-volume snapshot. Again, the difference being that 
this snapshot was not truly quiesced like a hypervisor snapshot would be.

-- 
Chris Suich
chris.su...@netapp.com
NetApp Software Engineer
Data Center Platforms – Cloud Solutions
Citrix, Cisco & Red Hat

On Oct 8, 2013, at 1:47 PM, Darren Shepherd <darren.s.sheph...@gmail.com> wrote:

> My only comment is that having the return type as boolean and using to
> that indicate quiesce behaviour seems obscure and will probably lead
> to a problem later.  Your basically saying the result of the
> takeVMSnapshot will only ever need to communicate back whether
> unquiesce needs to happen.  Maybe some result object would be more
> extensible.
> 
> Actually, I think I have more comments.  This seems a bit odd to me.
> Why would a storage driver in ACS implement a VM snapshot
> functionality?  VM snapshot is a really a hypervisor orchestrated
> operation.  So it seems like were trying to implement a poor mans VM
> snapshot.  Maybe if I understood what NetApp was trying to do it would
> make more sense, but its all odd.  To do a proper VM snapshot you need
> to snapshot memory and disk at the exact same time.  How are we going
> to do that if ACS is orchestrating the VM snapshot and delegating to
> storage providers.  Its not like you are going to pause the VM.... or
> are you?
> 
> Darren
> 
> On Mon, Oct 7, 2013 at 11:59 AM, Edison Su <edison...@citrix.com> wrote:
>> I created a design document page at 
>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Pluggable+VM+snapshot+related+operations,
>>  feel free to add items on it.
>> And a new branch "pluggable_vm_snapshot" is created.
>> 
>>> -----Original Message-----
>>> From: SuichII, Christopher [mailto:chris.su...@netapp.com]
>>> Sent: Monday, October 07, 2013 10:02 AM
>>> To: <dev@cloudstack.apache.org>
>>> Subject: Re: [DISCUSS] Pluggable VM snapshot related operations?
>>> 
>>> I'm a fan of option 2 - this gives us the most flexibility (as you stated). 
>>> The
>>> option is given to completely override the way VM snapshots work AND
>>> storage providers are given to opportunity to work within the default VM
>>> snapshot workflow.
>>> 
>>> I believe this option should satisfy your concern, Mike. The snapshot and
>>> quiesce strategy would be in charge of communicating with the hypervisor.
>>> Storage providers should be able to leverage the default strategies and
>>> simply perform the storage operations.
>>> 
>>> I don't think it should be much of an issue that new method to the storage
>>> driver interface may not apply to everyone. In fact, that is already the 
>>> case.
>>> Some methods such as un/maintain(), attachToXXX() and takeSnapshot() are
>>> already not implemented by every driver - they just return false when asked
>>> if they can handle the operation.
>>> 
>>> --
>>> Chris Suich
>>> chris.su...@netapp.com
>>> NetApp Software Engineer
>>> Data Center Platforms - Cloud Solutions
>>> Citrix, Cisco & Red Hat
>>> 
>>> On Oct 5, 2013, at 12:11 AM, Mike Tutkowski <mike.tutkow...@solidfire.com>
>>> wrote:
>>> 
>>>> Well, my first thought on this is that the storage driver should not
>>>> be telling the hypervisor to do anything. It should be responsible for
>>>> creating/deleting volumes, snapshots, etc. on its storage system only.
>>>> 
>>>> 
>>>> On Fri, Oct 4, 2013 at 5:57 PM, Edison Su <edison...@citrix.com> wrote:
>>>> 
>>>>> In 4.2, we added VM snapshot for Vmware/Xenserver. The current
>>>>> workflow will be like the following:
>>>>> createVMSnapshot api -> VMSnapshotManagerImpl: creatVMSnapshot ->
>>>>> send CreateVMSnapshotCommand to hypervisor to create vm snapshot.
>>>>> 
>>>>> If anybody wants to change the workflow, then need to either change
>>>>> VMSnapshotManagerImpl directly or subclass VMSnapshotManagerImpl.
>>>>> Both are not the ideal choice, as VMSnapshotManagerImpl should be
>>>>> able to handle different ways to take vm snapshot, instead of hard code.
>>>>> 
>>>>> The requirements for the pluggable VM snapshot coming from:
>>>>> Storage vendor may have their optimization, such as NetApp.
>>>>> VM snapshot can be implemented in a totally different way(For
>>>>> example, I could just send a command to guest VM, to tell my
>>>>> application to flush disk and hold disk write, then come to hypervisor to
>>> take a volume snapshot).
>>>>> 
>>>>> If we agree on enable pluggable VM snapshot, then we can move on
>>>>> discuss how to implement it.
>>>>> 
>>>>> The possible options:
>>>>> 1. coarse grained interface. Add a VMSnapshotStrategy interface,
>>>>> which has the following interfaces:
>>>>>   VMSnapshot takeVMSnapshot(VMSnapshot vmSnapshot);
>>>>>   Boolean revertVMSnapshot(VMSnapshot vmSnapshot);
>>>>>   Boolean DeleteVMSnapshot(VMSnapshot vmSnapshot);
>>>>> 
>>>>>  The work flow will be: createVMSnapshot api ->
>>> VMSnapshotManagerImpl:
>>>>> creatVMSnapshot -> VMSnapshotStrategy: takeVMSnapshot
>>>>>  VMSnapshotManagerImpl will manage VM state, do the sanity check,
>>>>> then will handle over to VMSnapshotStrategy.
>>>>>  In VMSnapshotStrategy implementation, it may just send a
>>>>> Create/revert/delete VMSnapshotCommand to hypervisor host, or do
>>>>> anything special operations.
>>>>> 
>>>>> 2. fine-grained interface. Not only add a VMSnapshotStrategy
>>>>> interface, but also add certain methods on the storage driver.
>>>>>   The VMSnapshotStrategy interface will be the same as option 1.
>>>>>   Will add the following methods on storage driver:
>>>>>  /* volumesBelongToVM  is the list of volumes of the VM that created
>>>>> on this storage, storage vendor can either take one snapshot for this
>>>>> volumes in one shot, or take snapshot for each volume separately
>>>>>      The pre-condition: vm is unquiesced.
>>>>>      It will return a Boolean to indicate, do need unquiesce vm or not.
>>>>>      In the default storage driver, it will return false.
>>>>>   */
>>>>>   boolean takeVMSnapshot(List<VolumeInfo> volumesBelongToVM,
>>>>> VMSnapshot vmSnapshot);
>>>>>   Boolean revertVMSnapshot(List<VolumeInfo> volumesBelongToVM,
>>>>> VMSnapshot vmSnapshot);
>>>>>  Boolean deleteVMSnapshot(List<VolumeInfo> volumesBelongToVM,
>>>>> VMSnapshot vmSNapshot);
>>>>> 
>>>>> The work flow will be: createVMSnapshot api ->
>>> VMSnapshotManagerImpl:
>>>>> creatVMSnapshot -> VMSnapshotStrategy: takeVMSnapshot -> storage
>>>>> driver:takeVMSnapshot In the implementation of VMSnapshotStrategy's
>>>>> takeVMSnapshot, the pseudo code looks like:
>>>>>      HypervisorHelper.quiesceVM(vm);
>>>>>      val volumes = vm.getVolumes();
>>>>>      val maps = new Map[driver, list[VolumeInfo]]();
>>>>>      Volumes.foreach(volume => maps.put(volume.getDriver, volume ::
>>>>> maps.get(volume.getdriver())))
>>>>>      val needUnquiesce = true;
>>>>>       maps.foreach((driver, volumes) => needUnquiesce  =
>>>>> needUnquiesce && driver.takeVMSnapshot(volumes))
>>>>>     if (needUnquiesce ) {
>>>>>      HypervisorHelper.unquiesce(vm);
>>>>>   }
>>>>> 
>>>>> By default, the quiesceVM in HypervisorHelper will actually take vm
>>>>> snapshot through hypervisor.
>>>>> Does above logic makes senesce?
>>>>> 
>>>>> The pros of option 1 is that: it's simple, no need to change storage
>>>>> driver interfaces. The cons is that each storage vendor need to
>>>>> implement a strategy, maybe they will do the same thing.
>>>>> The pros of option 2 is that, storage driver won't need to worry
>>>>> about how to quiesce/unquiesce vm. The cons is that, it will add
>>>>> these methods on each storage drivers, so it assumes that this work
>>>>> flow will work for everybody.
>>>>> 
>>>>> So which option we should take? Or if you have other options, please
>>>>> let's know.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> *Mike Tutkowski*
>>>> *Senior CloudStack Developer, SolidFire Inc.*
>>>> e: mike.tutkow...@solidfire.com
>>>> o: 303.746.7302
>>>> Advancing the way the world uses the
>>>> cloud<http://solidfire.com/solution/overview/?video=play>
>>>> *(tm)*
>> 

Reply via email to