My initial tendency here is to stick with standard hypervisor snapshots for block storage in 4.3.
I am curious how this work pans out, though, so I plan to keep up to date with it. Thanks!

On Thu, Oct 10, 2013 at 9:06 AM, SuichII, Christopher <chris.su...@netapp.com> wrote:
> Hm, that is tricky. I haven't looked into block stuff too much, but maybe we can…
>
> -Create another temporary lun and register it as an SR/DS
> -Move the active VM to that SR
> -Ask the storage driver to snapshot.
> -Move the active VM back to the original SR
> -Delete the temporary lun and SR
> -Delete the snapshot, causing the snapshot and active VM to be combined into one VDI again
>
> This way, the only file on the original lun when the driver snapshot occurs is the hypervisor snapshot. Maybe this is way too much work and too hackish, though.
>
> Alternatively...
> Maybe initially, quiesce on block storage wouldn't be supported? If quiesce was requested on block storage, then the driver could say it isn't supported and have the default implementation take an actual hypervisor snapshot and keep it. If quiesce wasn't requested, then you can just take a snapshot of the lun and not worry about consistency.
>
> I know that this is not a new issue, though. Snapshotting with block storage is always more difficult than with NFS.
>
> --
> Chris Suich
> chris.su...@netapp.com
> NetApp Software Engineer
> Data Center Platforms – Cloud Solutions
> Citrix, Cisco & Red Hat
>
> On Oct 10, 2013, at 10:37 AM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:
>
> > I wonder if this technique is only going to work for NFS?
> >
> > In the block world, the VDI we take a snapshot of on the SR will lead to the creation of another VDI and a block system cannot just snapshot the hypervisor snapshot - it needs to snapshot the entire volume (which is analogous to the SR).
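The capability-based fallback Chris suggests (a driver declines quiesce on block storage, so the default implementation takes and keeps an actual hypervisor snapshot) could be sketched roughly as below. All of these type and method names are hypothetical, not CloudStack's real interfaces:

```java
// Rough sketch of the quiesce fallback described above. Hypothetical types only.
public class QuiesceFallbackSketch {

    // Hypothetical driver capability interface.
    interface StorageDriver {
        boolean supportsQuiescedSnapshot();
        String takeSnapshot(String volume);
    }

    // Hypothetical block-storage driver that cannot quiesce on its own.
    static final class BlockDriverStub implements StorageDriver {
        public boolean supportsQuiescedSnapshot() { return false; }
        public String takeSnapshot(String volume) { return "lun-snapshot:" + volume; }
    }

    static String snapshot(StorageDriver driver, String volume, boolean quiesceRequested) {
        if (quiesceRequested && !driver.supportsQuiescedSnapshot()) {
            // Driver says quiesce isn't supported: take an actual hypervisor
            // snapshot and keep it, as the default implementation would.
            return "hypervisor-snapshot:" + volume;
        }
        // Quiesce not requested (or the driver handles it): snapshot the lun directly.
        return driver.takeSnapshot(volume);
    }

    public static void main(String[] args) {
        StorageDriver d = new BlockDriverStub();
        System.out.println(snapshot(d, "vol-1", true));
        System.out.println(snapshot(d, "vol-1", false));
    }
}
```

The point of the sketch is only the decision: consistency requested plus an incapable driver falls back to the hypervisor path; otherwise the storage snapshot proceeds without worrying about consistency.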
> > > > > > On Thu, Oct 10, 2013 at 6:29 AM, SuichII, Christopher < > > chris.su...@netapp.com> wrote: > > > >> Multivendor snapshotting: > >> The case with two storage providers is a bit trickier and is one that we > >> are still working on. I believe there are a couple options on the table: > >> > >> -Give both storage providers the option to take the snapshot and fail if > >> either one fails or cannot take the snapshot > >> -Give both storage providers the option to take the snapshot and use the > >> hypervisor/default if either one fails or cannot take the snapshot > >> -Fall back to using the hypervisor/default if the VM has volumes on > >> storage managed by different providers > >> > >> The only purpose of the hypervisor snapshot is to give storage > providers a > >> consistent volume to take their snapshot against. Once that snapshot is > >> taken, the hypervisor snapshot is pushed back into the parent or active > VM > >> (essentially removing the fact the hypervisor snapshot ever existed). > >> > >> > >> Quiescing: > >> This is something that has been debated a lot. Ultimately, one reason > for > >> having drivers perform the quiescing is because we don't know how every > >> storage provider will want to work. As far as I've ever known, any > storage > >> provider that wants to create the snapshots themselves will want the VM > to > >> be quiesced through the hypervisor. However, there may be some storage > >> provider that has some way of taking snapshots (that we don't know > about) > >> that doesn't require the VM to be quiesced. In that case, we wouldn't > want > >> them to be forced into having the VM quiesced before they're asked to > take > >> the snapshot. > >> > >> > >> Two snapshot methods: > >> I believe the main reason for this is that storage drivers may want to > >> take the snapshot differently depending on whether it is a single volume > >> snapshot or an entire VM snapshot. 
> >> Again, erring on the side of flexibility so that things don't have to change when a new storage provider comes along with different requirements.
> >>
> >> --
> >> Chris Suich
> >> chris.su...@netapp.com
> >> NetApp Software Engineer
> >> Data Center Platforms – Cloud Solutions
> >> Citrix, Cisco & Red Hat
> >>
> >> On Oct 10, 2013, at 1:40 AM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:
> >>
> >>> "The work flow will be: createVMSnapshot api -> VMSnapshotManagerImpl: creatVMSnapshot -> VMSnapshotStrategy: takeVMSnapshot -> storage driver:takeVMSnapshot"
> >>>
> >>> I also think it's a bit weird for the storage driver to have any knowledge of VM snapshots.
> >>>
> >>> I would think another part of the system would quiesce (or not) the VM in question and then the takeSnapshot method would be called on the driver.
> >>>
> >>> I might have missed something...why does the driver "care" if the snapshot to be taken is going to be in a consistent state or not (I understand why the user cares, but not the storage driver)? Why is that not a problem for some other part of the system that is aware of hypervisor snapshots? Shouldn't the driver just take a snapshot (or snapshots) as it is instructed to do (regardless of whether or not a VM is quiesced)?
> >>>
> >>> Basically I'm wondering why we need two "take snapshot" methods on the driver.
> >>>
> >>> On Wed, Oct 9, 2013 at 11:24 PM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:
> >>>
> >>>> Yeah, I'm not really clear how the snapshot strategy works if you have multiple vendors that implement that interface either.
> >>>>
> >>>> On Wed, Oct 9, 2013 at 10:12 PM, Darren Shepherd <darren.s.sheph...@gmail.com> wrote:
> >>>>
> >>>>> Edison,
> >>>>>
> >>>>> I would lean toward doing the coarse grain interface only.
> >>>>> I'm having a hard time seeing how the whole flow is generic and makes sense for everyone. By starting with the coarse grain interface you have the advantage that you avoid possible upfront over engineering/over design that could wreak havoc down the line. If you implement the VMSnapshotStrategy and find that it really is useful to other implementations, you can then implement the fine grain interface later to allow others to benefit from it.
> >>>>>
> >>>>> Darren
> >>>>>
> >>>>> On Wed, Oct 9, 2013 at 8:54 PM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:
> >>>>>> Hey guys,
> >>>>>>
> >>>>>> I haven't been giving this thread much attention, but am reviewing it somewhat now.
> >>>>>>
> >>>>>> I'm not really clear how this would work if, say, a VM has two data disks and they are not being provided by the same vendor.
> >>>>>>
> >>>>>> Can someone clarify that for me?
> >>>>>>
> >>>>>> My understanding for how this works today is that it doesn't matter. For XenServer, a VDI is on an SR, which could be supported by storage vendor X. Another VDI could be on another SR, supported by storage vendor Y.
> >>>>>>
> >>>>>> In this case, a new VDI appears on each SR after a hypervisor snapshot.
> >>>>>>
> >>>>>> Same idea for VMware.
> >>>>>>
> >>>>>> I don't really know how (or if) this works for KVM.
> >>>>>>
> >>>>>> I'm not clear how this multi-vendor situation would play out in this pluggable approach.
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> On Tue, Oct 8, 2013 at 4:43 PM, Edison Su <edison...@citrix.com> wrote:
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Darren Shepherd [mailto:darren.s.sheph...@gmail.com]
> >>>>>>>> Sent: Tuesday, October 08, 2013 2:54 PM
> >>>>>>>> To: dev@cloudstack.apache.org
> >>>>>>>> Subject: Re: [DISCUSS] Pluggable VM snapshot related operations?
> >>>>>>>> A hypervisor snapshot will snapshot memory also. So determining what to do with the hypervisor snapshot from the quiesce option does not seem proper.
> >>>>>>> The memory is optional for a hypervisor vm snapshot, a.k.a. the "Disk-only snapshots":
> >>>>>>> http://support.citrix.com/proddocs/topic/xencenter-61/xs-xc-vms-snapshots-about.html
> >>>>>>> It's supported by xenserver/kvm/vmware.
> >>>>>>>
> >>>>>>>> Sorry for all the questions, I'm trying to get to the point of understanding if this functionality makes sense at this point in the code or if maybe there is a different approach. This is what I'm seeing, what if we state it this way:
> >>>>>>>>
> >>>>>>>> 1) VM snapshots, AFAIK, are not backed up today and exist solely on primary. What if we added a backup phase to VM snapshots that can be optionally supported by the storage providers to possibly backup the VM snapshot volumes.
> >>>>>>> It's not about backing up the vm snapshot, it's about how to take the vm snapshot. Usually, take/revert vm snapshot is handled by the hypervisor itself, but in the NetApp (or other storage vendor) case, they want to change the default behavior of hypervisor-based vm snapshot.
> >>>>>>>
> >>>>>>> Some examples:
> >>>>>>> 1. take hypervisor based vm snapshots; on primary storage, the hypervisor will maintain the snapshot chain.
> >>>>>>> 2. take vm snapshot through NetApp:
> >>>>>>>    a. first, quiesce the VM if the user specified it. There is no separate API to quiesce a VM on the hypervisor, so here we will take a VM snapshot through a hypervisor API call; the hypervisor will take a volume snapshot on each volume of the VM. Let's say, on the primary storage, the disk chain looks like:
> >>>>>>>
> >>>>>>>          base-image
> >>>>>>>              |
> >>>>>>>              V
> >>>>>>>         Parent disk
> >>>>>>>           /     \
> >>>>>>>          V       V
> >>>>>>>    Current disk  snapshot-a
> >>>>>>>
> >>>>>>>    b. from snapshot-a, find out its parent disk, then take a snapshot through NetApp
> >>>>>>>    c. un-quiesce the VM: here, go to the hypervisor and delete snapshot "snapshot-a"; the hypervisor should be able to consolidate the current disk and "parent disk" into one disk, thus from the hypervisor's point of view, there is always, at most, only one snapshot for the VM.
> >>>>>>> For revert VM snapshot, as long as the VM is stopped, NetApp can revert the snapshot created on NetApp storage easily and efficiently.
> >>>>>>> The benefit of this whole process, as Chris pointed out, is that if the snapshot chain is quite long, a hypervisor based VM snapshot will take a performance hit.
> >>>>>>>
> >>>>>>>> 2) Additionally you want to be able to backup multiple disks at once, regardless of VM snapshot. Why don't we add the ability to put volumeIds in the snapshot cmd so that, if the storage provider supports it, it will get a batch of volumeIds.
> >>>>>>>>
> >>>>>>>> Now I know we talked about 2 and there were some concerns about it (mostly from me), but I think we could work through those concerns (forgot what they were...). Right now I just get the feeling we are shoehorning some functionality into VM snapshot that isn't quite the right fit. The "no quiesce" flow just doesn't seem to make sense to me.
> >>>>>>>
> >>>>>>> Not sure whether the above NetApp proposed work flow makes sense to you or to anybody else. If this work flow is only specific to NetApp, then we don't need to enforce the whole process for everybody.
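The steps a-c that Edison outlines can be sketched as a plain ordering of operations. This is only an illustration of the sequencing; the step strings and method names are made up, not CloudStack APIs:

```java
// Minimal sketch of the NetApp-style VM snapshot flow (steps a-c above).
// Hypothetical names; this only encodes which operation happens when.
import java.util.ArrayList;
import java.util.List;

public class NetAppFlowSketch {
    static List<String> takeVmSnapshot(boolean quiesce) {
        List<String> steps = new ArrayList<>();
        if (quiesce) {
            // (a) No separate quiesce API exists, so take a hypervisor VM
            // snapshot; the hypervisor snapshots each volume ("snapshot-a").
            steps.add("hypervisor: take VM snapshot");
        }
        // (b) From each hypervisor snapshot, find its parent disk and take the
        // snapshot of that disk on the storage array.
        steps.add("storage: snapshot parent disk of each volume");
        if (quiesce) {
            // (c) Delete the hypervisor snapshot; the hypervisor consolidates
            // the current disk and parent disk back into one, so at most one
            // snapshot ever exists from the hypervisor's point of view.
            steps.add("hypervisor: delete snapshot and consolidate disks");
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(takeVmSnapshot(true));
        System.out.println(takeVmSnapshot(false));
    }
}
```

The `quiesce=false` path corresponds to the case where the user doesn't ask for consistency: the array snapshots the volumes directly and the hypervisor is never involved.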
> >>>>>>> > >>>>>>>> > >>>>>>>> Darren > >>>>>>>> > >>>>>>>> On Tue, Oct 8, 2013 at 2:05 PM, SuichII, Christopher > >>>>>>>> <chris.su...@netapp.com> wrote: > >>>>>>>>> Whether the hypervisor snapshot happens depends on whether the > >>>>>>>> 'quiesce' option is specified with the snapshot request. If a user > >>>>>>> doesn't care > >>>>>>>> about the consistency of their backup, then the hypervisor > >>>>>>> snapshot/quiesce > >>>>>>>> step can be skipped altogether. This of course is not the case if > >> the > >>>>>>> default > >>>>>>>> provider is being used, in which case a hypervisor snapshot is the > >>>>> only > >>>>>>> way of > >>>>>>>> creating a backup since it can't be offloaded to the storage > driver. > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> Chris Suich > >>>>>>>>> chris.su...@netapp.com > >>>>>>>>> NetApp Software Engineer > >>>>>>>>> Data Center Platforms - Cloud Solutions Citrix, Cisco & Red Hat > >>>>>>>>> > >>>>>>>>> On Oct 8, 2013, at 4:57 PM, Darren Shepherd > >>>>>>>>> <darren.s.sheph...@gmail.com> > >>>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>>> Who is going to decide whether the hypervisor snapshot should > >>>>>>>>>> actually happen or not? Or how? > >>>>>>>>>> > >>>>>>>>>> Darren > >>>>>>>>>> > >>>>>>>>>> On Tue, Oct 8, 2013 at 12:38 PM, SuichII, Christopher > >>>>>>>>>> <chris.su...@netapp.com> wrote: > >>>>>>>>>>> > >>>>>>>>>>> -- > >>>>>>>>>>> Chris Suich > >>>>>>>>>>> chris.su...@netapp.com > >>>>>>>>>>> NetApp Software Engineer > >>>>>>>>>>> Data Center Platforms - Cloud Solutions Citrix, Cisco & Red Hat > >>>>>>>>>>> > >>>>>>>>>>> On Oct 8, 2013, at 2:24 PM, Darren Shepherd > >>>>>>>> <darren.s.sheph...@gmail.com> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> So in the implementation, when we say "quiesce" is that > actually > >>>>>>>>>>>> being implemented as a VM snapshot (memory and disk). And > then > >>>>>>>>>>>> when you say "unquiesce" you are talking about deleting the VM > >>>>>>>> snapshot? 
> >>>>>>>>>>> > >>>>>>>>>>> If the VM snapshot is not going to the hypervisor, then yes, it > >>>>> will > >>>>>>>> actually be a hypervisor snapshot. Just to be clear, the unquiesce > >> is > >>>>>>> not quite > >>>>>>>> a delete - it is a collapse of the VM snapshot and the active VM > >> back > >>>>>>> into one > >>>>>>>> file. > >>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> In NetApp, what are you snapshotting? The whole netapp volume > >>>>> (I > >>>>>>>>>>>> don't know the correct term), a file on NFS, an iscsi volume? > I > >>>>>>>>>>>> don't know a whole heck of a lot about the netapp snapshot > >>>>>>>> capabilities. > >>>>>>>>>>> > >>>>>>>>>>> Essentially we are using internal APIs to create file level > >>>>> backups > >>>>>>> - don't > >>>>>>>> worry too much about the terminology. > >>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> I know storage solutions can snapshot better and faster than > >>>>>>>>>>>> hypervisors can with COW files. I've personally just been > >>>>> always > >>>>>>>>>>>> perplexed on whats the best way to implement it. For storage > >>>>>>>>>>>> solutions that are block based, its really easy to have the > >>>>> storage > >>>>>>>>>>>> doing the snapshot. For shared file systems, like NFS, its > >>>>> seems > >>>>>>>>>>>> way more complicated as you don't want to snapshot the entire > >>>>>>>>>>>> filesystem in order to snapshot one file. > >>>>>>>>>>> > >>>>>>>>>>> With filesystems like NFS, things are certainly more > complicated, > >>>>>>> but that > >>>>>>>> is taken care of by our controller's operating system, Data ONTAP, > >>>>> and we > >>>>>>>> simply use APIs to communicate with it. > >>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Darren > >>>>>>>>>>>> > >>>>>>>>>>>> On Tue, Oct 8, 2013 at 11:10 AM, SuichII, Christopher > >>>>>>>>>>>> <chris.su...@netapp.com> wrote: > >>>>>>>>>>>>> I can comment on the second half. 
> >>>>>>>>>>>>> Through storage operations, storage providers can create backups much faster than hypervisors, and over time their snapshots are more efficient than the snapshot chains that hypervisors create. It is true that a VM snapshot taken at the storage level is slightly different, as it would be pseudo-quiesced and would not have its memory snapshotted. This is accomplished through hypervisor snapshots:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 1) VM snapshot request (let's say VM 'A')
> >>>>>>>>>>>>> 2) Create hypervisor snapshot (optional)
> >>>>>>>>>>>>>    -VM 'A' is snapshotted, creating active VM 'A*'
> >>>>>>>>>>>>>    -All disk traffic now goes to VM 'A*' and 'A' is a snapshot of 'A*'
> >>>>>>>>>>>>> 3) Storage driver(s) take snapshots of each volume
> >>>>>>>>>>>>> 4) Undo hypervisor snapshot (optional)
> >>>>>>>>>>>>>    -VM snapshot 'A' is rolled back into VM 'A*' so the hypervisor snapshot no longer exists
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Now, a couple notes:
> >>>>>>>>>>>>> -The reason this is optional is that not all users necessarily care about the memory or disk consistency of their VMs and would prefer faster snapshots to consistency.
> >>>>>>>>>>>>> -Preemptively, yes, we are actually taking hypervisor snapshots, which means there isn't actually a performance gain from taking storage snapshots when quiescing the VM. However, the performance gain will come both during restoring the VM and during normal operations as described above.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Although you can think of it as a poor man's VM snapshot, I would think of it more as a consistent multi-volume snapshot. Again, the difference being that this snapshot was not truly quiesced like a hypervisor snapshot would be.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Chris Suich
> >>>>>>>>>>>>> chris.su...@netapp.com
> >>>>>>>>>>>>> NetApp Software Engineer
> >>>>>>>>>>>>> Data Center Platforms - Cloud Solutions Citrix, Cisco & Red Hat
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Oct 8, 2013, at 1:47 PM, Darren Shepherd <darren.s.sheph...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> My only comment is that having the return type as boolean and using that to indicate quiesce behaviour seems obscure and will probably lead to a problem later. You're basically saying the result of takeVMSnapshot will only ever need to communicate back whether unquiesce needs to happen. Maybe some result object would be more extensible.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Actually, I think I have more comments. This seems a bit odd to me. Why would a storage driver in ACS implement VM snapshot functionality? VM snapshot is really a hypervisor orchestrated operation. So it seems like we're trying to implement a poor man's VM snapshot. Maybe if I understood what NetApp was trying to do it would make more sense, but it's all odd. To do a proper VM snapshot you need to snapshot memory and disk at the exact same time. How are we going to do that if ACS is orchestrating the VM snapshot and delegating to storage providers? It's not like you are going to pause the VM.... or are you?
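Darren's "result object" suggestion, as opposed to a bare boolean return, might look something like this minimal sketch (the types are hypothetical, not part of CloudStack):

```java
// Sketch of returning a small result object from takeVMSnapshot instead of a
// bare boolean, so more fields can be added later. Illustrative types only.
public class SnapshotResultSketch {
    static final class VMSnapshotResult {
        final boolean needsUnquiesce;
        final String message;   // room for future fields (errors, timings, ...)
        VMSnapshotResult(boolean needsUnquiesce, String message) {
            this.needsUnquiesce = needsUnquiesce;
            this.message = message;
        }
    }

    static VMSnapshotResult takeVMSnapshot(String vm) {
        // A driver that was quiesced via a hypervisor snapshot reports that the
        // caller still has to unquiesce (collapse the snapshot) afterwards.
        return new VMSnapshotResult(true, "snapshotted " + vm);
    }

    public static void main(String[] args) {
        VMSnapshotResult r = takeVMSnapshot("vm-1");
        System.out.println(r.needsUnquiesce + " " + r.message);
    }
}
```

The design point is simply that a result type can grow without breaking every driver's method signature, which a boolean cannot.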
> >>>>>>>>>>>>>> Darren
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Oct 7, 2013 at 11:59 AM, Edison Su <edison...@citrix.com> wrote:
> >>>>>>>>>>>>>>> I created a design document page at https://cwiki.apache.org/confluence/display/CLOUDSTACK/Pluggable+VM+snapshot+related+operations, feel free to add items on it.
> >>>>>>>>>>>>>>> And a new branch "pluggable_vm_snapshot" is created.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>>> From: SuichII, Christopher [mailto:chris.su...@netapp.com]
> >>>>>>>>>>>>>>>> Sent: Monday, October 07, 2013 10:02 AM
> >>>>>>>>>>>>>>>> To: <dev@cloudstack.apache.org>
> >>>>>>>>>>>>>>>> Subject: Re: [DISCUSS] Pluggable VM snapshot related operations?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I'm a fan of option 2 - this gives us the most flexibility (as you stated). The option is given to completely override the way VM snapshots work AND storage providers are given the opportunity to work within the default VM snapshot workflow.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I believe this option should satisfy your concern, Mike. The snapshot and quiesce strategy would be in charge of communicating with the hypervisor. Storage providers should be able to leverage the default strategies and simply perform the storage operations.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I don't think it should be much of an issue that new methods to the storage driver interface may not apply to everyone. In fact, that is already the case.
> >>>>>>>>>>>>>>>> Some methods such as un/maintain(), attachToXXX() and > >>>>>>>>>>>>>>>> takeSnapshot() are already not implemented by every > driver - > >>>>>>>>>>>>>>>> they just return false when asked if they can handle the > >>>>>>> operation. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>> Chris Suich > >>>>>>>>>>>>>>>> chris.su...@netapp.com > >>>>>>>>>>>>>>>> NetApp Software Engineer > >>>>>>>>>>>>>>>> Data Center Platforms - Cloud Solutions Citrix, Cisco & > Red > >>>>> Hat > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On Oct 5, 2013, at 12:11 AM, Mike Tutkowski > >>>>>>>>>>>>>>>> <mike.tutkow...@solidfire.com> > >>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Well, my first thought on this is that the storage driver > >>>>>>>>>>>>>>>>> should not be telling the hypervisor to do anything. It > >>>>> should > >>>>>>>>>>>>>>>>> be responsible for creating/deleting volumes, snapshots, > >>>>> etc. > >>>>>>> on > >>>>>>>> its storage system only. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Fri, Oct 4, 2013 at 5:57 PM, Edison Su < > >>>>>>> edison...@citrix.com> > >>>>>>>> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> In 4.2, we added VM snapshot for Vmware/Xenserver. The > >>>>>>>>>>>>>>>>>> current workflow will be like the following: > >>>>>>>>>>>>>>>>>> createVMSnapshot api -> VMSnapshotManagerImpl: > >>>>>>>>>>>>>>>>>> creatVMSnapshot -> send CreateVMSnapshotCommand to > >>>>>>>> hypervisor to create vm snapshot. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> If anybody wants to change the workflow, then need to > >>>>> either > >>>>>>>>>>>>>>>>>> change VMSnapshotManagerImpl directly or subclass > >>>>>>>> VMSnapshotManagerImpl. > >>>>>>>>>>>>>>>>>> Both are not the ideal choice, as VMSnapshotManagerImpl > >>>>>>>>>>>>>>>>>> should be able to handle different ways to take vm > >>>>> snapshot, > >>>>>>>> instead of hard code. 
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The requirements for the pluggable VM snapshot come from:
> >>>>>>>>>>>>>>>>>> Storage vendors may have their own optimizations, such as NetApp.
> >>>>>>>>>>>>>>>>>> VM snapshot can be implemented in a totally different way (for example, I could just send a command to the guest VM, to tell my application to flush disk and hold disk writes, then come to the hypervisor to take a volume snapshot).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> If we agree on enabling pluggable VM snapshot, then we can move on to discuss how to implement it.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The possible options:
> >>>>>>>>>>>>>>>>>> 1. coarse grained interface. Add a VMSnapshotStrategy interface, which has the following interfaces:
> >>>>>>>>>>>>>>>>>> VMSnapshot takeVMSnapshot(VMSnapshot vmSnapshot);
> >>>>>>>>>>>>>>>>>> Boolean revertVMSnapshot(VMSnapshot vmSnapshot);
> >>>>>>>>>>>>>>>>>> Boolean deleteVMSnapshot(VMSnapshot vmSnapshot);
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The work flow will be: createVMSnapshot api -> VMSnapshotManagerImpl: creatVMSnapshot -> VMSnapshotStrategy: takeVMSnapshot
> >>>>>>>>>>>>>>>>>> VMSnapshotManagerImpl will manage VM state and do the sanity check, then will hand over to VMSnapshotStrategy.
> >>>>>>>>>>>>>>>>>> In a VMSnapshotStrategy implementation, it may just send a create/revert/delete VMSnapshotCommand to the hypervisor host, or do any special operations.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 2. fine-grained interface. Not only add a VMSnapshotStrategy interface, but also add certain methods on the storage driver. The VMSnapshotStrategy interface will be the same as option 1. Will add the following methods on the storage driver:
> >>>>>>>>>>>>>>>>>> /* volumesBelongToVM is the list of volumes of the VM that are created on this storage; the storage vendor can either take one snapshot for these volumes in one shot, or take a snapshot for each volume separately.
> >>>>>>>>>>>>>>>>>> The pre-condition: the vm is quiesced.
> >>>>>>>>>>>>>>>>>> It will return a Boolean to indicate whether we need to unquiesce the vm or not. In the default storage driver, it will return false.
> >>>>>>>>>>>>>>>>>> */
> >>>>>>>>>>>>>>>>>> boolean takeVMSnapshot(List<VolumeInfo> volumesBelongToVM, VMSnapshot vmSnapshot);
> >>>>>>>>>>>>>>>>>> Boolean revertVMSnapshot(List<VolumeInfo> volumesBelongToVM, VMSnapshot vmSnapshot);
> >>>>>>>>>>>>>>>>>> Boolean deleteVMSnapshot(List<VolumeInfo> volumesBelongToVM, VMSnapshot vmSnapshot);
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The work flow will be: createVMSnapshot api -> VMSnapshotManagerImpl: creatVMSnapshot -> VMSnapshotStrategy: takeVMSnapshot -> storage driver: takeVMSnapshot
> >>>>>>>>>>>>>>>>>> In the implementation of VMSnapshotStrategy's takeVMSnapshot, the pseudo code looks like:
> >>>>>>>>>>>>>>>>>> HypervisorHelper.quiesceVM(vm);
> >>>>>>>>>>>>>>>>>> val volumes = vm.getVolumes();
> >>>>>>>>>>>>>>>>>> val maps = new Map[driver, list[VolumeInfo]]();
> >>>>>>>>>>>>>>>>>> volumes.foreach(volume => maps.put(volume.getDriver, volume :: maps.get(volume.getDriver())))
> >>>>>>>>>>>>>>>>>> var needUnquiesce = true;
> >>>>>>>>>>>>>>>>>> maps.foreach((driver, volumes) => needUnquiesce = needUnquiesce && driver.takeVMSnapshot(volumes))
> >>>>>>>>>>>>>>>>>> if (needUnquiesce) {
> >>>>>>>>>>>>>>>>>>   HypervisorHelper.unquiesce(vm);
> >>>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> By default, the quiesceVM in HypervisorHelper will actually take a vm snapshot through the hypervisor.
> >>>>>>>>>>>>>>>>>> Does the above logic make sense?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The pros of option 1: it's simple, no need to change the storage driver interfaces. The cons: each storage vendor needs to implement a strategy, and maybe they will all do the same thing.
> >>>>>>>>>>>>>>>>>> The pros of option 2: the storage driver won't need to worry about how to quiesce/unquiesce the vm. The cons: it will add these methods on each storage driver, so it assumes that this work flow will work for everybody.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> So which option should we take? Or if you have other options, please let us know.
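Edison's pseudocode for the option-2 strategy, restated as a small runnable Java sketch. The types here are illustrative, not the real CloudStack interfaces. One detail worth noting: the pseudocode's `needUnquiesce && driver.takeVMSnapshot(...)` would short-circuit and skip later drivers once one returns false, so the sketch uses the non-short-circuit `&=` so every driver still takes its snapshot:

```java
// Sketch of the VMSnapshotStrategy flow: group the VM's volumes by storage
// driver, let each driver snapshot its batch, unquiesce only if every driver
// asks for it. All types are hypothetical stand-ins.
import java.util.*;

public class VmSnapshotStrategySketch {

    interface StorageDriver {
        // Returns true if the VM still needs to be unquiesced afterwards.
        boolean takeVMSnapshot(List<String> volumes, String vmSnapshot);
    }

    static final List<String> log = new ArrayList<>();

    static void takeVMSnapshot(Map<String, StorageDriver> volumeToDriver, String vmSnapshot) {
        log.add("quiesce");   // stands in for HypervisorHelper.quiesceVM(vm)
        // Group volumes by driver so each driver gets its batch in one call.
        Map<StorageDriver, List<String>> byDriver = new LinkedHashMap<>();
        volumeToDriver.forEach((vol, drv) ->
                byDriver.computeIfAbsent(drv, d -> new ArrayList<>()).add(vol));
        boolean needUnquiesce = true;
        for (Map.Entry<StorageDriver, List<String>> e : byDriver.entrySet()) {
            // Non-short-circuit AND: every driver snapshots its volumes.
            needUnquiesce &= e.getKey().takeVMSnapshot(e.getValue(), vmSnapshot);
        }
        if (needUnquiesce) {
            log.add("unquiesce");   // stands in for HypervisorHelper.unquiesce(vm)
        }
    }

    public static void main(String[] args) {
        StorageDriver defaultDriver =
                (vols, snap) -> { log.add("snapshot " + vols); return true; };
        Map<String, StorageDriver> vm = new LinkedHashMap<>();
        vm.put("root-disk", defaultDriver);
        vm.put("data-disk", defaultDriver);
        takeVMSnapshot(vm, "vmsnap-1");
        System.out.println(log);
    }
}
```

With both volumes on the same driver, the driver is called once with the whole batch, which is exactly the "one snapshot for these volumes in one shot" case the comment block above describes.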
> >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>> *Mike Tutkowski* > >>>>>>>>>>>>>>>>> *Senior CloudStack Developer, SolidFire Inc.* > >>>>>>>>>>>>>>>>> e: mike.tutkow...@solidfire.com > >>>>>>>>>>>>>>>>> o: 303.746.7302 > >>>>>>>>>>>>>>>>> Advancing the way the world uses the > >>>>>>>>>>>>>>>>> cloud<http://solidfire.com/solution/overview/?video=play > > > >>>>>>>>>>>>>>>>> *(tm)* > >>>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> *Mike Tutkowski* > >>>>>> *Senior CloudStack Developer, SolidFire Inc.* > >>>>>> e: mike.tutkow...@solidfire.com > >>>>>> o: 303.746.7302 > >>>>>> Advancing the way the world uses the > >>>>>> cloud<http://solidfire.com/solution/overview/?video=play> > >>>>>> *™* > >>>>> > >>>> > >>>> > >>>> > >>>> -- > >>>> *Mike Tutkowski* > >>>> *Senior CloudStack Developer, SolidFire Inc.* > >>>> e: mike.tutkow...@solidfire.com > >>>> o: 303.746.7302 > >>>> Advancing the way the world uses the cloud< > >> http://solidfire.com/solution/overview/?video=play> > >>>> *™* > >>>> > >>> > >>> > >>> > >>> -- > >>> *Mike Tutkowski* > >>> *Senior CloudStack Developer, SolidFire Inc.* > >>> e: mike.tutkow...@solidfire.com > >>> o: 303.746.7302 > >>> Advancing the way the world uses the > >>> cloud<http://solidfire.com/solution/overview/?video=play> > >>> *™* > >> > >> > > > > > > -- > > *Mike Tutkowski* > > *Senior CloudStack Developer, SolidFire Inc.* > > e: mike.tutkow...@solidfire.com > > o: 303.746.7302 > > Advancing the way the world uses the > > cloud<http://solidfire.com/solution/overview/?video=play> > > *™* > > -- *Mike Tutkowski* *Senior CloudStack Developer, SolidFire Inc.* e: mike.tutkow...@solidfire.com o: 303.746.7302 Advancing the way the world uses the cloud<http://solidfire.com/solution/overview/?video=play> *™*