[ovirt-devel] [VDSM] Live snapshot with ceph disks

2015-06-19 Thread Nir Soffer
Hi all,

For 3.6, we will not support live vm snapshot, but this is a must for the next
release.

It is trivial to create a disk snapshot in ceph (using cinder apis). The snapshot
is transparent to libvirt, qemu and the guest os.

However, we want to create a consistent snapshot, so you can revert to the disk
snapshot and get a consistent file system state.

We also want to create a complete vm snapshot, including all disks and vm memory.
Libvirt and qemu provide that when given a new disk for the active layer, but
when using a ceph disk, we don't change the active layer - we continue to use the
same disk.

Since 1.2.5, libvirt provides virDomainFSFreeze and virDomainFSThaw:
https://libvirt.org/hvsupport.html

So here are the possible flows (ignoring engine-side stuff like locking vms and
disks)

Disk snapshot
-------------

1. Engine invokes VM.freezeFileSystems
2. Vdsm invokes libvirt.virDomainFSFreeze
3. Engine creates snapshot via cinder
4. Engine invokes VM.thawFileSystems
5. Vdsm invokes libvirt.virDomainFSThaw

Vm snapshot
-----------

1. Engine invokes VM.freezeFileSystems
2. Vdsm invokes libvirt.virDomainFSFreeze
3. Engine creates snapshot via cinder
4. Engine invokes VM.snapshot
5. Vdsm creates snapshot, skipping ceph disks
6. Engine invokes VM.thawFileSystems
7. Vdsm invokes libvirt.virDomainFSThaw
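
To make the ordering and error handling explicit, here is a minimal
engine-side sketch of the vm snapshot flow above. It is only an illustration
(the vdsm_api object, its method names and the create_ceph_snapshot callable
are hypothetical placeholders for the real engine plumbing), but it shows the
property we care about: the thaw call runs in a finally block, so a failed
snapshot never leaves the guest frozen.

    # Hypothetical engine-side orchestration of the vm snapshot flow above.
    def create_live_vm_snapshot(vdsm_api, create_ceph_snapshot,
                                vm_id, ceph_volume_ids, description):
        vdsm_api.freezeFileSystems(vm_id)            # steps 1-2
        try:
            for vol_id in ceph_volume_ids:           # step 3, via cinder
                create_ceph_snapshot(vol_id, description)
            # steps 4-5: memory + vdsm images; ceph disks are skipped by vdsm
            vdsm_api.snapshot(vm_id, description)
        finally:
            vdsm_api.thawFileSystems(vm_id)          # steps 6-7, even on errors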

API changes
-----------

New verbs:
- VM.freezeFileSystems - basically invokes virDomainFSFreeze
- VM.thawFileSystems - basically invokes virDomainFSThaw
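
On the vdsm side the new verbs would be thin wrappers around the libvirt
calls. A minimal sketch using the libvirt python bindings (the VM class and
error handling are simplified; fsFreeze()/fsThaw() are the python names of
virDomainFSFreeze/virDomainFSThaw):

    class VM(object):
        def __init__(self, dom):
            self._dom = dom  # libvirt.virDomain of the running vm

        def freezeFileSystems(self):
            # Freeze all mounted guest file systems via qemu-guest-agent;
            # returns the number of frozen file systems.
            return self._dom.fsFreeze()

        def thawFileSystems(self):
            # Thaw the file systems frozen by freezeFileSystems.
            return self._dom.fsThaw()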


What do you think?

Nir
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel


Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks

2015-06-19 Thread Christopher Pereira
Hi Nir,

Regarding "3. Engine creates snapshot *via cinder*"...

What are the benefits of creating snapshots via cinder vs via libvirt?

Libvirt and qemu are offering core VM-aware storage and memory snapshot 
features.
Besides, snapshot-create-as has no VM downtime.
It would be a mistake to implement snapshotting on the ceph layer.
At some point, you would need VM-aware code (e.g. the VM memory state) and
organically go back to the libvirt + qemu way.
There seems to be qemu + libvirt support for ceph snapshots (via rbd commands) 
which probably offers some (?) VM-awareness, but what are the benefits of not 
using the good old core libvirt + qemu snapshot features?
I must be missing something...

2) Not related:

It seems like oVirt shifted focus towards Ceph recently...

I would like to drop Gluster for Ceph if the latter supports SEEK_HOLE reads
and optimal sparse file operations. Can someone please confirm whether Ceph
supports SEEK_HOLE? I saw some related code, but would like to ask for
comments before setting up and benchmarking Ceph sparse image file operations.


___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel


Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks

2015-06-20 Thread Nir Soffer
- Original Message -
> From: "Christopher Pereira" 
> To: "Nir Soffer" , devel@ovirt.org
> Cc: "Eric Blake" 
> Sent: Saturday, June 20, 2015 9:34:57 AM
> Subject: Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks
> 
> Hi Nir,
> 
> Regarding "3. Engine creates snapshot *via cinder*"...
> 
> What are the benefits of creating snapshots via cinder vs via libvirt?

Ceph provides thin provisioning and snapshots on the server side, which
is more efficient and simpler to use.

Ceph disks use raw format, so we cannot use qcow2-based snapshots.
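
For illustration, this is roughly what happens on the server side when cinder
takes an rbd snapshot: a copy-on-write snapshot is created inside the ceph
cluster, and the raw disk the vm is using is untouched. A sketch using the
python rbd bindings (pool, volume and snapshot names are made up; cinder's
rbd driver does something equivalent internally):

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('volumes')        # cinder volumes pool
        try:
            image = rbd.Image(ioctx, 'volume-1234')
            try:
                image.create_snap('snap-001')        # server side, copy-on-write
            finally:
                image.close()
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()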

> Libvirt and qemu are offering core VM-aware storage and memory snapshot
> features.
> Besides, snapshot-create-as has no VM downtime.

We don't plan to introduce downtime.

> It would be a mistake to implement snapshotting on the ceph layer.
> At some point, you would need VM-aware code (e.g. the VM memory state) and
> organically go back to the libvirt + qemu way.

We will use libvirt to create the memory snapshot, stored on ceph disks instead
of on a vdsm image.

> There seems to be qemu + libvirt support for ceph snapshots (via rbd
> commands) which probably offers some (?) VM-awareness, but what are the
> benefits of not using the good old core libvirt + qemu snapshot features?
> I must be missing something...

We want to support smart storage servers, offloading storage operations
to the server. We also want to leverage the rich ecosystem of cinder,
supported by many storage vendors.

So the engine creates ceph volumes and snapshots via cinder, and vdsm
consumes the volumes via libvirt/qemu network disk support.
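
For reference, consuming such a volume means vdsm hands libvirt a network disk
definition instead of a local path. A sketch of the <disk> element, shown here
as a python string (monitor hosts, pool/volume name, auth username and secret
uuid are placeholders, not a real configuration):

    RBD_DISK_XML = """
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source protocol='rbd' name='volumes/volume-1234'>
        <host name='ceph-mon1.example.com' port='6789'/>
        <host name='ceph-mon2.example.com' port='6789'/>
      </source>
      <auth username='cinder'>
        <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
      </auth>
      <target dev='vdb' bus='virtio'/>
    </disk>
    """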

> 2) Not related:
> 
> It seems like oVirt shifted focus towards Ceph recently...
> 
> I would like to drop Gluster for Ceph if the latter supports SEEK_HOLE
> reads and optimal sparse file operations. Can someone please confirm whether
> Ceph supports SEEK_HOLE? I saw some related code, but would like to ask
> for comments before setting up and benchmarking Ceph sparse image file
> operations.

Ceph provides block storage, and gluster provides file-based storage. We are
focused on providing both options so users can choose what works best.

Nir
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel


Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks

2015-06-21 Thread Daniel Erez


- Original Message -
> From: "Nir Soffer" 
> To: "devel" 
> Cc: "Eric Blake" , "Daniel Erez" , 
> "Francesco Romani" ,
> "Adam Litke" , "Federico Simoncelli" 
> , "Yaniv Dary" 
> Sent: Friday, June 19, 2015 11:40:23 PM
> Subject: [VDSM] Live snapshot with ceph disks
> 
> Hi all,
> 
> For 3.6, we will not support live vm snapshot, but this is a must for the
> next
> release.
> 
> It is trivial to create a disk snapshot in ceph (using cinder apis). The
> snapshot
> is transparent to libvirt, qemu and the guest os.
> 
> However, we want to create a consistent snapshot, so you can revert to the
> disk
> snapshot and get a consistent file system state.
> 
> We also want to create a complete vm snapshot, including all disks and vm
> memory.
> Libvirt and qemu provide that when given a new disk for the active layer,
> but
> when using a ceph disk, we don't change the active layer - we continue to use
> the
> same disk.
> 
> Since 1.2.5, libvirt provides virDomainFSFreeze and virDomainFSThaw:
> https://libvirt.org/hvsupport.html
> 
> So here are the possible flows (ignoring engine-side stuff like locking vms and
> disks)
> 
> Disk snapshot
> -
> 
> 1. Engine invokes VM.freezeFileSystems
> 2. Vdsm invokes libvirt.virDomainFSFreeze
> 3. Engine creates snapshot via cinder
> 4. Engine invokes VM.thawFileSystems
> 5. Vdsm invokes libvirt.virDomainFSThaw
> 
> Vm snapshot
> ---
> 
> 1. Engine invokes VM.freezeFileSystems
> 2. Vdsm invokes libvirt.virDomainFSFreeze
> 3. Engine creates snapshot via cinder
> 4. Engine invokes VM.snapshot
> 5. Vdsm creates snapshot, skipping ceph disks
> 6. Engine invokes VM.thawFileSystems
> 7. Vdsm invokes libvirt.virDomainFSThaw
> 
> API changes
> ---
> 
> New verbs:
> - VM.freezeFileSystems - basically invokes virDomainFSFreeze
> - VM.thawFileSystems - basically invokes virDomainFSThaw
> 
> 
> What do you think?

OpenStack uses two different approaches for live snapshot:
1. When taking a snapshot of an instance, a new image (of the entire
   instance) is created on Glance in qcow2 format - orchestrated by
   Nova and libvirt (Snapshot xml).
2. When taking a snapshot of a single volume while the VM is running
   (i.e. the volume is attached to an instance), the snapshot is taken
   using Cinder with the relevant driver. The following message is displayed
   in Horizon: "This volume is currently attached to an instance.
   In some cases, creating a snapshot from an attached volume can result
   in a corrupted snapshot." (see attached screenshot)
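
For context, option 2 boils down to a single Cinder call with force=True,
since snapshotting an in-use volume is refused by default. A minimal
python-cinderclient sketch (API version, endpoint, credentials and the volume
id are placeholders):

    from cinderclient import client

    cinder = client.Client('2', 'admin', 'password', 'admin',
                           'http://controller:5000/v2.0')

    # force=True allows snapshotting a volume attached to an instance; without
    # freezing the guest file systems the result may not be consistent.
    snap = cinder.volume_snapshots.create('VOLUME-UUID', force=True,
                                          name='live-snap')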

Since the current integration is directly with Cinder and VM snapshot
is handled by oVirt engine, we should go with a variant of option 2...
If it's too late to integrate the new verbs into 3.6, maybe we could just
settle with a similar warning when creating a live snapshot? Or block
the feature for now to avoid possible data inconsistency?

> 
> Nir
> ___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks

2015-06-21 Thread Nir Soffer


- Original Message -
> From: "Daniel Erez" 
> To: "Nir Soffer" 
> Cc: "devel" , "Eric Blake" , "Francesco 
> Romani" , "Adam
> Litke" , "Federico Simoncelli" , 
> "Yaniv Dary" 
> Sent: Sunday, June 21, 2015 10:05:41 AM
> Subject: Re: [VDSM] Live snapshot with ceph disks
> 
> 
> 
> - Original Message -
> > From: "Nir Soffer" 
> > To: "devel" 
> > Cc: "Eric Blake" , "Daniel Erez" ,
> > "Francesco Romani" ,
> > "Adam Litke" , "Federico Simoncelli"
> > , "Yaniv Dary" 
> > Sent: Friday, June 19, 2015 11:40:23 PM
> > Subject: [VDSM] Live snapshot with ceph disks
> > 
> > Hi all,
> > 
> > For 3.6, we will not support live vm snapshot, but this is a must for the
> > next
> > release.
> > 
> > It is trivial to create a disk snapshot in ceph (using cinder apis). The
> > snapshot
> > is transparent to libvirt, qemu and the guest os.
> > 
> > However, we want to create a consistent snapshot, so you can revert to the
> > disk
> > snapshot and get a consistent file system state.
> > 
> > We also want to create a complete vm snapshot, including all disks and vm
> > memory.
> > Libvirt and qemu provide that when given a new disk for the active layer,
> > but
> > when using a ceph disk, we don't change the active layer - we continue to use
> > the
> > same disk.
> > 
> > Since 1.2.5, libvirt provides virDomainFSFreeze and virDomainFSThaw:
> > https://libvirt.org/hvsupport.html
> > 
> > So here are the possible flows (ignoring engine-side stuff like locking vms and
> > disks)
> > 
> > Disk snapshot
> > -
> > 
> > 1. Engine invokes VM.freezeFileSystems
> > 2. Vdsm invokes libvirt.virDomainFSFreeze
> > 3. Engine creates snapshot via cinder
> > 4. Engine invokes VM.thawFileSystems
> > 5. Vdsm invokes libvirt.virDomainFSThaw
> > 
> > Vm snapshot
> > ---
> > 
> > 1. Engine invokes VM.freezeFileSystems
> > 2. Vdsm invokes libvirt.virDomainFSFreeze
> > 3. Engine creates snapshot via cinder
> > 4. Engine invokes VM.snapshot
> > 5. Vdsm creates snapshot, skipping ceph disks
> > 6. Engine invokes VM.thawFileSystems
> > 7. Vdsm invokes libvirt.virDomainFSThaw
> > 
> > API changes
> > ---
> > 
> > New verbs:
> > - VM.freezeFileSystems - basically invokes virDomainFSFreeze
> > - VM.thawFileSystems - basically invokes virDomainFSThaw
> > 
> > 
> > What do you think?
> 
> OpenStack uses two different approaches for live snapshot:
> 1. When taking a snapshot of an instance, a new image (of the entire
>instance) is created on Glance in qcow2 format - orchestrated by
>Nova and libvirt (Snapshot xml).

This does not sound like a solution compatible with snapshots using vdsm images.

> 2. When taking a snapshot of a single volume while the VM is running
>(i.e. the volume is attached to an instance), the snapshot is taken
>using Cinder with the relevant driver. The following message is displayed
>in Horizon: "This volume is currently attached to an instance.
>In some cases, creating a snapshot from an attached volume can result
>in a corrupted snapshot." (see attached screenshot)
> 
> Since the current integration is directly with Cinder and VM snapshot
> is handled by oVirt engine, we should go with a variant of option 2...

We support consistent snapshots with vdsm images, and should support
consistent snapshots with ceph as well.

> If it's too late to integrate the new verbs into 3.6, maybe we could just
> settle with a similar warning when creating a live snapshot? Or block
> the feature for now to avoid possible data inconsistency?

I think we plan live snapshots for the next release, not 3.6, so we can add any
API we need.

I think our goal should be live snapshots similar to what we have with other
storage types. Do we agree on this goal?

Nir
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel


Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks

2015-06-21 Thread Yaniv Dary



On 06/21/2015 11:07 AM, Nir Soffer wrote:


- Original Message -

From: "Daniel Erez" 
To: "Nir Soffer" 
Cc: "devel" , "Eric Blake" , "Francesco Romani" 
, "Adam
Litke" , "Federico Simoncelli" , "Yaniv Dary" 

Sent: Sunday, June 21, 2015 10:05:41 AM
Subject: Re: [VDSM] Live snapshot with ceph disks



- Original Message -

From: "Nir Soffer" 
To: "devel" 
Cc: "Eric Blake" , "Daniel Erez" ,
"Francesco Romani" ,
"Adam Litke" , "Federico Simoncelli"
, "Yaniv Dary" 
Sent: Friday, June 19, 2015 11:40:23 PM
Subject: [VDSM] Live snapshot with ceph disks

Hi all,

For 3.6, we will not support live vm snapshot, but this is a must for the
next
release.

It is trivial to create a disk snapshot in ceph (using cinder apis). The
snapshot
is transparent to libvirt, qemu and the guest os.

However, we want to create a consistent snapshot, so you can revert to the
disk
snapshot and get a consistent file system state.

We also want to create a complete vm snapshot, including all disks and vm
memory.
Libvirt and qemu provide that when given a new disk for the active layer,
but
when using a ceph disk, we don't change the active layer - we continue to use
the
same disk.

Since 1.2.5, libvirt provides virDomainFSFreeze and virDomainFSThaw:
https://libvirt.org/hvsupport.html

So here are the possible flows (ignoring engine-side stuff like locking vms and
disks)

Disk snapshot
-

1. Engine invokes VM.freezeFileSystems
2. Vdsm invokes libvirt.virDomainFSFreeze
3. Engine creates snapshot via cinder
4. Engine invokes VM.thawFileSystems
5. Vdsm invokes libvirt.virDomainFSThaw

Vm snapshot
---

1. Engine invokes VM.freezeFileSystems
2. Vdsm invokes libvirt.virDomainFSFreeze
3. Engine creates snapshot via cinder
4. Engine invokes VM.snapshot
5. Vdsm creates snapshot, skipping ceph disks
6. Engine invokes VM.thawFileSystems
7. Vdsm invokes libvirt.virDomainFSThaw

API changes
---

New verbs:
- VM.freezeFileSystems - basically invokes virDomainFSFreeze
- VM.thawFileSystems - basically invokes virDomainFSThaw


What do you think?

OpenStack uses two different approaches for live snapshot:
1. When taking a snapshot of an instance, a new image (of the entire
instance) is created on Glance in qcow2 format - orchestrated by
Nova and libvirt (Snapshot xml).

This does not sound like a solution compatible with snapshots using vdsm images.


2. When taking a snapshot of a single volume while the VM is running
(i.e. the volume is attached to an instance), the snapshot is taken
using Cinder with the relevant driver. The following message is displayed
in Horizon: "This volume is currently attached to an instance.
In some cases, creating a snapshot from an attached volume can result
in a corrupted snapshot." (see attached screenshot)

Since the current integration is directly with Cinder and VM snapshot
is handled by oVirt engine, we should go with a variant of option 2...

We support consistent snapshots with vdsm images, and should support
consistent snapshots with ceph as well.


If it's too late to integrate the new verbs into 3.6, maybe we could just
settle with a similar warning when creating a live snapshot? Or block
the feature for now to avoid possible data inconsistency?

I think we plan live snapshots for the next release, not 3.6, so we can add any
API we need.

I think our goal should be live snapshots similar to what we have with other
storage types. Do we agree on this goal?


Ack on that; we should remember we can have a VM with both Cinder-managed and
engine-managed disks.




Nir


--
Yaniv Dary
Technical Product Manager
Red Hat Israel Ltd.
34 Jerusalem Road
Building A, 4th floor
Ra'anana, Israel 4350109

Tel : +972 (9) 7692306
  8272306
Email: yd...@redhat.com
IRC : ydary

___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel


Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks

2015-06-25 Thread Michal Skrivanek

On 19 Jun 2015, at 22:40, Nir Soffer wrote:

> Hi all,
> 
> For 3.6, we will not support live vm snapshot, but this is a must for the next
> release.
> 
> It is trivial to create a disk snapshot in ceph (using cinder apis). The 
> snapshot
> is transparent to libvirt, qemu and the guest os.
> 
> However, we want to create a consistent snapshot, so you can revert to the 
> disk
> snapshot and get a consistent file system state.
> 
> We also want to create a complete vm snapshot, including all disks and vm 
> memory.
> Libvirt and qemu provide that when given a new disk for the active layer, but
> when using a ceph disk, we don't change the active layer - we continue to use
> the
> same disk.
> 
> Since 1.2.5, libvirt provides virDomainFSFreeze and virDomainFSThaw:
> https://libvirt.org/hvsupport.html
> 
> So here are the possible flows (ignoring engine-side stuff like locking vms and
> disks)
> 
> Disk snapshot
> -
> 
> 1. Engine invokes VM.freezeFileSystems
> 2. Vdsm invokes libvirt.virDomainFSFreeze
> 3. Engine creates snapshot via cinder
> 4. Engine invokes VM.thawFileSystems
> 5. Vdsm invokes libvirt.virDomainFSThaw
> 
> Vm snapshot
> ---
> 
> 1. Engine invokes VM.freezeFileSystems
> 2. Vdsm invokes libvirt.virDomainFSFreeze
> 3. Engine creates snapshot via cinder
> 4. Engine invokes VM.snapshot
> 5. Vdsm creates snapshot, skipping ceph disks
> 6. Engine invokes VM.thawFileSystems
> 7. Vdsm invokes libvirt.virDomainFSThaw
> 
> API changes
> ---
> 
> New verbs:
> - VM.freezeFileSystems - basically invokes virDomainFSFreeze
> - VM.thawFileSystems - basically invokes virDomainFSThaw

Once we do it explicitly, we can drop the flag from the libvirt API which does
it "atomically" for us right now.
Also note the dependency on a functional qemu-ga (that's no different from
today, but the current behavior is that when qemu-ga is not running we quietly
do an unsafe snapshot).

> 
> 
> What do you think?
> 
> Nir
> ___
> Devel mailing list
> Devel@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel

___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel


Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks

2015-06-25 Thread Nir Soffer
- Original Message -
> From: "Michal Skrivanek" 
> To: "Nir Soffer" 
> Cc: "devel" , "Eric Blake" 
> Sent: Thursday, June 25, 2015 3:44:59 PM
> Subject: Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks
> 
> 
> On 19 Jun 2015, at 22:40, Nir Soffer wrote:
> 
> > Hi all,
> > 
> > For 3.6, we will not support live vm snapshot, but this is a must for the
> > next
> > release.
> > 
> > It is trivial to create a disk snapshot in ceph (using cinder apis). The
> > snapshot
> > is transparent to libvirt, qemu and the guest os.
> > 
> > However, we want to create a consistent snapshot, so you can revert to the
> > disk
> > snapshot and get a consistent file system state.
> > 
> > We also want to create a complete vm snapshot, including all disks and vm
> > memory.
> > Libvirt and qemu provide that when given a new disk for the active layer,
> > but
> > when using a ceph disk, we don't change the active layer - we continue to use
> > the
> > same disk.
> > 
> > Since 1.2.5, libvirt provides virDomainFSFreeze and virDomainFSThaw:
> > https://libvirt.org/hvsupport.html
> > 
> > So here are the possible flows (ignoring engine-side stuff like locking vms and
> > disks)
> > 
> > Disk snapshot
> > -
> > 
> > 1. Engine invokes VM.freezeFileSystems
> > 2. Vdsm invokes libvirt.virDomainFSFreeze
> > 3. Engine creates snapshot via cinder
> > 4. Engine invokes VM.thawFileSystems
> > 5. Vdsm invokes libvirt.virDomainFSThaw
> > 
> > Vm snapshot
> > ---
> > 
> > 1. Engine invokes VM.freezeFileSystems
> > 2. Vdsm invokes libvirt.virDomainFSFreeze
> > 3. Engine creates snapshot via cinder
> > 4. Engine invokes VM.snapshot
> > 5. Vdsm creates snapshot, skipping ceph disks
> > 6. Engine invokes VM.thawFileSystems
> > 7. Vdsm invokes libvirt.virDomainFSThaw
> > 
> > API changes
> > ---
> > 
> > New verbs:
> > - VM.freezeFileSystems - basically invokes virDomainFSFreeze
> > - VM.thawFileSystems - basically invokes virDomainFSThaw
> 
> Once we do it explicitly, we can drop the flag from the libvirt API which does
> it "atomically" for us right now.

We plan to add a "frozen" key in the snapshot description. If true, it means
that the engine froze the file systems on the guest, and vdsm will skip the
VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE flag.
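
A minimal sketch of how vdsm could pick the flags for a disk-only snapshot.
The "frozen" parameter and the snapshot xml handling are simplified, and the
REUSE_EXT/NO_METADATA flags are shown only as an assumption about the
surrounding code:

    import libvirt

    def create_disk_snapshot(dom, snapshot_xml, frozen):
        flags = (libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY |
                 libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT |
                 libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_NO_METADATA)
        if not frozen:
            # The engine did not freeze the guest, so let libvirt use qemu-ga
            # to quiesce the file systems around the snapshot.
            flags |= libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE
        return dom.snapshotCreateXML(snapshot_xml, flags)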

> Also note the dependency on a functional qemu-ga (that's no different from
> today, but the current behavior is that when qemu-ga is not running we
> quietly do an unsafe snapshot).

We plan to keep this behavior and hopefully make it better, depending on
how libvirt communicates this condition.

The current code assumes that *any* exception calling snapshotCreateXML()
means the guest is not running :-)
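
A hedged sketch of what "making it better" could look like: distinguish agent
problems from other failures by the libvirt error code instead of falling back
on any exception. The error-code names come from the libvirt python bindings;
which codes are actually worth handling this way is an open question:

    import libvirt

    def snapshot_with_quiesce(dom, snapshot_xml, flags):
        try:
            return dom.snapshotCreateXML(
                snapshot_xml,
                flags | libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE)
        except libvirt.libvirtError as e:
            code = e.get_error_code()
            if code in (libvirt.VIR_ERR_AGENT_UNRESPONSIVE,
                        libvirt.VIR_ERR_OPERATION_TIMEOUT):
                # qemu-ga is missing or stuck: fall back to an unquiesced
                # snapshot instead of treating every error the same way.
                return dom.snapshotCreateXML(snapshot_xml, flags)
            raise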

Nir
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel