Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks
On 19 Jun 2015, at 22:40, Nir Soffer wrote:

> Hi all,
>
> For 3.6, we will not support live VM snapshot, but this is a must for the
> next release.
>
> It is trivial to create a disk snapshot in Ceph (using Cinder APIs). The
> snapshot is transparent to libvirt, qemu and the guest OS. However, we want
> to create a consistent snapshot, so you can revert to the disk snapshot and
> get a consistent file system state. We also want to create a complete VM
> snapshot, including all disks and VM memory.
>
> Libvirt and qemu provide that when given a new disk for the active layer,
> but when using a Ceph disk we don't change the active layer - we continue
> to use the same disk.
>
> Since 1.2.5, libvirt provides virDomainFSFreeze and virDomainFSThaw:
> https://libvirt.org/hvsupport.html
>
> So here are the possible flows (ignoring engine-side stuff like locking VMs
> and disks):
>
> Disk snapshot
> -------------
> 1. Engine invokes VM.freezeFileSystems
> 2. Vdsm invokes libvirt.virDomainFSFreeze
> 3. Engine creates snapshot via Cinder
> 4. Engine invokes VM.thawFileSystems
> 5. Vdsm invokes libvirt.virDomainFSThaw
>
> VM snapshot
> -----------
> 1. Engine invokes VM.freezeFileSystems
> 2. Vdsm invokes libvirt.virDomainFSFreeze
> 3. Engine creates snapshot via Cinder
> 4. Engine invokes VM.snapshot
> 5. Vdsm creates snapshot, skipping Ceph disks
> 6. Engine invokes VM.thawFileSystems
> 7. Vdsm invokes libvirt.virDomainFSThaw
>
> API changes
> -----------
> New verbs:
> - VM.freezeFileSystems - basically invokes virDomainFSFreeze
> - VM.thawFileSystems - basically invokes virDomainFSThaw

Once we do it explicitly, we can drop the flag from the libvirt API which
does it atomically for us right now.

Also note the dependency on a functional qemu-ga. That's no different from
today, but the current behavior is that when qemu-ga is not running we
quietly do an unsafe snapshot.

> What do you think?
>
> Nir
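For illustration, a minimal sketch (in Python) of what the explicit
disk-snapshot flow could look like from the engine's point of view.
vdsm.freezeFileSystems / vdsm.thawFileSystems stand for the proposed, not yet
existing, Vdsm verbs, and cinder for a python-cinderclient client, so this
only shows the intended ordering, in particular that the thaw must happen
even if the Cinder call fails:

    def snapshot_ceph_disk(vdsm, cinder, vm_id, volume_id, snap_name):
        # 1-2. Freeze guest file systems (Vdsm would call virDomainFSFreeze,
        # which goes through qemu-ga inside the guest).
        vdsm.freezeFileSystems(vm_id)
        try:
            # 3. Take the disk snapshot on the Ceph side through Cinder.
            # force=True is needed because the volume is attached to a
            # running VM.
            return cinder.volume_snapshots.create(
                volume_id, force=True, name=snap_name)
        finally:
            # 4-5. Always thaw, even on failure, so the guest file systems
            # are not left frozen.
            vdsm.thawFileSystems(vm_id)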
Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks
----- Original Message -----
From: Michal Skrivanek <michal.skriva...@redhat.com>
To: Nir Soffer <nsof...@redhat.com>
Cc: devel <devel@ovirt.org>, Eric Blake <ebl...@redhat.com>
Sent: Thursday, June 25, 2015 3:44:59 PM
Subject: Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks

> On 19 Jun 2015, at 22:40, Nir Soffer wrote:
>
>> Hi all,
>>
>> For 3.6, we will not support live VM snapshot, but this is a must for
>> the next release.
>>
>> [...]
>>
>> API changes
>> -----------
>> New verbs:
>> - VM.freezeFileSystems - basically invokes virDomainFSFreeze
>> - VM.thawFileSystems - basically invokes virDomainFSThaw
>
> Once we do it explicitly, we can drop the flag from the libvirt API which
> does it atomically for us right now.

We plan to add a "frozen" key in the snapshot description. If true, it means
that engine froze the file systems on the guest, and Vdsm will skip the
VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE flag.

> Also note the dependency on a functional qemu-ga. That's no different from
> today, but the current behavior is that when qemu-ga is not running we
> quietly do an unsafe snapshot.

We plan to keep this behavior, and hopefully make it better, depending on
how libvirt communicates this condition. Current code assumes that *any*
exception from snapshotCreateXML() means the guest is not running :-)

Nir
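To make the "frozen" hint concrete, here is a rough sketch of how the Vdsm
snapshot code could pick the libvirt flags. The function, the frozen
parameter and the GuestAgentUnresponsive class are hypothetical; the libvirt
flag, call and error code are real libvirt-python names, and any other flags
needed for the disk/memory snapshot itself are omitted:

    import libvirt

    class GuestAgentUnresponsive(Exception):
        pass

    def create_live_snapshot(dom, snapshot_xml, frozen):
        # "frozen" is the proposed hint from engine: True means engine
        # already invoked VM.freezeFileSystems, so the guest file systems
        # are quiesced and libvirt must not try to quiesce them again.
        flags = 0
        if not frozen:
            # Current behavior: let libvirt freeze/thaw atomically around
            # the snapshot via qemu-ga.
            flags |= libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE
        try:
            return dom.snapshotCreateXML(snapshot_xml, flags)
        except libvirt.libvirtError as e:
            # Instead of assuming any failure means the agent is down,
            # inspect the error code and report a better reason.
            if e.get_error_code() == libvirt.VIR_ERR_AGENT_UNRESPONSIVE:
                raise GuestAgentUnresponsive()
            raise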
Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks
----- Original Message -----
From: Daniel Erez <de...@redhat.com>
To: Nir Soffer <nsof...@redhat.com>
Cc: devel <devel@ovirt.org>, Eric Blake <ebl...@redhat.com>, Francesco Romani <from...@redhat.com>, Adam Litke <ali...@redhat.com>, Federico Simoncelli <fsimo...@redhat.com>, Yaniv Dary <yd...@redhat.com>
Sent: Sunday, June 21, 2015 10:05:41 AM
Subject: Re: [VDSM] Live snapshot with ceph disks

>> [...]
>>
>> What do you think?
>
> OpenStack uses two different approaches for live snapshot:
>
> 1. When taking a snapshot of an instance, a new image (of the entire
>    instance) is created on Glance in qcow2 format - orchestrated by Nova
>    and libvirt (snapshot XML).

This does not sound like a solution compatible with snapshots of Vdsm
images.

> 2. When taking a snapshot of a single volume while the VM is running
>    (i.e. the volume is attached to an instance), the snapshot is taken
>    using Cinder with the relevant driver. The following message is
>    displayed in Horizon: "This volume is currently attached to an
>    instance. In some cases, creating a snapshot from an attached volume
>    can result in a corrupted snapshot." (see attached screenshot)
>
> Since the current integration is directly with Cinder and VM snapshot is
> handled by oVirt engine, we should go with a variant of option 2...

We support consistent snapshots with Vdsm images, and should support
consistent snapshots with Ceph as well.

> If it's too late to integrate the new verbs into 3.6, maybe we could just
> settle with a similar warning when creating a live snapshot? Or block the
> feature for now to avoid possible data inconsistency?

I think we plan live snapshot for the next release, not 3.6, so we can add
any API we need.

I think our goal should be a live snapshot similar to what we have with
other storage types. Do we agree on this goal?
Nir
Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks
On 06/21/2015 11:07 AM, Nir Soffer wrote:

> [...]
>
> I think our goal should be a live snapshot similar to what we have with
> other storage types. Do we agree on this goal?

Ack on that. We should remember we can have a VM with both Cinder and
engine-managed disks.

> Nir

--
Yaniv Dary
Technical Product Manager
Red Hat Israel Ltd.
34 Jerusalem Road
Building A, 4th floor
Ra'anana, Israel 4350109

Tel : +972 (9) 7692306 8272306
Email: yd...@redhat.com
IRC : ydary
Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks
----- Original Message -----
From: Nir Soffer <nsof...@redhat.com>
To: devel <devel@ovirt.org>
Cc: Eric Blake <ebl...@redhat.com>, Daniel Erez <de...@redhat.com>, Francesco Romani <from...@redhat.com>, Adam Litke <ali...@redhat.com>, Federico Simoncelli <fsimo...@redhat.com>, Yaniv Dary <yd...@redhat.com>
Sent: Friday, June 19, 2015 11:40:23 PM
Subject: [VDSM] Live snapshot with ceph disks

> Hi all,
>
> [...]
>
> API changes
> -----------
> New verbs:
> - VM.freezeFileSystems - basically invokes virDomainFSFreeze
> - VM.thawFileSystems - basically invokes virDomainFSThaw
>
> What do you think?
>
> Nir

OpenStack uses two different approaches for live snapshot:

1. When taking a snapshot of an instance, a new image (of the entire
   instance) is created on Glance in qcow2 format - orchestrated by Nova
   and libvirt (snapshot XML).

2. When taking a snapshot of a single volume while the VM is running
   (i.e. the volume is attached to an instance), the snapshot is taken
   using Cinder with the relevant driver. The following message is
   displayed in Horizon: "This volume is currently attached to an
   instance. In some cases, creating a snapshot from an attached volume
   can result in a corrupted snapshot." (see attached screenshot)

Since the current integration is directly with Cinder and VM snapshot is
handled by oVirt engine, we should go with a variant of option 2...

If it's too late to integrate the new verbs into 3.6, maybe we could just
settle with a similar warning when creating a live snapshot? Or block the
feature for now to avoid possible data inconsistency?
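For reference, the Cinder call behind option 2 looks roughly like this with
python-cinderclient; the credentials, endpoint and volume id below are
placeholders:

    from cinderclient import client

    # Placeholder credentials/endpoint - in oVirt these would come from the
    # Cinder external provider configuration.
    cinder = client.Client('2', 'user', 'password', 'project',
                           'http://keystone.example.com:5000/v2.0')

    # force=True is required because the volume is attached to a running
    # instance - exactly the case the Horizon warning above is about.
    snap = cinder.volume_snapshots.create(
        'volume-id-placeholder',
        force=True,
        name='ovirt-live-snapshot')
    print(snap.id, snap.status)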
Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks
----- Original Message -----
From: Christopher Pereira <krip...@imatronix.cl>
To: Nir Soffer <nsof...@redhat.com>, devel@ovirt.org
Cc: Eric Blake <ebl...@redhat.com>
Sent: Saturday, June 20, 2015 9:34:57 AM
Subject: Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks

> Hi Nir,
>
> Regarding "3. Engine creates snapshot *via cinder*"...
> What are the benefits of creating snapshots via Cinder vs via libvirt?

Ceph provides thin provisioning and snapshots on the server side, which is
more efficient and simpler to use. Ceph disks use raw format, so we cannot
use qcow-based snapshots.

> Libvirt and qemu are offering core VM-aware storage and memory snapshot
> features. Besides, snapshot-create-as has no VM downtime.

We don't plan to introduce downtime.

> It would be a mistake to implement snapshotting on the Ceph layer. At some
> point, you would need VM-aware code (e.g. the VM memory state) and
> organically go back to the libvirt + qemu way.

We will use libvirt to create the memory snapshot, stored in a Ceph disk
instead of a Vdsm image.

> There seems to be qemu + libvirt support for Ceph snapshots (via rbd
> commands) which probably offers some (?) VM-awareness, but what are the
> benefits of not using the good old core libvirt + qemu snapshot features?
> I must be missing something...

We want to support smart storage servers, offloading storage operations to
the server. We also want to leverage the rich ecosystem of Cinder, supported
by many storage vendors.

So engine is creating Ceph volumes and snapshots via Cinder, and Vdsm is
consuming the volumes via libvirt/qemu network disk support.

> 2) Not related: It seems like oVirt shifted focus towards Ceph recently...
> I would like to drop Gluster for Ceph if the latter supports SEEK_HOLE
> reading and optimal sparse file operations.
> Can someone please confirm if Ceph supports SEEK_HOLE? I saw some related
> code, but would like to ask for comments before setting up and
> benchmarking Ceph sparse image file operations.

Ceph provides block storage, and Gluster provides file-based storage. We are
focused on providing both options so users can choose what works best.

Nir
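To illustrate the last point (consuming the volumes via libvirt/qemu network
disk support): the disk element handed to libvirt for a Ceph volume is a
network disk using the rbd protocol, roughly as in the sketch below. The
pool/image name, monitor host, cephx user and secret UUID are placeholders,
and the exact XML Vdsm generates may differ:

    # Hypothetical helper producing the libvirt <disk> element for a
    # Ceph/RBD volume; all concrete values are placeholders.
    RBD_DISK_XML = """\
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source protocol='rbd' name='{pool}/{image}'>
        <host name='{mon_host}' port='6789'/>
      </source>
      <auth username='{cephx_user}'>
        <secret type='ceph' uuid='{secret_uuid}'/>
      </auth>
      <target dev='{dev}' bus='virtio'/>
    </disk>
    """

    def rbd_disk_xml(pool, image, mon_host, cephx_user, secret_uuid, dev='vdb'):
        return RBD_DISK_XML.format(pool=pool, image=image, mon_host=mon_host,
                                   cephx_user=cephx_user,
                                   secret_uuid=secret_uuid, dev=dev)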
Re: [ovirt-devel] [VDSM] Live snapshot with ceph disks
Hi Nir,

Regarding "3. Engine creates snapshot *via cinder*"...
What are the benefits of creating snapshots via Cinder vs via libvirt?

Libvirt and qemu are offering core VM-aware storage and memory snapshot
features. Besides, snapshot-create-as has no VM downtime.

It would be a mistake to implement snapshotting on the Ceph layer. At some
point, you would need VM-aware code (e.g. the VM memory state) and
organically go back to the libvirt + qemu way.

There seems to be qemu + libvirt support for Ceph snapshots (via rbd
commands) which probably offers some (?) VM-awareness, but what are the
benefits of not using the good old core libvirt + qemu snapshot features?
I must be missing something...

2) Not related: It seems like oVirt shifted focus towards Ceph recently...
I would like to drop Gluster for Ceph if the latter supports SEEK_HOLE
reading and optimal sparse file operations.

Can someone please confirm if Ceph supports SEEK_HOLE? I saw some related
code, but would like to ask for comments before setting up and benchmarking
Ceph sparse image file operations.
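As a side note on the SEEK_HOLE question: support can be probed directly on
any mounted file system (for example a CephFS or Gluster mount) with a small
sparse-file test. This is a generic POSIX check rather than anything
Ceph-specific, requires Python 3.3+, and the mount point is a placeholder:

    import os

    def supports_seek_hole(mount_point):
        # Create a fully sparse test file and see whether SEEK_HOLE reports
        # the hole at offset 0. File systems without hole reporting fall
        # back to returning end-of-file, so this is a heuristic, not a
        # proof.
        test_file = os.path.join(mount_point, '.seek_hole_probe')
        fd = os.open(test_file, os.O_CREAT | os.O_RDWR | os.O_TRUNC)
        try:
            os.ftruncate(fd, 16 * 1024 * 1024)  # 16 MiB hole, no data
            return os.lseek(fd, 0, os.SEEK_HOLE) == 0
        finally:
            os.close(fd)
            os.unlink(test_file)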
[ovirt-devel] [VDSM] Live snapshot with ceph disks
Hi all,

For 3.6, we will not support live VM snapshot, but this is a must for the
next release.

It is trivial to create a disk snapshot in Ceph (using Cinder APIs). The
snapshot is transparent to libvirt, qemu and the guest OS. However, we want
to create a consistent snapshot, so you can revert to the disk snapshot and
get a consistent file system state. We also want to create a complete VM
snapshot, including all disks and VM memory.

Libvirt and qemu provide that when given a new disk for the active layer,
but when using a Ceph disk we don't change the active layer - we continue
to use the same disk.

Since 1.2.5, libvirt provides virDomainFSFreeze and virDomainFSThaw:
https://libvirt.org/hvsupport.html

So here are the possible flows (ignoring engine-side stuff like locking VMs
and disks):

Disk snapshot
-------------
1. Engine invokes VM.freezeFileSystems
2. Vdsm invokes libvirt.virDomainFSFreeze
3. Engine creates snapshot via Cinder
4. Engine invokes VM.thawFileSystems
5. Vdsm invokes libvirt.virDomainFSThaw

VM snapshot
-----------
1. Engine invokes VM.freezeFileSystems
2. Vdsm invokes libvirt.virDomainFSFreeze
3. Engine creates snapshot via Cinder
4. Engine invokes VM.snapshot
5. Vdsm creates snapshot, skipping Ceph disks
6. Engine invokes VM.thawFileSystems
7. Vdsm invokes libvirt.virDomainFSThaw

API changes
-----------
New verbs:
- VM.freezeFileSystems - basically invokes virDomainFSFreeze
- VM.thawFileSystems - basically invokes virDomainFSThaw

What do you think?

Nir
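A minimal sketch of what the two new verbs might do on the Vdsm side using
the libvirt-python bindings. The class is hypothetical and error handling
and API response marshalling are left out, but fsFreeze()/fsThaw() are the
actual bindings of virDomainFSFreeze/virDomainFSThaw available since
libvirt 1.2.5:

    import libvirt

    class VmFreezeThaw(object):
        # Hypothetical mixin for Vdsm's Vm object implementing the new verbs.

        def __init__(self, dom):
            self._dom = dom  # libvirt.virDomain of the running VM

        def freezeFileSystems(self):
            # virDomainFSFreeze: ask qemu-ga to freeze (quiesce) all mounted
            # guest file systems; returns the number of frozen file systems.
            return self._dom.fsFreeze()

        def thawFileSystems(self):
            # virDomainFSThaw: ask qemu-ga to thaw the file systems again.
            return self._dom.fsThaw()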