Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM
> > Assuming everything is on LVM including the root filesystem, only moving
> > the boot partition will have to be done outside of LVM.
>
> Since the OP mentioned MS Exchange, I assume the VM is running Windows.
> You can do the same LVM-like trick in Windows Server via Disk Manager
> though: add the new Ceph RBD disk to the existing data volume as a
> mirror, wait for it to sync, then break the mirror and remove the
> original disk.

Mirrors only work on dynamic disks, which are a pain to revert and cause lots of problems with backup solutions.
I will keep this in mind, as it is still better than shutting down the whole VM.

@all
Thank you very much for your input. I will try some less important VMs first and then start the migration of the big one.

Kind regards
Kevin
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM
On 11.12.2018 12:59, Kevin Olbrich wrote:
> Hi!
>
> Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes and 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous cluster (which already holds lots of images).
> The server has access to both local and cluster storage; I only need to live migrate the storage, not the machine.
>
> I have never used live migration as it can cause more issues, and the VMs that are already migrated had planned downtime.
> Taking the VM offline and converting/importing using qemu-img would take some hours, but I would like to still serve clients, even if it is slower.
>
> The VM is I/O-heavy in terms of the old storage (LSI/Adaptec with BBU). There are two HDDs bound as RAID1 which are constantly under 30% - 60% load (this goes up to 100% during reboot, updates or login prime-time).
>
> What happens when either the local compute node or the Ceph cluster fails (degraded)? Or the network is unavailable?
> Are all writes performed to both locations? Is this fail-safe? Or does the VM crash in the worst case, which can lead to a dirty shutdown for MS-EX DBs?

The disk stays on the source location until the migration is finalized.

If the local compute node crashes and the VM dies with it before the migration is done, the disk is still on the source location, as expected.
If nodes in the Ceph cluster die but the cluster stays operational, Ceph just self-heals and the migration finishes.
If the cluster dies hard enough to actually break, the migration will time out and abort, and the disk remains on the source location.
If the network is unavailable, the transfer will also time out.

Good luck
Ronny Aasen
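The failure behaviour described above matches how a QEMU block mirror job works: the source image stays authoritative until the job is explicitly completed. As a minimal sketch (not necessarily what Proxmox or any other front end sends), the following builds the QMP commands for such a job. The device name, the RBD pool/image name and the job id are hypothetical placeholders, and `mode: existing` assumes the target RBD image has already been created with the right size.

```python
import json


def mirror_commands(device, target, job_id="storage-migration"):
    """Build QMP commands for a full block mirror to a new target.

    device and target are placeholders chosen for illustration; they are
    assumptions, not values taken from the thread.
    """
    start = {
        "execute": "drive-mirror",
        "arguments": {
            "job-id": job_id,
            "device": device,
            "target": target,
            "format": "raw",
            "sync": "full",              # copy the whole disk, then track new writes
            "mode": "existing",          # assumes the target image was pre-created
            "on-target-error": "report", # a target failure fails the job, not the guest
        },
    }
    # Pivot the guest to the target once the job reports it is ready.
    finish = {"execute": "block-job-complete", "arguments": {"device": job_id}}
    # Abort instead: the guest simply keeps running on the source image.
    abort = {"execute": "block-job-cancel", "arguments": {"device": job_id}}
    return start, finish, abort


if __name__ == "__main__":
    for cmd in mirror_commands("drive-virtio-disk0", "rbd:vms/exchange-disk0"):
        print(json.dumps(cmd))
```

Each printed line is one JSON command that could be sent over the VM's QMP socket after the usual `qmp_capabilities` handshake. Until `block-job-complete` is issued, cancelling or losing the job leaves the guest on its original disk.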
Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM
On 11.12.2018 17:39, Lionel Bouton wrote:
> On 11/12/2018 at 15:51, Konstantin Shalygin wrote:
>>> Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes and 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous cluster (which already holds lots of images).
>>> The server has access to both local and cluster storage; I only need to live migrate the storage, not the machine.
>>>
>>> I have never used live migration as it can cause more issues, and the VMs that are already migrated had planned downtime.
>>> Taking the VM offline and converting/importing using qemu-img would take some hours, but I would like to still serve clients, even if it is slower.
>>>
>>> The VM is I/O-heavy in terms of the old storage (LSI/Adaptec with BBU). There are two HDDs bound as RAID1 which are constantly under 30% - 60% load (this goes up to 100% during reboot, updates or login prime-time).
>>>
>>> What happens when either the local compute node or the Ceph cluster fails (degraded)? Or the network is unavailable?
>>> Are all writes performed to both locations? Is this fail-safe? Or does the VM crash in the worst case, which can lead to a dirty shutdown for MS-EX DBs?
>>>
>>> The node currently has 4GB free RAM and 29GB listed as cache / available. These numbers need caution because we have "tuned" enabled, which causes de-duplication on RAM, and this host runs about 10 Windows VMs.
>>> During reboots or updates, RAM can get full again.
>>>
>>> Maybe I am too cautious about live storage migration, maybe I am not.
>>>
>>> What are your experiences or advice?
>>>
>>> Thank you very much!
>>
>> I have read your message twice and still can't figure out what your question is.
>>
>> You need to move your block image from some storage to Ceph? No, you can't do this without downtime because of filesystem consistency.
>>
>> You can easily migrate your filesystem via rsync, for example, with a short downtime to reboot the VM.
>
> I believe OP is trying to use the storage migration feature of QEMU.
> I've never tried it and I wouldn't recommend it (it is probably not very well tested and there is a large window for failure).

I use the QEMU storage migration feature via the Proxmox web UI several times a day and have never had any issues.
I regularly migrate between Ceph RBD, local directories, shared LVM over Fibre Channel, and NFS servers. It is super easy and convenient.

Ronny Aasen
Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM
We use QEMU storage migration regularly via Proxmox.
It works fine; you can go ahead.

On 12/11/2018 05:39 PM, Lionel Bouton wrote:
>
> I believe OP is trying to use the storage migration feature of QEMU.
> I've never tried it and I wouldn't recommend it (it is probably not very
> well tested and there is a large window for failure).
>
> One tactic that can be used, assuming OP is using LVM in the VM for
> storage, is to add a Ceph volume to the VM (this probably needs a reboot),
> add the corresponding virtual disk to the VM's volume group, and then
> migrate all data from the logical volume(s) to the new disk. LVM uses
> mirroring internally during the transfer, so you get robustness by using
> it. It can be slow (especially with old kernels) but at least it is
> safe. I did a DRBD-to-Ceph migration with this process 5 years ago.
> When all logical volumes have been moved to the new disk, you can remove
> the old disk from the volume group.
>
> Assuming everything is on LVM including the root filesystem, only moving
> the boot partition will have to be done outside of LVM.
>
> Best regards,
>
> Lionel
Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM
On 12/11/2018 10:39 AM, Lionel Bouton wrote:
> On 11/12/2018 at 15:51, Konstantin Shalygin wrote:
>>> Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes and 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous cluster (which already holds lots of images).
>>> The server has access to both local and cluster storage; I only need to live migrate the storage, not the machine.
>>>
>>> I have never used live migration as it can cause more issues, and the VMs that are already migrated had planned downtime.
>>> Taking the VM offline and converting/importing using qemu-img would take some hours, but I would like to still serve clients, even if it is slower.
>
> I believe OP is trying to use the storage migration feature of QEMU.
> I've never tried it and I wouldn't recommend it (it is probably not very well tested and there is a large window for failure).
>
> One tactic that can be used, assuming OP is using LVM in the VM for storage, is to add a Ceph volume to the VM (this probably needs a reboot), add the corresponding virtual disk to the VM's volume group, and then migrate all data from the logical volume(s) to the new disk. LVM uses mirroring internally during the transfer, so you get robustness by using it. It can be slow (especially with old kernels) but at least it is safe. I did a DRBD-to-Ceph migration with this process 5 years ago. When all logical volumes have been moved to the new disk, you can remove the old disk from the volume group.
>
> Assuming everything is on LVM including the root filesystem, only moving the boot partition will have to be done outside of LVM.

Since the OP mentioned MS Exchange, I assume the VM is running Windows. You can do the same LVM-like trick in Windows Server via Disk Manager though: add the new Ceph RBD disk to the existing data volume as a mirror, wait for it to sync, then break the mirror and remove the original disk.

--
Graham Allan
Minnesota Supercomputing Institute - g...@umn.edu
Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM
On 11/12/2018 at 15:51, Konstantin Shalygin wrote:
>
>> Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes
>> and 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous
>> cluster (which already holds lots of images).
>> The server has access to both local and cluster storage; I only need
>> to live migrate the storage, not the machine.
>>
>> I have never used live migration as it can cause more issues, and the
>> VMs that are already migrated had planned downtime.
>> Taking the VM offline and converting/importing using qemu-img would take
>> some hours, but I would like to still serve clients, even if it is
>> slower.
>>
>> The VM is I/O-heavy in terms of the old storage (LSI/Adaptec with
>> BBU). There are two HDDs bound as RAID1 which are constantly under 30%
>> - 60% load (this goes up to 100% during reboot, updates or login
>> prime-time).
>>
>> What happens when either the local compute node or the Ceph cluster
>> fails (degraded)? Or the network is unavailable?
>> Are all writes performed to both locations? Is this fail-safe? Or does
>> the VM crash in the worst case, which can lead to a dirty shutdown for
>> MS-EX DBs?
>>
>> The node currently has 4GB free RAM and 29GB listed as cache /
>> available. These numbers need caution because we have "tuned" enabled,
>> which causes de-duplication on RAM, and this host runs about 10 Windows
>> VMs.
>> During reboots or updates, RAM can get full again.
>>
>> Maybe I am too cautious about live storage migration, maybe I am not.
>>
>> What are your experiences or advice?
>>
>> Thank you very much!
>
> I have read your message twice and still can't figure out what your
> question is.
>
> You need to move your block image from some storage to Ceph? No, you
> can't do this without downtime because of filesystem consistency.
>
> You can easily migrate your filesystem via rsync, for example, with a
> short downtime to reboot the VM.
>

I believe OP is trying to use the storage migration feature of QEMU.
I've never tried it and I wouldn't recommend it (it is probably not very well tested and there is a large window for failure).

One tactic that can be used, assuming OP is using LVM in the VM for storage, is to add a Ceph volume to the VM (this probably needs a reboot), add the corresponding virtual disk to the VM's volume group, and then migrate all data from the logical volume(s) to the new disk. LVM uses mirroring internally during the transfer, so you get robustness by using it. It can be slow (especially with old kernels) but at least it is safe. I did a DRBD-to-Ceph migration with this process 5 years ago. When all logical volumes have been moved to the new disk, you can remove the old disk from the volume group.

Assuming everything is on LVM including the root filesystem, only moving the boot partition will have to be done outside of LVM.

Best regards,

Lionel
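The LVM tactic described above boils down to four standard LVM commands run inside the guest. A minimal dry-run sketch follows; it only prints the commands rather than executing them, and the volume group name `vg0` and the device paths `/dev/vda2` (old disk) and `/dev/vdb` (new Ceph-backed disk) are hypothetical and must be replaced with the real ones.

```python
def lvm_migration_plan(vg, old_pv, new_pv):
    """Return the LVM commands, in order, to move a volume group to a new disk.

    vg, old_pv and new_pv are illustrative assumptions, not values from
    the thread.
    """
    return [
        ["pvcreate", new_pv],        # initialise the new disk as a physical volume
        ["vgextend", vg, new_pv],    # add it to the existing volume group
        ["pvmove", old_pv, new_pv],  # move all extents online, mirrored during the copy
        ["vgreduce", vg, old_pv],    # drop the now-empty old disk from the group
    ]


if __name__ == "__main__":
    # Dry run: print the plan so it can be reviewed before running anything as root.
    for cmd in lvm_migration_plan("vg0", "/dev/vda2", "/dev/vdb"):
        print(" ".join(cmd))
```

Printing instead of executing is deliberate: `pvmove` on a 900GB volume runs for a long time, so each step is better reviewed and run by hand.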
Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM
> Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes and 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous cluster (which already holds lots of images).
> The server has access to both local and cluster storage; I only need to live migrate the storage, not the machine.
>
> I have never used live migration as it can cause more issues, and the VMs that are already migrated had planned downtime.
> Taking the VM offline and converting/importing using qemu-img would take some hours, but I would like to still serve clients, even if it is slower.
>
> The VM is I/O-heavy in terms of the old storage (LSI/Adaptec with BBU). There are two HDDs bound as RAID1 which are constantly under 30% - 60% load (this goes up to 100% during reboot, updates or login prime-time).
>
> What happens when either the local compute node or the Ceph cluster fails (degraded)? Or the network is unavailable?
> Are all writes performed to both locations? Is this fail-safe? Or does the VM crash in the worst case, which can lead to a dirty shutdown for MS-EX DBs?
>
> The node currently has 4GB free RAM and 29GB listed as cache / available. These numbers need caution because we have "tuned" enabled, which causes de-duplication on RAM, and this host runs about 10 Windows VMs.
> During reboots or updates, RAM can get full again.
>
> Maybe I am too cautious about live storage migration, maybe I am not.
>
> What are your experiences or advice?
>
> Thank you very much!

I have read your message twice and still can't figure out what your question is.

You need to move your block image from some storage to Ceph? No, you can't do this without downtime because of filesystem consistency.

You can easily migrate your filesystem via rsync, for example, with a short downtime to reboot the VM.

k
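The original post estimates that an offline convert/import "would take some hours". A rough back-of-the-envelope check of that figure is sketched below. The throughput values are pure assumptions chosen for illustration (a RAID1 pair of HDDs that is already busy), not measurements from this system.

```python
def transfer_hours(size_gb, throughput_mb_s):
    """Estimate copy time in hours for size_gb gigabytes at a sustained rate.

    Uses decimal units (1 GB = 1000 MB); throughput_mb_s is an assumed
    sustained read rate, not a measured one.
    """
    seconds = size_gb * 1000 / throughput_mb_s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical sustained read rates for the old RAID1 under load.
    for rate in (50, 100, 150):
        print(f"{rate} MB/s: {transfer_hours(900, rate):.1f} h")
```

With these assumed rates the 900GB database alone lands in the range of a few hours, which is consistent with the estimate in the original post.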