Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM

2018-12-11 Thread Kevin Olbrich
> > Assuming everything is on LVM including the root filesystem, only moving
> > the boot partition will have to be done outside of LVM.
>
> Since the OP mentioned MS Exchange, I assume the VM is running Windows.
> You can do the same LVM-like trick in Windows Server via Disk Manager
> though: add the new Ceph RBD disk to the existing data volume as a
> mirror, wait for it to sync, then break the mirror and remove the
> original disk.

Mirrors only work on dynamic disks, which are a pain to revert and
cause lots of problems with backup solutions.
I will keep this in mind, as it is still better than shutting down
the whole VM.

@all
Thank you very much for your input. I will try some less important
VMs first and then start the migration of the big one.

Kind regards
Kevin


Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM

2018-12-11 Thread Ronny Aasen

On 11.12.2018 12:59, Kevin Olbrich wrote:

Hi!

Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes
and a 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous
cluster (which already holds lots of images).
The server has access to both local and cluster storage; I only need
to live migrate the storage, not the machine.

I have never used live migration, as it can cause more issues, and the
VMs that have already been migrated had planned downtime.
Taking the VM offline and converting/importing it with qemu-img would
take some hours, but I would like to keep serving clients, even if it
is slower.

The VM is I/O-heavy on the old storage (LSI/Adaptec with BBU). There
are two HDDs bound in RAID1 which are constantly under 30-60% load
(this goes up to 100% during reboots, updates or login prime time).

What happens when either the local compute node or the Ceph cluster
fails (degraded)? Or the network is unavailable?
Are all writes performed to both locations? Is this fail-safe? Or does
the VM crash in the worst case, which can lead to a dirty shutdown of
the MS-EX DBs?


The disk stays on the source location until the migration is finalized.
If the local compute node crashes and the VM dies with it before the
migration is done, the disk is on the source location as expected. If
nodes in the Ceph cluster die but the cluster stays operational, Ceph
just self-heals and the migration finishes. If the cluster dies hard
enough to actually break, the migration will time out and abort, and
the disk remains on the source location. If the network is unavailable,
the transfer will also time out.
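
For reference, with libvirt this kind of live copy can be driven with
virsh blockcopy. A rough sketch only (the domain name, disk target and
the prepared destination XML are placeholders, and older libvirt
versions may require the domain to be transient):

  # copy the running VM's disk to the new backend, then switch over to it;
  # rbd-target.xml would hold a <disk type='network'> definition pointing
  # at the destination rbd image
  virsh blockcopy exchange-vm vda --xml rbd-target.xml --wait --verbose --pivot

The source image stays authoritative until the pivot, which matches the
failure behaviour described above.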


good luck

Ronny Aasen






Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM

2018-12-11 Thread Ronny Aasen

On 11.12.2018 17:39, Lionel Bouton wrote:

On 11/12/2018 at 15:51, Konstantin Shalygin wrote:



Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes
and a 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous
cluster (which already holds lots of images).
The server has access to both local and cluster storage; I only need
to live migrate the storage, not the machine.

I have never used live migration, as it can cause more issues, and the
VMs that have already been migrated had planned downtime.
Taking the VM offline and converting/importing it with qemu-img would
take some hours, but I would like to keep serving clients, even if it
is slower.

The VM is I/O-heavy on the old storage (LSI/Adaptec with BBU). There
are two HDDs bound in RAID1 which are constantly under 30-60% load
(this goes up to 100% during reboots, updates or login prime time).

What happens when either the local compute node or the Ceph cluster
fails (degraded)? Or the network is unavailable?
Are all writes performed to both locations? Is this fail-safe? Or does
the VM crash in the worst case, which can lead to a dirty shutdown of
the MS-EX DBs?

The node currently has 4GB of free RAM and 29GB listed as cache /
available. These numbers need caution because we have "tuned" enabled,
which causes de-duplication of RAM, and this host runs about 10 Windows
VMs.
During reboots or updates, RAM can fill up again.

Maybe I am too cautious about live storage migration, maybe I am not.

What are your experiences or advice?

Thank you very much!


I read your message two times and still can't figure out what your
question is.


You need to move your block image from some storage to Ceph? No, you
can't do this without downtime, because of fs consistency.


You can easily migrate your filesystem via rsync, for example, with a
small downtime to reboot the VM.




I believe OP is trying to use the storage migration feature of QEMU. 
I've never tried it and I wouldn't recommend it (probably not very 
tested and there is a large window for failure).



We use the qemu storage migration feature via the Proxmox webui several
times a day, never any issues.


I regularly migrate between Ceph RBD, a local directory, shared LVM over
Fibre Channel and an NFS server. Super easy and convenient.
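
The same move can also be done from the CLI; a sketch only, with the VM
id, disk name and target storage as placeholders:

  # move the disk of VM 100 onto the Ceph-backed storage and delete the
  # old copy once the move has finished
  qm move_disk 100 virtio0 ceph-rbd --delete 1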



Ronny Aasen




Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM

2018-12-11 Thread Jack
We are using qemu storage migration regularly via Proxmox.

Works fine, you can go ahead.


On 12/11/2018 05:39 PM, Lionel Bouton wrote:
> 
> I believe OP is trying to use the storage migration feature of QEMU.
> I've never tried it and I wouldn't recommend it (probably not very
> tested and there is a large window for failure).
> 
> One tactic that can be used, assuming OP is using LVM in the VM for
> storage, is to add a Ceph volume to the VM (probably needs a reboot),
> add the corresponding virtual disk to the VM's volume group and then
> migrate all data from the logical volume(s) to the new disk. LVM uses
> mirroring internally during the transfer, so you get robustness by
> using it. It can be slow (especially with old kernels) but at least it
> is safe. I did a DRBD to Ceph migration with this process 5 years ago.
> When all logical volumes are moved to the new disk, you can remove the
> old disk from the volume group.
> 
> Assuming everything is on LVM, including the root filesystem, only
> moving the boot partition will have to be done outside of LVM.
> 
> Best regards,
> 
> Lionel
> 
> 
> 



Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM

2018-12-11 Thread Graham Allan



On 12/11/2018 10:39 AM, Lionel Bouton wrote:

On 11/12/2018 at 15:51, Konstantin Shalygin wrote:



Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes
and a 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous
cluster (which already holds lots of images).
The server has access to both local and cluster storage; I only need
to live migrate the storage, not the machine.

I have never used live migration, as it can cause more issues, and the
VMs that have already been migrated had planned downtime.
Taking the VM offline and converting/importing it with qemu-img would
take some hours, but I would like to keep serving clients, even if it
is slower.
I believe OP is trying to use the storage migration feature of QEMU. 
I've never tried it and I wouldn't recommend it (probably not very 
tested and there is a large window for failure).


One tactic that can be used, assuming OP is using LVM in the VM for
storage, is to add a Ceph volume to the VM (probably needs a reboot),
add the corresponding virtual disk to the VM's volume group and then
migrate all data from the logical volume(s) to the new disk. LVM uses
mirroring internally during the transfer, so you get robustness by
using it. It can be slow (especially with old kernels) but at least it
is safe. I did a DRBD to Ceph migration with this process 5 years ago.
When all logical volumes are moved to the new disk, you can remove the
old disk from the volume group.


Assuming everything is on LVM, including the root filesystem, only
moving the boot partition will have to be done outside of LVM.


Since the OP mentioned MS Exchange, I assume the VM is running Windows.
You can do the same LVM-like trick in Windows Server via Disk Manager
though: add the new Ceph RBD disk to the existing data volume as a
mirror, wait for it to sync, then break the mirror and remove the
original disk.
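
A rough diskpart sketch of that procedure (volume and disk numbers are
placeholders, and both disks must be dynamic disks):

  rem sketch only: mirror the existing data volume onto the new
  rem RBD-backed disk
  list volume
  select volume 2
  add disk=1
  rem wait for the mirror to finish resynchronizing, then break it and
  rem remove the original disk (check which half the disk argument
  rem refers to before running this)
  break disk=0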


--
Graham Allan
Minnesota Supercomputing Institute - g...@umn.edu


Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM

2018-12-11 Thread Lionel Bouton
On 11/12/2018 at 15:51, Konstantin Shalygin wrote:
>
>> Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes
>> and a 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous
>> cluster (which already holds lots of images).
>> The server has access to both local and cluster storage; I only need
>> to live migrate the storage, not the machine.
>>
>> I have never used live migration, as it can cause more issues, and the
>> VMs that have already been migrated had planned downtime.
>> Taking the VM offline and converting/importing it with qemu-img would
>> take some hours, but I would like to keep serving clients, even if it
>> is slower.
>>
>> The VM is I/O-heavy on the old storage (LSI/Adaptec with BBU). There
>> are two HDDs bound in RAID1 which are constantly under 30-60% load
>> (this goes up to 100% during reboots, updates or login prime time).
>>
>> What happens when either the local compute node or the Ceph cluster
>> fails (degraded)? Or the network is unavailable?
>> Are all writes performed to both locations? Is this fail-safe? Or does
>> the VM crash in the worst case, which can lead to a dirty shutdown of
>> the MS-EX DBs?
>>
>> The node currently has 4GB of free RAM and 29GB listed as cache /
>> available. These numbers need caution because we have "tuned" enabled,
>> which causes de-duplication of RAM, and this host runs about 10 Windows
>> VMs.
>> During reboots or updates, RAM can fill up again.
>>
>> Maybe I am too cautious about live storage migration, maybe I am not.
>>
>> What are your experiences or advice?
>>
>> Thank you very much!
>
> I read your message two times and still can't figure out what your
> question is.
>
> You need to move your block image from some storage to Ceph? No, you
> can't do this without downtime, because of fs consistency.
>
> You can easily migrate your filesystem via rsync, for example, with a
> small downtime to reboot the VM.
>

I believe OP is trying to use the storage migration feature of QEMU.
I've never tried it and I wouldn't recommend it (probably not very
tested and there is a large window for failure).

One tactic that can be used, assuming OP is using LVM in the VM for
storage, is to add a Ceph volume to the VM (probably needs a reboot),
add the corresponding virtual disk to the VM's volume group and then
migrate all data from the logical volume(s) to the new disk. LVM uses
mirroring internally during the transfer, so you get robustness by
using it. It can be slow (especially with old kernels) but at least it
is safe. I did a DRBD to Ceph migration with this process 5 years ago.
When all logical volumes are moved to the new disk, you can remove the
old disk from the volume group.

Assuming everything is on LVM, including the root filesystem, only
moving the boot partition will have to be done outside of LVM.
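
Inside the guest that could look roughly like this (a sketch only;
/dev/vdb as the new RBD-backed disk, /dev/vda2 as the old PV and vg0 as
the volume group name are placeholders):

  pvcreate /dev/vdb            # prepare the new disk as a physical volume
  vgextend vg0 /dev/vdb        # add it to the existing volume group
  pvmove /dev/vda2 /dev/vdb    # move all extents off the old PV
                               # (uses a temporary mirror internally)
  vgreduce vg0 /dev/vda2       # drop the old PV from the volume group
  pvremove /dev/vda2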

Best regards,

Lionel


Re: [ceph-users] KVM+Ceph: Live migration of I/O-heavy VM

2018-12-11 Thread Konstantin Shalygin

Currently I plan a migration of a large VM (MS Exchange, 300 mailboxes
and a 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous
cluster (which already holds lots of images).
The server has access to both local and cluster storage; I only need
to live migrate the storage, not the machine.

I have never used live migration, as it can cause more issues, and the
VMs that have already been migrated had planned downtime.
Taking the VM offline and converting/importing it with qemu-img would
take some hours, but I would like to keep serving clients, even if it
is slower.

The VM is I/O-heavy on the old storage (LSI/Adaptec with BBU). There
are two HDDs bound in RAID1 which are constantly under 30-60% load
(this goes up to 100% during reboots, updates or login prime time).

What happens when either the local compute node or the Ceph cluster
fails (degraded)? Or the network is unavailable?
Are all writes performed to both locations? Is this fail-safe? Or does
the VM crash in the worst case, which can lead to a dirty shutdown of
the MS-EX DBs?

The node currently has 4GB of free RAM and 29GB listed as cache /
available. These numbers need caution because we have "tuned" enabled,
which causes de-duplication of RAM, and this host runs about 10 Windows
VMs.
During reboots or updates, RAM can fill up again.

Maybe I am too cautious about live storage migration, maybe I am not.

What are your experiences or advice?

Thank you very much!


I read your message two times and still can't figure out what your
question is.


You need to move your block image from some storage to Ceph? No, you
can't do this without downtime, because of fs consistency.


You can easily migrate your filesystem via rsync, for example, with a
small downtime to reboot the VM.
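
For a Linux guest that could look roughly like this (a sketch only;
the hostname and exclude list are placeholders):

  # copy the running filesystem to the new VM, preserving hard links,
  # ACLs and xattrs; pseudo-filesystem contents are skipped
  rsync -aHAX --numeric-ids \
      --exclude='/dev/*' --exclude='/proc/*' --exclude='/sys/*' \
      --exclude='/run/*' --exclude='/tmp/*' \
      / root@new-vm:/

followed by a final pass and the short downtime to boot the new VM.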





k
