[ceph-users] Re: replacing all disks in a stretch mode ceph cluster

2023-07-19 Thread Joachim Kraftmayer - ceph ambassador

Hi,

Short note: if you replace the disks with larger disks, the weight of the OSD and the host will change, and this will force data migration.
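
You can check the current weights per OSD and per host with the usual read-only commands, e.g.:

    ceph osd df tree     # CRUSH weight, utilization and PG count per OSD
    ceph osd crush tree  # weights aggregated per host / room bucket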


Perhaps read a bit more about the upmap balancer if you want to avoid data migration during the upgrade phase.
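
A rough sketch of enabling it (assuming all your clients are at least Luminous):

    ceph osd set-require-min-compat-client luminous   # upmap needs Luminous or newer clients
    ceph balancer mode upmap
    ceph balancer on
    ceph balancer status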


Regards, Joachim

___
ceph ambassador DACH
ceph consultant since 2012

Clyso GmbH - Premier Ceph Foundation Member

https://www.clyso.com/

On 19.07.23 at 09:00, Eugen Block wrote:

Hi,
during cluster upgrades from L to N or later, one had to rebuild OSDs that were originally deployed by ceph-disk when switching to ceph-volume. We've done this on multiple clusters and redeployed the nodes one by one. We did not drain the nodes beforehand because the EC resiliency configuration was well planned, and we didn't set any flags either, because tearing down OSDs only took a few minutes and we didn't care if a few MB or GB got recovered in the meantime. You just need to be careful with the mon_max_pg_per_osd limit (default 250), as it can easily be hit by the first rebuilt OSD coming up, or by the last one being removed, so you could temporarily increase it during the rebuild procedure. In your case you probably should set flags, because you'll need time to exchange the physical disks; I guess the ones you already mentioned would suffice. With stretch mode and replicated size 4 you should also be able to tear down an entire node at a time and rebuild it, though I wouldn't rebuild more than one at a time.
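
Just as an illustration (the value 500 is made up, pick something that fits your PG counts), the temporary changes could look like this:

    ceph config set global mon_max_pg_per_osd 500   # temporary, revert afterwards
    ceph osd set noout
    ceph osd set norebalance
    # ... exchange the disks and rebuild the OSDs on that node ...
    ceph osd unset norebalance
    ceph osd unset noout
    ceph config rm global mon_max_pg_per_osd        # back to the default of 250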
I have not done this yet with cephadm clusters, but if your drivegroup.yml worked correctly before, it should work here as well, hoping that you don't find another ceph-volume bug. ;-)
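
With cephadm that would boil down to re-applying your existing OSD spec; just as a sketch (the service_id and the catch-all placement/device filter below are made up, keep whatever your drivegroup.yml already contains):

    # drivegroup.yml (hypothetical example, keep your existing spec)
    service_type: osd
    service_id: osd_spec
    placement:
      host_pattern: '*'
    spec:
      data_devices:
        all: true               # or filter by size/rotational as before

    ceph orch apply osd -i drivegroup.yml --dry-run   # check the preview, then drop --dry-run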


Regards,
Eugen

Quoting Zoran Bošnjak:


Hello ceph users,
my ceph configuration is
- ceph version 17.2.5 on ubuntu 20.04
- stretch mode
- 2 rooms with OSDs and monitors + additional room for the tiebreaker monitor
- 4 OSD servers in each room
- 6 OSDs per OSD server
- ceph installation/administration is manual (without ansible, orch... or any other tool like this)


Ceph health is currently OK.
Raw usage is around 60%; pool usage is below 75%.

I need to replace all OSD disks in the cluster with larger-capacity disks (500 GB to 1000 GB), so the eventual configuration will contain the same number of OSDs and servers.


I understand I can replace OSDs one by one, following the documented procedure (removing the old OSD and adding the new one to the configuration, roughly as sketched below) and waiting for HEALTH_OK. But in this case, ceph will probably copy data around like crazy after each step. So, my question is:


What is the recommended procedure in this case of replacing ALL disks while keeping ceph operational during the upgrade?


In particular:
Should I use any of the "nobackfill, norebalance, norecover..." flags during the process? If yes, which?
Should I do one OSD at a time, one server at a time, or even one room at a time?
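
For reference, the documented one-by-one procedure I mean is roughly the following (just a sketch; the OSD id 12 and /dev/sdX are placeholders):

    ceph osd out 12                            # stop allocating data to this OSD
    systemctl stop ceph-osd@12
    ceph osd purge 12 --yes-i-really-mean-it   # remove it from the CRUSH map and osdmap
    # swap the physical disk, then redeploy on the new device:
    ceph-volume lvm create --data /dev/sdX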


Thanks for the suggestions.

regards,
Zoran

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

