[ceph-users] Re: [EXTERNAL] Upgrading nautilus / centos7 to octopus / ubuntu 20.04. - Suggestions and hints? - Thanks

2023-08-04 Thread Götz Reinicke
Hi, thanks to all suggestions.

Right now, it is the step-by-step approach that works: going to bionic/nautilus 
first, and from there proceeding like Josh noted.

We encountered a problem which I'll post about separately.

Best, Götz

> On 03.08.2023 at 15:44, Beaman, Joshua wrote:
> 
> We went through this exercise, though our starting point was ubuntu 16.04 / 
> nautilus.  We reduced our double builds as follows:
> [...]





[ceph-users] Re: [EXTERNAL] Upgrading nautilus / centos7 to octopus / ubuntu 20.04. - Suggestions and hints?

2023-08-03 Thread Beaman, Joshua
We went through this exercise, though our starting point was ubuntu 16.04 / 
nautilus.  We reduced our double builds as follows:


  1.  Rebuild each monitor host on 18.04/bionic and rejoin, still on nautilus
  2.  Upgrade all mons, mgrs (and rgws, optionally) to pacific
  3.  Convert each mon, mgr, rgw to cephadm and enable orchestrator (see the
      sketch after this list)
  4.  Rebuild each mon, mgr, rgw on 20.04/focal and rejoin the pacific cluster
  5.  Drain and rebuild each osd host on focal and pacific
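
For step 3, the conversion might look roughly like the following; treat it as 
a sketch only, assuming a legacy package-based install, with hostnames purely 
illustrative:

  # per host: adopt the legacy mon/mgr daemons into cephadm
  # (check the local daemon names first with `cephadm ls`)
  cephadm adopt --style legacy --name mon.host1
  cephadm adopt --style legacy --name mgr.host1

  # once the mons and mgrs are adopted, enable the orchestrator
  ceph mgr module enable cephadm
  ceph orch set backend cephadm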

This has the advantage of only having to drain and rebuild the OSD hosts once.  
Double building the control cluster hosts isn’t so bad, and orchestrator makes 
all of the ceph parts easy once it’s enabled.

The biggest challenge we ran into was https://tracker.ceph.com/issues/51652, 
because we still had a lot of filestore osds.  It’s frustrating, but we managed 
to get through it without much client interruption on a dozen prod clusters, 
most of which were 38 osd hosts and 912 total osds each.  One thing that 
helped was, before beginning the osd host builds, setting all of the old osds' 
primary-affinity to something <1.  This way, when the new pacific (or octopus) 
osds join the cluster, they will automatically be favored for primary on their 
pgs.  If a heartbeat timeout storm starts to get out of control, start by 
setting nodown and noout.  The flapping osds are the worst.  Then figure out 
which osds are the culprits and restart them.
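
In shell terms, that tip might look something like this (a rough sketch; the 
0.5 is arbitrary, anything <1 biases primaries toward the newer osds):

  # before the rebuilds: de-prefer every pre-existing osd as primary
  for id in $(ceph osd ls); do
      ceph osd primary-affinity osd.$id 0.5
  done

  # if a heartbeat timeout storm starts, freeze state changes first
  ceph osd set nodown
  ceph osd set noout
  # ...find and restart the culprit osds, then clear the flags
  ceph osd unset nodown
  ceph osd unset noout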

Hopefully your nautilus osds are all bluestore and you won’t have this problem. 
 We put up with it, because the filestore to bluestore conversion was one of 
the most important parts of this upgrade for us.
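
To see where you stand first, each osd reports its objectstore in its 
metadata; for example (assuming jq is available):

  # list the ids of any osds still on filestore
  ceph osd metadata | jq -r '.[] | select(.osd_objectstore == "filestore") | .id'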

Best of luck, whatever route you take.

Regards,
Josh Beaman

From: Götz Reinicke 
Date: Tuesday, August 1, 2023 at 1:01 PM
To: ceph-users@ceph.io 
Subject: [EXTERNAL] [ceph-users] Upgrading nautilus / centos7 to octopus / 
ubuntu 20.04. - Suggestions and hints?
Hi,

As I’ve read and thought a lot about this migration, as it is a bigger project, 
I was wondering if anyone has done it already and might share some notes or 
playbooks, because in everything I read there were some parts that were missing 
or unclear to me.

I have a few different approaches in mind, so maybe you have some suggestions 
or hints.

a) upgrade nautilus on centos 7, with the few missing features like dashboard 
and prometheus. After that, migrate one node after another to ubuntu 20.04 with 
octopus, and then upgrade ceph to the recent stable version.

b) migrate one node after another to ubuntu 18.04 with nautilus, then upgrade 
to octopus, and after that move to ubuntu 20.04.

or

c) upgrade one node after another to ubuntu 20.04 with octopus and join it to 
the cluster, until all nodes are upgraded.


As a test I tried c) with a mon node, but adding it to the cluster fails with 
some failed state, still probing for the other mons. (I don’t have the exact 
log at hand right now.)
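
(The stuck mon's state can be inspected through its admin socket on the new 
node; a sketch, assuming the mon id is the short hostname:

  ceph daemon mon.$(hostname -s) mon_status

which should show its state, e.g. "probing", and the monmap it holds.)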

So my questions are:

a) What would be the best (most stable) migration path, and

b) is it in general possible to add a new octopus mon (not an upgraded one) to 
a nautilus cluster, where the other mons are still on nautilus?


I hope my thoughts and questions are understandable :)

Thanks for any hints and suggestions. Best, Götz