[ceph-users] Re: Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread Massimo Sgaravatto
If it can help, I have recently updated my ceph cluster (composed of 3 mon-mgr nodes and n OSD nodes) from Nautilus on CentOS 7 to Pacific on CentOS 8 Stream. First I reinstalled the mon-mgr nodes with CentOS 8 Stream (removing them from the cluster and then re-adding them with the new operating system).
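For reference, the remove/re-add cycle for a monitor during such an OS reinstall looks roughly like the sketch below. The hostname and paths are placeholders, and this assumes a package-based (non-cephadm) deployment; with ceph-deploy or cephadm the redeploy step differs.

    # Before reinstalling the node, drop its monitor from the quorum (example name: mon1)
    ceph mon remove mon1

    # After the OS reinstall, rebuild the mon store from the running cluster
    ceph mon getmap -o /tmp/monmap
    ceph auth get mon. -o /tmp/mon.keyring
    ceph-mon --mkfs -i mon1 --monmap /tmp/monmap --keyring /tmp/mon.keyring
    systemctl start ceph-mon@mon1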

[ceph-users] Re: Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread Fox, Kevin M
We went, on a couple of clusters, from ceph-deploy+centos7+nautilus to cephadm+rocky8+pacific using ELevate as one of the steps. Went through Octopus as well. ELevate wasn't perfect for us either, but it was able to get the job done. Had to test it carefully on the test clusters multiple times to get
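For anyone taking the same route, the ceph-deploy to cephadm conversion is done per daemon with 'cephadm adopt'; a minimal sketch (daemon names are examples, run on each host after installing cephadm and podman):

    cephadm adopt --style legacy --name mon.$(hostname -s)
    cephadm adopt --style legacy --name mgr.$(hostname -s)
    cephadm adopt --style legacy --name osd.0    # repeat for every OSD id on the host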

[ceph-users] Re: Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread Wolfpaw - Dale Corse
Hi David, > Good to hear you had success with the ELevate tool, I'd looked at that but it seemed a bit risky. The tool supports Rocky so I may give it a look. ELevate wasn't perfect - we had to manually upgrade some packages from outside repos (ceph, opennebula and salt, if memory serves). That
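For what it's worth, the "outside repo" for ceph itself is normally the upstream one on download.ceph.com; a sketch of a Pacific EL8 repo file (release and arch shown are assumptions, adjust as needed):

    # /etc/yum.repos.d/ceph.repo
    [ceph]
    name=Ceph packages for x86_64
    baseurl=https://download.ceph.com/rpm-pacific/el8/x86_64/
    enabled=1
    gpgcheck=1
    gpgkey=https://download.ceph.com/keys/release.asc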

[ceph-users] Re: [SPAM] Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread David C
> > I don't think this is necessary. It _is_ necessary to convert all > leveldb to rocksdb before upgrading to Pacific, on both mons and any > filestore OSDs. Thanks, Josh, I guess that explains why some people had issues with Filestore OSDs post-Pacific upgrade. On Tue, Dec 6, 2022 at 4:07 PM

[ceph-users] Re: [SPAM] Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread Josh Baergen
> - you will need to move those filestore OSDs to Bluestore before hitting > Pacific, might even be part of the Nautilus upgrade. This takes some time if > I remember correctly. I don't think this is necessary. It _is_ necessary to convert all leveldb to rocksdb before upgrading to Pacific, on
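A quick way to see where leveldb might still be in play, assuming default data paths and that jq is available:

    # Mons: the kv store backend is recorded in the mon data dir (default path shown)
    cat /var/lib/ceph/mon/ceph-$(hostname -s)/kv_backend

    # OSDs: list which objectstore each OSD runs; Filestore OSDs created before
    # Luminous typically still carry a leveldb omap store
    ceph osd metadata | jq -r '.[] | "osd.\(.id) \(.osd_objectstore)"'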

[ceph-users] Re: [SPAM] Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread David C
Hi Wolfpaw, thanks for the response - > I'd upgrade to Nautilus on CentOS 7 before moving to EL8. We then used > AlmaLinux ELevate to move from 7 to 8 without a reinstall. Rocky has a > similar path I think. Good to hear you had success with the ELevate tool, I'd looked at that but it seemed a bit

[ceph-users] Re: Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread Stefan Kooman
On 12/6/22 15:58, David C wrote: Hi All, I'm planning to upgrade a Luminous 12.2.10 cluster to Pacific 16.2.10. The cluster is primarily used for CephFS, with a mix of Filestore and Bluestore OSDs, mons/OSDs collocated, running on CentOS 7 nodes. My proposed upgrade path is: upgrade to Nautilus 14.2.22 ->

[ceph-users] Re: [SPAM] Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread Wolfpaw - Dale Corse
We did this (over a longer timespan)... it worked OK. A couple of things I'd add: - I'd upgrade to Nautilus on CentOS 7 before moving to EL8. We then used AlmaLinux ELevate to move from 7 to 8 without a reinstall. Rocky has a similar path I think. - you will need to move those filestore OSDs to
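For completeness, the per-OSD Filestore to BlueStore replacement follows roughly the "mark out and replace" procedure from the BlueStore migration docs; ID and DEVICE below are placeholders, and each OSD is drained, destroyed and recreated in turn:

    ID=0
    DEVICE=/dev/sdX
    ceph osd out $ID
    while ! ceph osd safe-to-destroy osd.$ID ; do sleep 60 ; done
    systemctl stop ceph-osd@$ID
    ceph-volume lvm zap $DEVICE
    ceph osd destroy $ID --yes-i-really-mean-it
    ceph-volume lvm create --bluestore --data $DEVICE --osd-id $ID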

[ceph-users] Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread David C
Hi All, I'm planning to upgrade a Luminous 12.2.10 cluster to Pacific 16.2.10. The cluster is primarily used for CephFS, with a mix of Filestore and Bluestore OSDs, mons/OSDs collocated, running on CentOS 7 nodes. My proposed upgrade path is: upgrade to Nautilus 14.2.22 -> upgrade to EL8 on the nodes
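Before committing to the path, it is worth taking stock of what is actually deployed; a minimal sketch (assumes jq is installed):

    ceph versions        # confirm every daemon really is on 12.2.10 before starting
    ceph features        # spot old clients that could block raising min-compat later
    ceph osd metadata | jq -r '.[].osd_objectstore' | sort | uniq -c    # Filestore vs BlueStore count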

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-06 Thread Boris Behrens
Hi Janne, that is a really good idea, thank you. I just saw that our only Ubuntu 20.04 host got very high %util (all 8TB disks): Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz aqu-sz

[ceph-users] Orchestrator hanging on 'stuck' nodes

2022-12-06 Thread Ewan Mac Mahon
Dear all, We're having an odd problem with a recently installed Quincy/cephadm cluster on CentOS 8 Stream with Podman, where the orchestrator appears to get wedged and just won't implement any changes. The overall cluster was installed and working for a few weeks, then we added an NFS export
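When the orchestrator stops acting on changes, the usual first checks are the module state and the cephadm log; failing over the active mgr often unsticks it. A sketch of the commands involved:

    ceph orch status       # is the cephadm backend reported as available?
    ceph health detail     # look for CEPHADM_* warnings
    ceph log last cephadm  # recent cephadm module log entries
    ceph mgr fail          # fail over to a standby mgr, which restarts the orchestrator module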

[ceph-users] Re: cephfs snap-mirror stalled

2022-12-06 Thread Venky Shankar
On Tue, Dec 6, 2022 at 6:34 PM Holger Naundorf wrote: > > > > On 06.12.22 09:54, Venky Shankar wrote: > > Hi Holger, > > > > On Tue, Dec 6, 2022 at 1:42 PM Holger Naundorf > > wrote: > >> > >> Hello, > >> we have set up a snap-mirror for a directory on one of our clusters - > >> running ceph

[ceph-users] Re: cephfs snap-mirror stalled

2022-12-06 Thread Holger Naundorf
On 06.12.22 09:54, Venky Shankar wrote: Hi Holger, On Tue, Dec 6, 2022 at 1:42 PM Holger Naundorf wrote: Hello, we have set up a snap-mirror for a directory on one of our clusters - running ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable) - to

[ceph-users] pacific: ceph-mon services stopped after OSDs are out/down

2022-12-06 Thread Mevludin Blazevic
Hi all, I'm running Pacific with cephadm. After installation, ceph automatically provisioned 5 ceph monitor nodes across the cluster. After a few OSDs crashed due to a hardware-related issue with the SAS interface, 3 monitor services stopped and won't restart. Is it related to the
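A sketch of the usual checks for stopped cephadm-managed mons; the hostname and fsid below are placeholders:

    ceph orch ps --daemon-type mon             # which mon daemons are running or stopped
    ceph orch daemon restart mon.<hostname>    # try restarting a stopped one
    # on the affected host, the systemd unit is named after the cluster fsid:
    journalctl -u ceph-<fsid>@mon.<hostname> -n 100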

[ceph-users] Re: What to expect on rejoining a host to cluster?

2022-12-06 Thread Frank Schilder
Hi Matt, the fact that I'm using re-weights does not mean I would recommend them. There seems to be something seriously broken with reweights, see this message https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/E5BYQ27LRWFNT4M34OYKI2KM27Q3DUY6/ and the thread with it. I have to wait for a client
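For anyone inspecting this on their own cluster, the current reweight values and what an automatic reweight would do can be viewed without changing anything:

    ceph osd df tree                        # per-OSD utilization plus the REWEIGHT column
    ceph osd test-reweight-by-utilization   # dry run of reweight-by-utilization, no changes applied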

[ceph-users] Fwd: [MGR] Only 60 trash removal tasks are processed per minute

2022-12-06 Thread sea you
Hi all, our cluster contains 12 nodes, 120 OSDs (all NVMe), and - currently - 4096 PGs in total. We're currently testing a scenario with 20 thousand 10G volumes, taking a snapshot of each one of them. These 20k snapshots are created in just a bit under 2 hours. When we delete
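The background removals scheduled through the MGR can be inspected via the rbd_support task queue, and the trash can also be purged from a client; the pool name below is only an example:

    ceph rbd task list                  # pending background tasks in the rbd_support module
    rbd trash ls --pool volumes --all   # what is still sitting in the trash
    rbd trash purge --pool volumes      # purge eligible trash entries from the client side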

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-06 Thread Janne Johansson
Perhaps run "iostat -xtcy 5" on the OSD hosts to see if any of the drives have weirdly high utilization despite low iops/requests? Den tis 6 dec. 2022 kl 10:02 skrev Boris Behrens : > > Hi Sven, > I am searching really hard for defect hardware, but I am currently out of > ideas: > - checked

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-06 Thread Boris Behrens
Hi Sven, I am searching really hard for defective hardware, but I am currently out of ideas: - checked prometheus stats, but in all that data I don't know what to look for (osd apply latency is very low at the mentioned point and went up to 40ms after all OSDs were restarted) - smartctl shows nothing
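A couple of additional low-effort checks that complement the Prometheus data (the device name is a placeholder):

    ceph osd perf                       # per-OSD commit/apply latency as seen by the cluster
    smartctl -a /dev/sdX                # SMART health and error counters for a suspect drive
    dmesg -T | grep -iE 'error|reset'   # kernel-level I/O errors or controller resets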

[ceph-users] Re: cephfs snap-mirror stalled

2022-12-06 Thread Venky Shankar
Hi Holger, On Tue, Dec 6, 2022 at 1:42 PM Holger Naundorf wrote: > > Hello, > we have set up a snap-mirror for a directory on one of our clusters - > running ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific > (stable) - > to get mirrored to our other cluster

[ceph-users] cephfs snap-mirror stalled

2022-12-06 Thread Holger Naundorf
Hello, we have set up a snap-mirror for a directory on one of our clusters - running ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable) - to get mirrored to our other cluster - running ceph version 16.2.9
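For context, the mirroring state can be queried from the mgr and from the cephfs-mirror daemon's admin socket; the fs name, socket path, and ids below are placeholders, and some Pacific releases expect the fs name as an argument to the daemon status command:

    ceph fs snapshot mirror daemon status          # overall mirror daemon / peer state
    ceph fs snapshot mirror peer_list <fs_name>    # configured peers for the filesystem
    # on the mirror daemon host, per-filesystem sync status via the admin socket:
    ceph --admin-daemon /var/run/ceph/cephfs-mirror.<id>.asok fs mirror status <fs_name>@<fs_id>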