[ceph-users] Re: cephfs-top causes 16 mgr modules have recently crashed

2024-01-22 Thread Özkan Göksu
Hello Jos. Thank you for the reply. I can upgrade to 17.2.7, but I wonder whether I can upgrade only the MON+MGR daemons for this issue, or do I need to upgrade all components? Otherwise I will need to wait a few weeks; I don't want to request a maintenance window during delivery time. root@ud-01:~# ceph orch upgrade ls {
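
A minimal sketch of the staggered-upgrade path being asked about, assuming a cephadm-managed cluster on Quincy (the image tag below is only illustrative):

  # upgrade only the mgr and mon daemons first; staggered upgrades require mgr to go first
  ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.7 --daemon-types mgr,mon
  # watch progress, then confirm which daemons now run the new version
  ceph orch upgrade status
  ceph versions

The remaining daemon types (osd, mds, rgw, ...) can then be upgraded later with a second ceph orch upgrade start run.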

[ceph-users] Re: cephfs-top causes 16 mgr modules have recently crashed

2024-01-22 Thread Jos Collin
Please see this fix: https://tracker.ceph.com/issues/59551. It has been backported to Quincy. On 23/01/24 03:11, Özkan Göksu wrote: Hello. When I run cephfs-top it causes an mgr module crash. Can you please tell me the reason? My environment: Ceph version 17.2.6, Operating System: Ubuntu 22.04.2 LTS

[ceph-users] cephfs-top causes 16 mgr modules have recently crashed

2024-01-22 Thread Özkan Göksu
Hello. When I run cephfs-top it causes an mgr module crash. Can you please tell me the reason? My environment: Ceph version 17.2.6; Operating System: Ubuntu 22.04.2 LTS; Kernel: Linux 5.15.0-84-generic. I created the cephfs-top user with the following command: ceph auth get-or-create client.fstop
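
For reference, a sketch of the setup steps described in the cephfs-top documentation; the capability list below follows the docs and may differ from the (truncated) command actually used here:

  # cephfs-top depends on the stats mgr module
  ceph mgr module enable stats
  # read-only client as suggested by the docs
  ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r' > /etc/ceph/ceph.client.fstop.keyring
  cephfs-top
  # inspect the recorded mgr crashes and, once understood, clear the health warning
  ceph crash ls
  ceph crash archive-all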

[ceph-users] Re: Degraded PGs on EC pool when marking an OSD out

2024-01-22 Thread Hector Martin
On 2024/01/22 19:06, Frank Schilder wrote: > You seem to have a problem with your crush rule(s): 14.3d ... [18,17,16,3,1,0,NONE,NONE,12] If you really just took out 1 OSD, having 2xNONE in the acting set indicates that your crush rule can't find valid mappings. You might need to tune
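
A few generic diagnostics that may help confirm the mapping problem; the PG id comes from the quoted output and <pool> is a placeholder:

  # show the up and acting sets computed for the affected PG
  ceph pg map 14.3d
  # list PGs stuck undersized or degraded
  ceph pg dump_stuck undersized
  # see which crush rule the EC pool uses
  ceph osd pool get <pool> crush_rule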

[ceph-users] Re: OSD read latency grows over time

2024-01-22 Thread Roman Pashin
Hi Mark, thank you for the prompt answer. > The fact that changing the pg_num for the index pool drops the latency back down might be a clue. Do you have a lot of deletes happening on this cluster? If you have a lot of deletes and long pauses between writes, you could be accumulating
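
A sketch of the knobs under discussion, assuming an RGW index pool (the pool name and pg_num value below are placeholders) and assuming the accumulation refers to RocksDB tombstones, which the truncated message does not confirm:

  # inspect and, if needed, raise pg_num on the index pool
  ceph osd pool get default.rgw.buckets.index pg_num
  ceph osd pool set default.rgw.buckets.index pg_num 128
  # if tombstone build-up is suspected, trigger a manual RocksDB compaction on one OSD
  ceph tell osd.12 compact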

[ceph-users] Re: Degraded PGs on EC pool when marking an OSD out

2024-01-22 Thread Frank Schilder
You seem to have a problem with your crush rule(s): 14.3d ... [18,17,16,3,1,0,NONE,NONE,12] If you really just took out 1 OSD, having 2xNONE in the acting set indicates that your crush rule can't find valid mappings. You might need to tune crush tunables:
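
A sketch of the tunables check being suggested; note that changing tunables can trigger substantial data movement, so review the impact first:

  # show the tunables currently in effect
  ceph osd crush show-tunables
  # dump the crush rules to check the EC rule's failure-domain choices
  ceph osd crush rule dump
  # only after reviewing the consequences: switch to the optimal tunables profile
  ceph osd crush tunables optimal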

[ceph-users] Scrubbing?

2024-01-22 Thread Jan Marek
Hello, last week I got HEALTH_OK on our CEPH cluster and I started upgrading the firmware in the network cards. When I had upgraded the sixth card of nine (one by one), this server didn't start correctly and our ProxMox had problems accessing disk images on CEPH. rbd ls pool was OK, but:
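
Some generic first-look diagnostics for a situation like this; pool and image names are placeholders:

  # overall cluster state and detailed health warnings
  ceph -s
  ceph health detail
  # check whether a given image responds and still has watchers
  rbd info <pool>/<image>
  rbd status <pool>/<image>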