[ceph-users] CEPH complete cluster failure: unknown PGS

v1tnam Thu, 28 Sep 2023 12:24:41 -0700

I have an 8-node cluster on old hardware. A week ago 4 nodes went down and 
the Ceph cluster went nuts.
All PGs became unknown and the monitors took too long to get in sync,
so I reduced the number of mons to one, and the mgrs to one as well.


Now the recovery starts with 100% unknown PGs, and then PGs start to move to 
inactive. It generally fails partway through and starts over from scratch.

It's old hardware, and the OSDs have lots of slow ops and probably a number of 
bad sectors as well.

Any suggestions on how to tackle this? It's a Nautilus cluster and pretty old 
(8-year-old hardware).

Thanks
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
