[ceph-users] Re: Pgs troubleshooting

GLE, Vivien Tue, 29 Jul 2025 07:50:18 -0700

Thanks for your help! Here is my new pg stat, with no more peering PGs (after 
rebooting some OSDs):


ceph pg stat ->

498 pgs: 1 active+recovery_unfound+degraded, 3 
recovery_unfound+undersized+degraded+remapped+peered, 14 
active+clean+scrubbing+deep, 480 active+clean;

36 GiB data, 169 GiB used, 6.2 TiB / 6.4 TiB avail; 8.8 KiB/s rd, 0 B/s wr, 12 
op/s; 715/41838 objects degraded (1.709%); 5/13946 objects unfound (0.036%)

ceph pg ls recovery_unfound -> shows that the PGs are replica 3. I tried to 
repair them, but nothing happened.


ceph -w ->

osd.1 [ERR] 11.4 has 2 objects unfound and apparently lost
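
For reference, the PG id can be pulled out of log lines like the one above and 
passed to 'ceph pg <pg_id> list_unfound' to see which objects are missing. This 
is only a minimal sketch: the helper names unfound_pgs_from_log and 
list_unfound_for_pgs are made up for illustration, and the parsing assumes the 
log line format shown above.

```shell
# Hypothetical helper: read cluster log lines on stdin and print the PG ids
# from lines like "osd.1 [ERR] 11.4 has 2 objects unfound and apparently lost".
# Assumes the PG id is the field right after "[ERR]".
unfound_pgs_from_log() {
    awk '/unfound and apparently lost/ {
             for (i = 1; i < NF; i++)
                 if ($i == "[ERR]") print $(i + 1)
         }' | sort -u
}

# Hypothetical helper: read PG ids on stdin and list their unfound objects.
list_unfound_for_pgs() {
    while read -r pgid; do
        echo "== unfound objects in PG ${pgid} =="
        ceph pg "${pgid}" list_unfound < /dev/null
    done
}
```

With the log lines saved to a file (the name saved-cluster-log.txt is just an 
example), this could be run as 
'unfound_pgs_from_log < saved-cluster-log.txt | list_unfound_for_pgs'.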



________________________________
From: Frédéric Nass <frederic.n...@clyso.com>
Sent: Tuesday, 29 July 2025 14:03:37
To: GLE, Vivien
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Pgs troubleshooting

Hi Vivien,

Unless you ran the 'ceph pg stat' command while peering was occurring, the 37 
peering PGs might indicate a temporary peering issue with one or more OSDs. If 
that's the case, then restarting the associated OSDs could help with the 
peering. You could list those PGs and their associated OSDs with 'ceph pg ls 
peering' and trigger peering by either restarting one common OSD or by using 
'ceph pg repeer <pg_id>'.
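
As a rough illustration of the repeer route, the two commands can be chained in 
a small shell function. This is a sketch, not a tested procedure: the function 
name repeer_peering_pgs is made up, and it assumes that the first column of the 
'ceph pg ls' output is the PG id, in the form <pool>.<hex>.

```shell
# Hypothetical helper: ask every PG currently listed as peering to re-peer.
# Assumes the first column of 'ceph pg ls peering' is the PG id; the header
# and any trailing note lines are skipped by the pattern match.
repeer_peering_pgs() {
    ceph pg ls peering |
        awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ { print $1 }' |
        while read -r pgid; do
            echo "repeering ${pgid}"
            ceph pg repeer "${pgid}" < /dev/null
        done
}
```

Calling repeer_peering_pgs with no arguments walks through all peering PGs. If 
they all share one OSD, restarting that OSD may be the simpler option.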

Regarding the unfound object and its associated backfill_unfound PG, you could 
identify this PG with 'ceph pg ls backfill_unfound' and investigate this PG 
with 'ceph pg <pg_id> query'. Depending on the output, you could try running a 
'ceph pg repair <pg_id>'. Could you confirm that this PG is not part of a 
size=2 pool?
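
The checks above could be sketched as follows. The function names pgs_in_state 
and inspect_pgs_in_state are made up for illustration, and the sketch assumes 
that 'ceph osd pool ls detail' prints one line per pool starting with 
"pool <id> " and including "size <n>", which is worth verifying on your 
release.

```shell
# Hypothetical helper: print the ids of PGs in a given state.
# Assumes the first column of 'ceph pg ls' output is the PG id.
pgs_in_state() {
    ceph pg ls "$1" | awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ { print $1 }'
}

# Hypothetical helper: for each PG in the given state, show the detail line
# of its pool (to read off the replica size) and save the query output.
inspect_pgs_in_state() {
    state="${1:-backfill_unfound}"
    outdir="${2:-.}"
    for pgid in $(pgs_in_state "$state"); do
        pool_id="${pgid%%.*}"    # the part before the dot is the pool id
        echo "PG ${pgid} belongs to pool id ${pool_id}:"
        ceph osd pool ls detail | grep "^pool ${pool_id} "
        ceph pg "${pgid}" query > "${outdir}/pg-${pgid}-query.json"
    done
}
```

For example, 'inspect_pgs_in_state backfill_unfound /tmp' would leave one JSON 
file per affected PG in /tmp for later reading.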

Best regards,
Frédéric.

--
Frédéric Nass
Ceph Ambassador France | Senior Ceph Engineer @ CLYSO
Try our Ceph Analyzer -- https://analyzer.clyso.com/
https://clyso.com | frederic.n...@clyso.com


On Tue, 29 Jul 2025 at 14:19, GLE, Vivien <vivien....@inist.fr> wrote:
Hi,

After replacing 2 OSDs (data corruption), these are the stats of my test Ceph 
cluster:

ceph pg stat

498 pgs: 37 peering, 1 active+remapped+backfilling, 1 active+clean+remapped, 1 
active+recovery_wait+undersized+remapped, 1 
backfill_unfound+undersized+degraded+remapped+peered, 1 remapped+peering, 12 
active+clean+scrubbing+deep, 1 active+undersized, 442 active+clean, 1 
active+recovering+undersized+remapped

34 GiB data, 175 GiB used, 6.2 TiB / 6.4 TiB avail; 1.7 KiB/s rd, 1 op/s; 
31/39768 objects degraded (0.078%); 6/39768 objects misplaced (0.015%); 1/13256 
objects unfound (0.008%)

ceph osd stat
7 osds: 7 up (since 20h), 7 in (since 20h); epoch: e427538; 4 remapped pgs

Does anyone have an idea of where to start to get back to a healthy cluster?

Thanks !

Vivien


_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
