Hi Frédéric,
Thank you for the suggestion. I started `ceph pg repair {pgid}` on the
inconsistent PGs, but so far there is no visible effect. Is it possible to
monitor the progress of the repairs? I can't see it with `ceph progress`, and
for some reason `ceph -w` is hanging.
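For reference, I kicked the repairs off with a small loop along these lines (my
own quick sketch; the awk filter assumes the usual "pg {pgid} is
...+inconsistent" lines in the `ceph health detail` output and may need
adjusting):

    # repair every PG currently flagged as inconsistent
    for pgid in $(ceph health detail | awk '$1 == "pg" && /inconsistent/ {print $2}'); do
        ceph pg repair "$pgid"
    done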
Kind regards,
Gustavo
________________________________
From: Frédéric Nass <[email protected]>
Sent: Friday, February 28, 2025 11:19 AM
To: Gustavo Garcia Rondina <[email protected]>
Cc: ceph-users <[email protected]>
Subject: Re: [ceph-users] Replace OSD while cluster is recovering?
Hi Gustavo,
In your situation, I would run a 'ceph pg repair {pgid}' on each of the
inconsistent PGs reported by 'ceph health detail', so that they get back to
active+clean as soon as possible.
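If you want to see what the inconsistencies actually are before (or while)
repairing, the rados tool can list them; a sketch, with {pool-name} and {pgid}
as placeholders (it needs a recent deep-scrub of the PG to have data to show):

    # list inconsistent PGs in a pool, then inspect one of them
    rados list-inconsistent-pg {pool-name}
    rados list-inconsistent-obj {pgid} --format=json-pretty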
I would also leave scrubbing enabled and set osd_scrub_auto_repair to true with
'ceph config set osd osd_scrub_auto_repair true', so that inconsistent PGs are
repaired automatically at scrub time.
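You can verify the setting afterwards, and, if I remember correctly, auto
repair only kicks in when a scrub finds fewer errors than
osd_scrub_auto_repair_num_errors (5 by default), so that threshold is worth
checking too:

    # confirm the setting and the auto-repair error threshold
    ceph config get osd osd_scrub_auto_repair
    ceph config get osd osd_scrub_auto_repair_num_errors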
Regards,
Frédéric.
----- On Feb 28, 2025, at 16:56, Gustavo Garcia Rondina [email protected]
wrote:
> Hello list,
>
> We have a Ceph cluster (17.2.6 quincy) with 2 admin nodes and 6 storage nodes,
> each storage node connected to a JBOD enclosure. Each enclosure houses 28 HDDs
> of 18 TB each, for a total of 168 OSDs. The pool that holds the majority of
> the data is erasure-coded (4+2). We recently had a disk failure, which brought
> one OSD down:
>
> # ceph osd tree | grep down
> 2 hdd 16.49579 osd.2 down 0 1.00000
>
> This OSD is out of the cluster, but we haven't replaced the disk physically
> yet. The problem is that the cluster was not in the best shape when this OSD
> failed. Currently we have the following:
>
> ################################################
>   cluster:
>     id:     <redacted>
>     health: HEALTH_ERR
>             1026 scrub errors
>             Possible data damage: 18 pgs inconsistent
>             2137 pgs not deep-scrubbed in time
>             2137 pgs not scrubbed in time
>
>   services:
>     mon: 5 daemons, quorum xyz-admin1,xyz-admin2,xyz-osd1,xyz-osd2,xyz-osd3 (age 17M)
>     mgr: xyz-admin2.sipadf(active, since 17M), standbys: xyz-admin1.nwaovh
>     mds: 2/2 daemons up, 2 standby
>     osd: 168 osds: 167 up (since 44h), 167 in (since 6w); 220 remapped pgs
>
>   data:
>     volumes: 2/2 healthy
>     pools:   9 pools, 2137 pgs
>     objects: 448.54M objects, 1.0 PiB
>     usage:   1.6 PiB used, 1.1 PiB / 2.7 PiB avail
>     pgs:     134404830/2676514497 objects misplaced (5.022%)
>              1902 active+clean
>              191  active+remapped+backfilling
>              26   active+remapped+backfill_wait
>              15   active+clean+inconsistent
>              2    active+remapped+inconsistent+backfilling
>              1    active+remapped+inconsistent+backfill_wait
>
>   io:
>     recovery: 597 MiB/s, 252 objects/s
>
>   progress:
>     Global Recovery Event (6w)
>       [=========================...] (remaining: 5d)
> ################################################
>
> I have noticed the number of active+clean PGs increasing (it was ~1750 two
> days ago) and the number of misplaced objects very slowly decreasing. My
> question is: should I wait until recovery is complete, then repair the 18
> damaged PGs, and only then replace the disk? My thinking is that replacing the
> disk will trigger more backfilling, which will slow down the recovery even
> more.
>
> Another question: should I disable scrubbing until the recovery has finished?
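> (I assume that would be done with the cluster-wide flags, e.g.:
>
>     ceph osd set noscrub
>     ceph osd set nodeep-scrub
>
> to be unset again with the corresponding 'ceph osd unset' commands once
> recovery is done, but please correct me if there is a better way.)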
>
> Thank you for any insights you may be able to provide!
> -
> Gustavo
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]