Is this the only PG that is undersized + degraded, or could this be the tail end of a recovery event, e.g. a small percentage of degraded objects?

ceph status output would help as well. There is also a quick loop at the bottom of this mail, below the quoted thread, for comparing the OSDs in that PG's up/acting sets.
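A minimal first-pass sketch (untested; the PG id 18.773 is taken from the dump_stuck output quoted below). To see whether 18.773 is the only degraded/undersized PG or just the tail end of a wider recovery:

    ceph status
    ceph health detail
    ceph pg ls degraded

and for per-PG detail on the slow one (this can itself hang if the primary is struggling):

    ceph pg 18.773 query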
On Thu, 2 Oct 2025 at 17:53, Sa Pham <[email protected]> wrote:
> Hi Anthony,
>
> I don't see any errors in dmesg or in the SMART data for those OSDs.
>
> Regards
>
> On Fri, 3 Oct 2025 at 07:08 Anthony D'Atri <[email protected]> wrote:
>
> > Check dmesg / /var/log/messages on the hosts with that PG’s OSDs for
> > errors, and run a `smartctl -a` on each of those OSDs’ drives.
> >
> > > On Oct 2, 2025, at 7:55 PM, Sa Pham <[email protected]> wrote:
> > >
> > > Hi Joshua,
> > >
> > > No, the OSD is still responding:
> > >
> > > # ceph tell osd.130 version
> > > {
> > >     "version": "18.2.7-0-g6b0e988052e",
> > >     "release": "reef",
> > >     "release_type": "stable"
> > > }
> > >
> > > But the primary OSD that holds the slow PG (18.773) responds more slowly.
> > >
> > > Details below:
> > >
> > > # ceph pg dump_stuck degraded
> > > PG_STAT  STATE                                             UP             UP_PRIMARY  ACTING     ACTING_PRIMARY
> > > 18.773   active+undersized+degraded+remapped+backfilling  [302,150,138]  302         [130,101]  130
> > >
> > > # time ceph tell osd.130 version
> > > {
> > >     "version": "18.2.7-0-g6b0e988052e",
> > >     "release": "reef",
> > >     "release_type": "stable"
> > > }
> > >
> > > real  0m2.113s
> > > user  0m0.148s
> > > sys   0m0.036s
> > >
> > > # time ceph tell osd.101 version
> > > {
> > >     "version": "18.2.7-0-g6b0e988052e",
> > >     "release": "reef",
> > >     "release_type": "stable"
> > > }
> > >
> > > real  0m0.192s
> > > user  0m0.152s
> > > sys   0m0.037s
> > >
> > > I don't know why.
> > >
> > > Regards,
> > >
> > > On Fri, Oct 3, 2025 at 4:21 AM Joshua Blanch <[email protected]> wrote:
> > >
> > >> Could it be an OSD not responding?
> > >>
> > >> I would usually run
> > >>
> > >> ceph tell osd.* version
> > >>
> > >> to test whether the OSDs can be reached.
> > >>
> > >> On Thu, Oct 2, 2025 at 12:09 PM Sa Pham <[email protected]> wrote:
> > >>
> > >>> Hello everyone,
> > >>>
> > >>> I'm running a Ceph cluster used as an RGW backend, and I'm facing an issue
> > >>> with one particular placement group (PG):
> > >>>
> > >>> - Accessing objects from this PG is extremely slow.
> > >>> - Even running ceph pg <pg_id> takes a very long time.
> > >>> - The PG is currently stuck in a degraded state, so I'm unable to move
> > >>>   it to other OSDs.
> > >>>
> > >>> The current Ceph version is reef 18.2.7.
> > >>>
> > >>> Has anyone encountered a similar issue before, or have any suggestions on
> > >>> how to troubleshoot and resolve it?
> > >>>
> > >>> Thanks in advance!
> > >
> > > --
> > > Sa Pham Dang
> > > Skype: great_bn
> > > Phone/Telegram: 0986.849.582
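Here is the loop mentioned above (a rough, untested sketch). The OSD ids are taken from the dump_stuck output (up set 302,150,138; acting set 130,101), and I believe dump_ops_in_flight works through `ceph tell` on reef, but adjust if your build disagrees:

    for osd in 130 101 302 150 138; do
        echo "== osd.$osd =="
        ceph osd find $osd | grep -m1 '"host"'
        # compare response latency across the set (osd.130 was the slow one)
        time ceph tell osd.$osd version > /dev/null
        # any ops currently stuck on this OSD?
        ceph tell osd.$osd dump_ops_in_flight | grep -c '"description"'
    done

    # map the acting primary back to its host and devices, then check
    # dmesg / smartctl there as Anthony suggested
    ceph osd metadata 130 | grep -E '"hostname"|"devices"'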
