[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-11 Thread Wesley Dillingham
Just to be clear, you should remove the OSD by stopping the daemon and marking it out before you repair the PG. The PG may not be repairable until you remove the bad disk.
1 - identify the bad disk (via scrubs or SMART/dmesg inspection)
2 - stop daemon and mark it out
3 - wait for PG to

[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-11 Thread Wesley Dillingham
If I recall correctly, when the acting or up_set of a PG changes, the scrub information is lost. It was likely lost when you stopped osd.238 and changed the sets. Based on your initial post, I do not believe you need to be using the objectstore tool currently. Inconsistent PGs are a common

[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-11 Thread Siddhit Renake
Hello Wes,
Thank you for your response.
brc1admin:~ # rados list-inconsistent-obj 15.f4f
No scrub information available for pg 15.f4f
brc1admin:~ # ceph osd ok-to-stop osd.238
OSD(s) 238 are ok to stop without reducing availability or risking data, provided there are no other concurrent
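When scrub information is available, "rados list-inconsistent-obj" can emit JSON (for example with --format=json), and the per-shard error lists point at the suspect OSD. The following is a minimal sketch of reading that report, assuming the usual layout of an "inconsistents" list whose entries carry a "shards" list with "osd" and "errors" fields. The function name and the sample report are hypothetical; the sample is not output from the cluster in this thread.

```python
import json
from typing import Dict, Set


def osds_with_shard_errors(report_json: str) -> Dict[int, Set[str]]:
    """Map each OSD id to the set of shard errors reported against it."""
    report = json.loads(report_json)
    suspects: Dict[int, Set[str]] = {}
    for obj in report.get("inconsistents", []):
        for shard in obj.get("shards", []):
            errors = shard.get("errors", [])
            if errors:  # shards with an empty error list are considered healthy
                suspects.setdefault(shard["osd"], set()).update(errors)
    return suspects


# Hypothetical sample report, shaped like the assumed JSON layout.
SAMPLE = json.dumps({
    "epoch": 100,
    "inconsistents": [
        {
            "object": {"name": "example-object"},
            "errors": [],
            "union_shard_errors": ["read_error"],
            "shards": [
                {"osd": 1, "errors": []},
                {"osd": 2, "errors": ["read_error"]},
                {"osd": 3, "errors": []},
            ],
        }
    ],
})

if __name__ == "__main__":
    for osd, errors in osds_with_shard_errors(SAMPLE).items():
        print(f"osd.{osd}: {', '.join(sorted(errors))}")
```

If the command prints "No scrub information available" as it does above, there is no JSON to parse and this sketch does not apply until scrub data exists again.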

[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-10 Thread Wesley Dillingham
In case it's not obvious, I forgot a space: "rados list-inconsistent-obj 15.f4f"

[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-10 Thread Wesley Dillingham
You likely have a failing disk. What does "rados list-inconsistent-obj15.f4f" return? It should identify the failing OSD. Assuming "ceph osd ok-to-stop" returns in the affirmative for that OSD, you likely need to stop the associated OSD daemon, then mark it out with "ceph osd out" and wait for it to
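The sequence described here can be written down as a dry-run plan that only prints commands and runs nothing. This is a sketch under assumptions: the function name is hypothetical, the "systemctl stop ceph-osd@ID" unit name assumes a non-containerized, systemd-managed OSD (a cephadm deployment would differ), and "ceph pg repair" is taken as the repair command. The wait between marking the OSD out and repairing is left as a manual step, because the condition to wait for is cut off in the message.

```python
import shlex
from typing import List


def removal_plan(osd_id: int, pg_id: str) -> List[List[str]]:
    """Return the commands to review, in order, for removing a bad OSD."""
    return [
        # Confirm that stopping the OSD will not reduce availability.
        ["ceph", "osd", "ok-to-stop", f"osd.{osd_id}"],
        # Assumption: systemd-managed, non-containerized OSD daemon.
        ["systemctl", "stop", f"ceph-osd@{osd_id}"],
        # Mark the OSD out so data moves off it.
        ["ceph", "osd", "out", str(osd_id)],
        # Manual wait goes here before repairing (not automated in this sketch).
        ["ceph", "pg", "repair", pg_id],
    ]


if __name__ == "__main__":
    # Dry run only: print each command instead of executing it.
    for cmd in removal_plan(238, "15.f4f"):
        print(shlex.join(cmd))
```

Printing rather than executing keeps an operator in the loop at each step, which matters here since the ok-to-stop result decides whether to continue at all.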