I've tried a few more things to get a deep-scrub going on my PG. I tried instructing the involved osds to scrub all their PGs and it looks like that didn't do it.
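For reference, the attempt described above amounts to roughly the following. This is a minimal sketch, not the exact commands that were run: the OSD ids `0 1 2` are hypothetical placeholders for the PG's acting set, and the `run` and `scrub_plan` helpers are illustrative names. It prints the commands by default and only executes them when `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only. The OSD ids used below are hypothetical placeholders; substitute
# the acting set that `ceph health detail` reports for the PG in question.

# Print each command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# scrub_plan PGID OSD [OSD ...]
scrub_plan() {
    pgid=$1
    shift
    # Ask each listed OSD to deep-scrub the PGs it holds.
    for osd in "$@"; do
        run ceph osd deep-scrub "$osd"
    done
    # Ask for a deep-scrub of the single PG, then check whether scrub
    # results have become available for it.
    run ceph pg deep-scrub "$pgid"
    run rados list-inconsistent-obj "$pgid" --format=json-pretty
}

scrub_plan 49.11c 0 1 2
```

If the deep-scrub actually runs, `ceph pg 49.11c query` should show a newer `last_deep_scrub_stamp`; if the stamp does not move, the request was accepted but never scheduled.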
Do you have any documentation on the object-store-tool? What I've found online talks about filestore and not bluestore.

On 6 April 2018 at 09:27, David Turner <drakonst...@gmail.com> wrote:

> I'm running into this exact same situation. I'm running 12.2.2 and I have
> an EC PG with a scrub error. It has the same output for [1] rados
> list-inconsistent-obj as mentioned before. This is the [2] full health
> detail. This is the [3] excerpt from the log from the deep-scrub that
> marked the PG inconsistent. The scrub happened when the PG was starting up
> after using ceph-objectstore-tool to split its filestore subfolders. This
> is using a script that I've used for months without any side effects.
>
> I have tried quite a few things to get this PG to deep-scrub or repair,
> but to no avail. It will not do anything. I have set every osd's
> osd_max_scrubs to 0 in the cluster, waited for all scrubbing and deep
> scrubbing to finish, then increased the 11 OSDs for this PG to 1 before
> issuing a deep-scrub. And it will sit there for over an hour without
> deep-scrubbing. My current testing of this is to set all osds to 1,
> increase all of the osds for this PG to 4, and then issue the repair... but
> similarly nothing happens. Each time I issue the deep-scrub or repair, the
> output correctly says 'instructing pg 145.2e3 on osd.234 to repair', but
> nothing shows up in the log for the OSD and the PG state stays
> 'active+clean+inconsistent'.
>
> My next step, unless anyone has a better idea, is to find the exact copy
> of the PG with the missing object, use object-store-tool to back up that
> copy of the PG and remove it. Then starting the OSD back up should
> backfill the full copy of the PG and be healthy again.
>
> [1] $ rados list-inconsistent-obj 145.2e3
> No scrub information available for pg 145.2e3
> error 2: (2) No such file or directory
>
> [2] $ ceph health detail
> HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
> OSD_SCRUB_ERRORS 1 scrub errors
> PG_DAMAGED Possible data damage: 1 pg inconsistent
>     pg 145.2e3 is active+clean+inconsistent, acting
> [234,132,33,331,278,217,55,358,79,3,24]
>
> [3] 2018-04-04 15:24:53.603380 7f54d1820700 0 log_channel(cluster) log
> [DBG] : 145.2e3 deep-scrub starts
> 2018-04-04 17:32:37.916853 7f54d1820700 -1 log_channel(cluster) log [ERR]
> : 145.2e3s0 deep-scrub 1 missing, 0 inconsistent objects
> 2018-04-04 17:32:37.916865 7f54d1820700 -1 log_channel(cluster) log [ERR]
> : 145.2e3 deep-scrub 1 errors
>
> On Mon, Apr 2, 2018 at 4:51 PM Michael Sudnick <michael.sudn...@gmail.com>
> wrote:
>
>> Hi Kjetil,
>>
>> I've tried to get the pg scrubbing/deep scrubbing and nothing seems to be
>> happening. I've tried it a few times over the last few days. My cluster is
>> recovering from a failed disk (which was probably the reason for the
>> inconsistency), do I need to wait for the cluster to heal before
>> repair/deep scrub works?
>>
>> -Michael
>>
>> On 2 April 2018 at 14:13, Kjetil Joergensen <kje...@medallia.com> wrote:
>>
>>> Hi,
>>>
>>> scrub or deep-scrub the pg, that should in theory get you back to
>>> list-inconsistent-obj spitting out what's wrong, then mail that info to the
>>> list.
>>>
>>> -KJ
>>>
>>> On Sun, Apr 1, 2018 at 9:17 AM, Michael Sudnick <
>>> michael.sudn...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I have a small cluster with an inconsistent pg. I've tried ceph pg
>>>> repair multiple times to no luck. rados list-inconsistent-obj 49.11c
>>>> returns:
>>>>
>>>> # rados list-inconsistent-obj 49.11c
>>>> No scrub information available for pg 49.11c
>>>> error 2: (2) No such file or directory
>>>>
>>>> I'm a bit at a loss here as to what to do to recover. That pg is part of a
>>>> cephfs_data pool with compression set to force/snappy.
>>>>
>>>> Does anyone have any suggestions?
>>>>
>>>> -Michael
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>
>>> --
>>> Kjetil Joergensen <kje...@medallia.com>
>>> SRE, Medallia Inc
>>>