Yes, that is the log from the primary OSD. I extracted it with
grep -e scrub -e repair -e 19.1fff /var/log/ceph/ceph-osd.338.log
and copied only the relevant lines.

Yes, according to that case I should just run a deep-scrub and see. If this
error really was cleared by an aborted repair, I guess that would be a new
bug? I will run a deep-scrub and report back.
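(For the record, the re-check I have in mind is just something like

    ceph pg deep-scrub 19.1fff      # ask the primary to deep-scrub the PG
    ceph health detail              # look for OSD_SCRUB_ERRORS afterwards

and then grepping the primary's log again as above once the scrub finishes.)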

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Dan van der Ster <dvand...@gmail.com>
Sent: 08 October 2022 11:18:37
To: Frank Schilder
Cc: Ceph Users
Subject: Re: [ceph-users] recurring stat mismatch on PG

Is that the log from the primary OSD?

About the restart, you should probably just deep-scrub again to see the current 
state.


.. Dan



On Sat, Oct 8, 2022, 11:14 Frank Schilder <fr...@dtu.dk> wrote:
Hi Dan,

yes, 15.2.17. I remember that case and was expecting it to be fixed. Here is
a relevant extract from the log:

2022-10-08T10:06:22.206+0200 7fa3c48c7700  0 log_channel(cluster) log [DBG] : 
19.1fff deep-scrub starts
2022-10-08T10:22:33.049+0200 7fa3c48c7700 -1 log_channel(cluster) log [ERR] : 
19.1fffs0 deep-scrub : stat mismatch, got 64532/64531 objects, 1243/1243 
clones, 64532/64531 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 1215/1215 
whiteouts, 170978253582/170974059278 bytes, 0/0 manifest objects, 0/0 
hit_set_archive bytes.
2022-10-08T10:22:33.049+0200 7fa3c48c7700 -1 log_channel(cluster) log [ERR] : 
19.1fff deep-scrub 1 errors
2022-10-08T10:38:20.618+0200 7fa3c48c7700  0 log_channel(cluster) log [DBG] : 
19.1fff repair starts
2022-10-08T10:54:25.801+0200 7fa3c48c7700 -1 log_channel(cluster) log [ERR] : 
19.1fffs0 repair : stat mismatch, got 64532/64531 objects, 1243/1243 clones, 
64532/64531 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 1215/1215 
whiteouts, 170978253582/170974059278 bytes, 0/0 manifest objects, 0/0 
hit_set_archive bytes.
2022-10-08T10:54:25.802+0200 7fa3c48c7700 -1 log_channel(cluster) log [ERR] : 
19.1fff repair 1 errors, 1 fixed

I just completed a repair and it's gone for now. As an alternative
explanation: we had this scrub error, I started a repair, but then OSDs in
that PG were shut down and restarted. Is it possible that the repair was
cancelled and the error cleared erroneously?
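(If it helps with judging that theory, I could compare the PG's scrub
timestamps with the time of the OSD restarts, e.g. something like

    ceph pg 19.1fff query | grep -E 'last_(deep_)?scrub_stamp'

assuming the query output in octopus still carries those fields.)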

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Dan van der Ster <dvand...@gmail.com>
Sent: 08 October 2022 11:03:05
To: Frank Schilder
Cc: Ceph Users
Subject: Re: [ceph-users] recurring stat mismatch on PG

Hi,

Is that 15.2.17? It reminds me of this bug - 
https://tracker.ceph.com/issues/52705 - where an object with a particular name 
would hash to ffffffff and cause a stat mismatch during scrub. But 15.2.17 
should have the fix for that.
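If you want to test that theory, one quick sanity check is to verify that a
suspicious object name really hashes into that PG, e.g. something like

    ceph osd map <pool-name> <object-name>

which prints the PG and acting set the name maps to (pool and object name
are placeholders to fill in on your side, of course).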


Can you find the relevant osd log for more info?

.. Dan



On Sat, Oct 8, 2022, 10:42 Frank Schilder <fr...@dtu.dk> wrote:
Hi all,

I seem to be observing something strange on an octopus (latest) cluster. We
have a PG with a stat mismatch:

2022-10-08T10:06:22.206+0200 7fa3c48c7700  0 log_channel(cluster) log [DBG] : 
19.1fff deep-scrub starts
2022-10-08T10:22:33.049+0200 7fa3c48c7700 -1 log_channel(cluster) log [ERR] : 
19.1fffs0 deep-scrub : stat mismatch, got 64532/64531 objects, 1243/1243 
clones, 64532/64531 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 1215/1215 
whiteouts, 170978253582/170974059278 bytes, 0/0 manifest objects, 0/0 
hit_set_archive bytes.
2022-10-08T10:22:33.049+0200 7fa3c48c7700 -1 log_channel(cluster) log [ERR] : 
19.1fff deep-scrub 1 errors

This exact same mismatch was found before and I ran a pg repair that fixed
it. Now it's back. Does anyone have an idea why this might be happening and
how to deal with it?
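For reference, what I did the first time around was essentially

    ceph health detail        # showed the PG with the scrub error
    ceph pg repair 19.1fff

and the error stayed away until the most recent deep-scrub found it again.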

Thanks!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io