[ceph-users] Re: pg repair doesn't fix "got incorrect hash on read" / "candidate had an ec hash mismatch"

2024-03-06 Thread Kai Stian Olstad

Hi Eugen, thank you for the reply.

The OSD was drained over the weekend, so OSD 223 and 269 have only the 
problematic PG 404.bc.


I don't think moving the PG would help since I don't have any empty OSD 
to move it to, and a move would not fix the hash mismatch.
The reason I just want to have the problematic PG on the OSDs is to 
reduce recovery time.
I would need to set min_size to 4 on the EC 4+2 pool, and stop them both at 
the same time to force a rebuild of the corrupted parts of the PG that are on 
osd 223 and 269, since repair doesn't fix it.
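
Roughly, the commands for that plan would look something like this (a sketch 
only; the systemd unit names are an assumption and depend on how the OSDs are 
deployed, e.g. cephadm uses ceph-<fsid>@osd.<id>):

   ceph osd pool set default.rgw.buckets.data min_size 4
   # on the hosts carrying the two OSDs (plain ceph-osd systemd units assumed)
   systemctl stop ceph-osd@223
   systemctl stop ceph-osd@269
   # watch the down shards get rebuilt from the four healthy ones
   ceph -w
   # restore min_size afterwards (k+1 = 5 is the usual value for a 4+2 profile)
   ceph osd pool set default.rgw.buckets.data min_size 5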


I'm debating with myself if I should
1. Stop both OSD 223 and 269,
2. Just one of them.

Stopping them both, I'm guaranteed that the part of the PG on 223 and 269 is 
rebuilt from the 4 others, 297, 276, 136 and 197, which don't have any errors.
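
To double-check which OSDs hold which shards before and after, the standard 
commands should be enough (illustration only, output abbreviated):

   ceph pg map 404.bc
   # prints the up and acting sets, e.g. acting [223,297,269,276,136,197]
   ceph pg 404.bc query | grep -A8 '"acting"'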


OSD 223 is the primary in the EC, pg 404.bc acting [223,297,269,276,136,197].
So maybe just stop that one, wait for recovery and then run deep-scrub to 
check if things look better.

But would it then use the corrupted data on osd 269 for the rebuild?


-
Kai Stian Olstad



On 26.02.2024 10:19, Eugen Block wrote:

Hi,

I think your approach makes sense. But I'm wondering if moving only  
the problematic PGs to different OSDs could have an effect as well. I  
assume that moving the 2 PGs is much quicker than moving all BUT those  
2 PGs. If that doesn't work you could still fall back to draining the  
entire OSDs (except for the problematic PG).


Regards,
Eugen

Quote from Kai Stian Olstad:


Hi,

Does no one have any comment at all?
I'm not picky, so any speculation or guess, any "I would", "I wouldn't" or 
"should work", would be highly appreciated.



Since 4 out of 6 shards in the EC 4+2 are OK and ceph pg repair doesn't solve 
it, I think the following might work.


pg 404.bc acting [223,297,269,276,136,197]

- Use pgremapper to move all PGs on OSD 223 and 269, except 404.bc, to other OSDs.
- Set min_size to 4: ceph osd pool set default.rgw.buckets.data min_size 4

- Stop osd 223 and 269

What I hope will happen is that Ceph then recreates the 404.bc shards 
s0 (osd.223) and s2 (osd.269), since they are now down, from the 
remaining shards

s1(osd.297), s3(osd.276), s4(osd.136) and s5(osd.197)


_Any_ comment is highly appreciated.

-
Kai Stian Olstad


On 21.02.2024 13:27, Kai Stian Olstad wrote:

Hi,

Short summary

PG 404.bc is an EC 4+2 where s0 and s2 report a hash mismatch for 698 objects.
Ceph pg repair doesn't fix it: if you run deep-scrub on the PG after the 
repair has finished, it still reports scrub errors.


Why can't ceph pg repair fix this? With 4 of the 6 shards intact it should be 
able to reconstruct the corrupted ones.
Is there a way to fix this? Like deleting the s0 and s2 shard copies so it's 
forced to recreate them?



Long detailed summary

A short backstory.
* This is the aftermath of the mclock problems described in the post "17.2.7: 
Backfilling deadlock / stall / stuck / standstill" [1].

 - 4 OSDs had a few bad sectors; all 4 were set out and the cluster stopped.
 - The solution was to switch from mclock to wpq and restart all OSDs.
 - When all backfilling was finished, all 4 OSDs were replaced.
 - osd.223 and osd.269 were 2 of the 4 OSDs that were replaced.


PG / pool 404 is EC 4+2 default.rgw.buckets.data

9 days after osd.223 and osd.269 were replaced, a deep-scrub was run 
and reported errors

   ceph status
   ---
   HEALTH_ERR 1396 scrub errors; Possible data damage: 1 pg 
inconsistent

   [ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
   [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
   pg 404.bc is active+clean+inconsistent, acting  
[223,297,269,276,136,197]


I then ran repair
   ceph pg repair 404.bc

And ceph status showed this
   ceph status
   ---
   HEALTH_WARN Too many repaired reads on 2 OSDs
   [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
   osd.223 had 698 reads repaired
   osd.269 had 698 reads repaired

But osd.223 and osd.269 are new disks, and the disks have no SMART errors or 
any I/O errors in the OS logs.

So I tried to run deep-scrub again on the PG.
   ceph pg deep-scrub 404.bc

And got this result.

   ceph status
   ---
   HEALTH_ERR 1396 scrub errors; Too many repaired reads on 2 OSDs;  
Possible data damage: 1 pg inconsistent

   [ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
   [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
   osd.223 had 698 reads repaired
   osd.269 had 698 reads repaired
   [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
   pg 404.bc is  active+clean+scrubbing+deep+inconsistent+repair, 
acting  [223,297,269,276,136,197]


698 + 698 = 1396 so the same amount of errors.

Run repair again on 404.bc and ceph status is

   HEALTH_WARN Too many repaired reads on 2 OSDs
   [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
   osd.223 had 1396 reads repaired
   osd.269 had 1396 reads repaired

So even when the repair finishes it doesn't fix the problem, since the errors 
reappear after a deep-scrub.


The log for osd.223 and osd.269 contain "got 

[ceph-users] Re: pg repair doesn't fix "got incorrect hash on read" / "candidate had an ec hash mismatch"

2024-02-28 Thread Eugen Block

Hi,


I'm debating with myself if I should
1. Stop both OSD 223 and 269,
2. Just one of them.


I understand your struggle. I think I would stop them both, just to rule out 
replicating corrupted data.


Quote from Kai Stian Olstad:


Hi Eugen, thank you for the reply.

The OSD was drained over the weekend, so OSD 223 and 269 have only  
the problematic PG 404.bc.


I don't think moving the PG would help since I don't have any empty  
OSD to move it to, and a move would not fix the hash mismatch.
The reason I just want to have the problematic PG on the OSDs is to  
reduce recovery time.
I would need to set min_size to 4 on the EC 4+2 pool, and stop them both 
at the same time to force a rebuild of the corrupted parts of the PG that 
are on osd 223 and 269, since repair doesn't fix it.


I'm debating with myself if I should
1. Stop both OSD 223 and 269,
2. Just one of them.

Stopping them both, I'm guaranteed that the part of the PG on 223 and 269 
is rebuilt from the 4 others, 297, 276, 136 and 197, which don't have any 
errors.


OSD 223 is the primary in the EC, pg 404.bc acting [223,297,269,276,136,197].
So maybe just stop that one, wait for recovery and then run deep-scrub to 
check if things look better.

But would it then use the corrupted data on osd 269 for the rebuild?


-
Kai Stian Olstad



On 26.02.2024 10:19, Eugen Block wrote:

Hi,

I think your approach makes sense. But I'm wondering if moving only the 
problematic PGs to different OSDs could have an effect as well. I assume 
that moving the 2 PGs is much quicker than moving all BUT those 2 PGs. If 
that doesn't work you could still fall back to draining the entire OSDs 
(except for the problematic PG).


Regards,
Eugen

Quote from Kai Stian Olstad:


Hi,

Does no one have any comment at all?
I'm not picky, so any speculation or guess, any "I would", "I wouldn't" or 
"should work", would be highly appreciated.



Since 4 out of 6 shards in the EC 4+2 are OK and ceph pg repair doesn't 
solve it, I think the following might work.


pg 404.bc acting [223,297,269,276,136,197]

- Use pgremapper to move all PGs on OSD 223 and 269, except 404.bc, to other OSDs.
- Set min_size to 4: ceph osd pool set default.rgw.buckets.data min_size 4
- Stop osd 223 and 269

What I hope will happen is that Ceph then recreates the 404.bc shards 
s0 (osd.223) and s2 (osd.269), since they are now down, from the 
remaining shards

s1(osd.297), s3(osd.276), s4(osd.136) and s5(osd.197)


_Any_ comment is highly appreciated.

-
Kai Stian Olstad


On 21.02.2024 13:27, Kai Stian Olstad wrote:

Hi,

Short summary

PG 404.bc is an EC 4+2 where s0 and s2 report a hash mismatch for 698 objects.
Ceph pg repair doesn't fix it: if you run deep-scrub on the PG after the 
repair has finished, it still reports scrub errors.


Why can't ceph pg repair fix this? With 4 of the 6 shards intact it should 
be able to reconstruct the corrupted ones.
Is there a way to fix this? Like deleting the s0 and s2 shard copies so 
it's forced to recreate them?



Long detailed summary

A short backstory.
* This is the aftermath of the mclock problems described in the post "17.2.7: 
Backfilling deadlock / stall / stuck / standstill" [1].

- 4 OSDs had a few bad sectors; all 4 were set out and the cluster stopped.
- The solution was to switch from mclock to wpq and restart all OSDs.
- When all backfilling was finished, all 4 OSDs were replaced.
- osd.223 and osd.269 were 2 of the 4 OSDs that were replaced.


PG / pool 404 is EC 4+2 default.rgw.buckets.data

9 days after osd.223 and osd.269 were replaced, a deep-scrub was run 
and reported errors

  ceph status
  ---
  HEALTH_ERR 1396 scrub errors; Possible data damage: 1 pg inconsistent
  [ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
  [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
  pg 404.bc is active+clean+inconsistent, acting   
[223,297,269,276,136,197]


I then ran repair
  ceph pg repair 404.bc

And ceph status showed this
  ceph status
  ---
  HEALTH_WARN Too many repaired reads on 2 OSDs
  [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
  osd.223 had 698 reads repaired
  osd.269 had 698 reads repaired

But osd.223 and osd.269 are new disks, and the disks have no SMART errors 
or any I/O errors in the OS logs.

So I tried to run deep-scrub again on the PG.
  ceph pg deep-scrub 404.bc

And got this result.

  ceph status
  ---
  HEALTH_ERR 1396 scrub errors; Too many repaired reads on 2  
OSDs;  Possible data damage: 1 pg inconsistent

  [ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
  [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
  osd.223 had 698 reads repaired
  osd.269 had 698 reads repaired
  [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
  pg 404.bc is   
active+clean+scrubbing+deep+inconsistent+repair, acting   
[223,297,269,276,136,197]


698 + 698 = 1396 so the same amount of errors.

Run repair again on 404.bc and ceph status is

  HEALTH_WARN Too many repaired reads on 2 OSDs
  [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 

[ceph-users] Re: pg repair doesn't fix "got incorrect hash on read" / "candidate had an ec hash mismatch"

2024-02-27 Thread Kai Stian Olstad

Hi Eugen, thank you for the reply.

The OSD was drained over the weekend, so OSD 223 and 269 have only the 
problematic PG 404.bc.


I don't think moving the PG would help since I don't have any empty OSD 
to move it to, and a move would not fix the hash mismatch.
The reason I just want to have the problematic PG on the OSDs is to 
reduce recovery time.
I would need to set min_size to 4 on the EC 4+2 pool, and stop them both at 
the same time to force a rebuild of the corrupted parts of the PG that are on 
osd 223 and 269, since repair doesn't fix it.


I'm debating with myself if I should
1. Stop both OSD 223 and 269,
2. Just one of them.

Stopping them both, I'm guaranteed that the part of the PG on 223 and 269 is 
rebuilt from the 4 others, 297, 276, 136 and 197, which don't have any errors.


OSD 223 is the primary in the EC, pg 404.bc acting [223,297,269,276,136,197].
So maybe just stop that one, wait for recovery and then run deep-scrub to 
check if things look better.

But would it then use the corrupted data on osd 269 for the rebuild?


-
Kai Stian Olstad



On 26.02.2024 10:19, Eugen Block wrote:

Hi,

I think your approach makes sense. But I'm wondering if moving only  
the problematic PGs to different OSDs could have an effect as well. I  
assume that moving the 2 PGs is much quicker than moving all BUT those  
2 PGs. If that doesn't work you could still fall back to draining the  
entire OSDs (except for the problematic PG).


Regards,
Eugen

Quote from Kai Stian Olstad:


Hi,

Does no one have any comment at all?
I'm not picky, so any speculation or guess, any "I would", "I wouldn't" or 
"should work", would be highly appreciated.



Since 4 out of 6 shards in the EC 4+2 are OK and ceph pg repair doesn't solve 
it, I think the following might work.


pg 404.bc acting [223,297,269,276,136,197]

- Use pgremapper to move all PGs on OSD 223 and 269, except 404.bc, to other OSDs.
- Set min_size to 4: ceph osd pool set default.rgw.buckets.data min_size 4

- Stop osd 223 and 269

What I hope will happen is that Ceph then recreates the 404.bc shards 
s0 (osd.223) and s2 (osd.269), since they are now down, from the 
remaining shards

s1(osd.297), s3(osd.276), s4(osd.136) and s5(osd.197)


_Any_ comment is highly appreciated.

-
Kai Stian Olstad


On 21.02.2024 13:27, Kai Stian Olstad wrote:

Hi,

Short summary

PG 404.bc is an EC 4+2 where s0 and s2 report a hash mismatch for 698 objects.
Ceph pg repair doesn't fix it: if you run deep-scrub on the PG after the 
repair has finished, it still reports scrub errors.


Why can't ceph pg repair fix this? With 4 of the 6 shards intact it should be 
able to reconstruct the corrupted ones.
Is there a way to fix this? Like deleting the s0 and s2 shard copies so it's 
forced to recreate them?



Long detailed summary

A short backstory.
* This is the aftermath of the mclock problems described in the post "17.2.7: 
Backfilling deadlock / stall / stuck / standstill" [1].

 - 4 OSDs had a few bad sectors; all 4 were set out and the cluster stopped.
 - The solution was to switch from mclock to wpq and restart all OSDs.
 - When all backfilling was finished, all 4 OSDs were replaced.
 - osd.223 and osd.269 were 2 of the 4 OSDs that were replaced.


PG / pool 404 is EC 4+2 default.rgw.buckets.data

9 days after osd.223 and osd.269 were replaced, a deep-scrub was run 
and reported errors

   ceph status
   ---
   HEALTH_ERR 1396 scrub errors; Possible data damage: 1 pg 
inconsistent

   [ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
   [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
   pg 404.bc is active+clean+inconsistent, acting  
[223,297,269,276,136,197]


I then ran repair
   ceph pg repair 404.bc

And ceph status showed this
   ceph status
   ---
   HEALTH_WARN Too many repaired reads on 2 OSDs
   [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
   osd.223 had 698 reads repaired
   osd.269 had 698 reads repaired

But osd.223 and osd.269 are new disks, and the disks have no SMART errors or 
any I/O errors in the OS logs.

So I tried to run deep-scrub again on the PG.
   ceph pg deep-scrub 404.bc

And got this result.

   ceph status
   ---
   HEALTH_ERR 1396 scrub errors; Too many repaired reads on 2 OSDs;  
Possible data damage: 1 pg inconsistent

   [ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
   [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
   osd.223 had 698 reads repaired
   osd.269 had 698 reads repaired
   [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
   pg 404.bc is  active+clean+scrubbing+deep+inconsistent+repair, 
acting  [223,297,269,276,136,197]


698 + 698 = 1396 so the same amount of errors.

Run repair again on 404.bc and ceph status is

   HEALTH_WARN Too many repaired reads on 2 OSDs
   [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
   osd.223 had 1396 reads repaired
   osd.269 had 1396 reads repaired

So even when the repair finishes it doesn't fix the problem, since the errors 
reappear after a deep-scrub.


The log for osd.223 and osd.269 contain "got 

[ceph-users] Re: pg repair doesn't fix "got incorrect hash on read" / "candidate had an ec hash mismatch"

2024-02-26 Thread Eugen Block

Hi,

I think your approach makes sense. But I'm wondering if moving only  
the problematic PGs to different OSDs could have an effect as well. I  
assume that moving the 2 PGs is much quicker than moving all BUT those  
2 PGs. If that doesn't work you could still fall back to draining the  
entire OSDs (except for the problematic PG).


Regards,
Eugen

Quote from Kai Stian Olstad:


Hi,

Does no one have any comment at all?
I'm not picky, so any speculation or guess, any "I would", "I wouldn't" or 
"should work", would be highly appreciated.



Since 4 out of 6 shards in the EC 4+2 are OK and ceph pg repair doesn't solve 
it, I think the following might work.


pg 404.bc acting [223,297,269,276,136,197]

- Use pgremapper to move all PGs on OSD 223 and 269, except 404.bc, to other OSDs.
- Set min_size to 4: ceph osd pool set default.rgw.buckets.data min_size 4
- Stop osd 223 and 269

What I hope will happen is that Ceph then recreates the 404.bc shards 
s0 (osd.223) and s2 (osd.269), since they are now down, from the 
remaining shards

s1(osd.297), s3(osd.276), s4(osd.136) and s5(osd.197)


_Any_ comment is highly appreciated.

-
Kai Stian Olstad


On 21.02.2024 13:27, Kai Stian Olstad wrote:

Hi,

Short summary

PG 404.bc is an EC 4+2 where s0 and s2 report a hash mismatch for 698 objects.
Ceph pg repair doesn't fix it: if you run deep-scrub on the PG after the 
repair has finished, it still reports scrub errors.


Why can't ceph pg repair fix this? With 4 of the 6 shards intact it should be 
able to reconstruct the corrupted ones.
Is there a way to fix this? Like deleting the s0 and s2 shard copies so it's 
forced to recreate them?



Long detailed summary

A short backstory.
* This is the aftermath of the mclock problems described in the post "17.2.7: 
Backfilling deadlock / stall / stuck / standstill" [1].

 - 4 OSDs had a few bad sectors; all 4 were set out and the cluster stopped.
 - The solution was to switch from mclock to wpq and restart all OSDs.
 - When all backfilling was finished, all 4 OSDs were replaced.
 - osd.223 and osd.269 were 2 of the 4 OSDs that were replaced.


PG / pool 404 is EC 4+2 default.rgw.buckets.data

9 days after osd.223 and osd.269 were replaced, a deep-scrub was run 
and reported errors

   ceph status
   ---
   HEALTH_ERR 1396 scrub errors; Possible data damage: 1 pg inconsistent
   [ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
   [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
   pg 404.bc is active+clean+inconsistent, acting  
[223,297,269,276,136,197]


I then ran repair
   ceph pg repair 404.bc

And ceph status showed this
   ceph status
   ---
   HEALTH_WARN Too many repaired reads on 2 OSDs
   [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
   osd.223 had 698 reads repaired
   osd.269 had 698 reads repaired

But osd.223 and osd.269 are new disks, and the disks have no SMART errors or 
any I/O errors in the OS logs.

So I tried to run deep-scrub again on the PG.
   ceph pg deep-scrub 404.bc

And got this result.

   ceph status
   ---
   HEALTH_ERR 1396 scrub errors; Too many repaired reads on 2 OSDs;  
Possible data damage: 1 pg inconsistent

   [ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
   [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
   osd.223 had 698 reads repaired
   osd.269 had 698 reads repaired
   [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
   pg 404.bc is  
active+clean+scrubbing+deep+inconsistent+repair, acting  
[223,297,269,276,136,197]


698 + 698 = 1396 so the same amount of errors.

Run repair again on 404.bc and ceph status is

   HEALTH_WARN Too many repaired reads on 2 OSDs
   [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
   osd.223 had 1396 reads repaired
   osd.269 had 1396 reads repaired

So even when the repair finishes it doesn't fix the problem, since the errors 
reappear after a deep-scrub.


The logs for osd.223 and osd.269 contain "got incorrect hash on read" and 
"candidate had an ec hash mismatch" for 698 unique objects.
But I only show the log for 1 of the 698 objects; the log is the same for 
the other 697 objects.


   osd.223 log (only showing 1 of 698 object named  
2021-11-08T19%3a43%3a50,145489260+00%3a00)

   ---
   Feb 20 10:31:00 ceph-hd-003 ceph-osd[3665432]: osd.223 pg_epoch:  
231235 pg[404.bcs0( v 231235'1636919  
(231078'1632435,231235'1636919] local-lis/les=226263/226264  
n=296580 ec=36041/27862 lis/c=226263/226263 les/c/f=226264/230954/0  
sis=226263) [223,297,269,276,136,197]p223(0) r=0 lpr=226263  
crt=231235'1636919 lcod 231235'1636918 mlcod 231235'1636918  
active+clean+scrubbing+deep+inconsistent+repair [ 404.bcs0:   
REQ_SCRUB ]  MUST_REPAIR MUST_DEEP_SCRUB MUST_SCRUB planned  
REQ_SCRUB] _scan_list   
404:3d001f95:::1f244892-a2e7-406b-aa62-1b13511333a2.625411.3__multipart_2021-11-08T19%3a43%3a50,145489260+00%3a00.2~OoetD5vkh8fyh-2eeR7GF5rZK7d5EVa.1:head got incorrect hash on read 0xc5d1dd1b !=  expected  
0x7c2f86d7
   Feb 20 10:31:01 ceph-hd-003 ceph-osd[3665432]:  

[ceph-users] Re: pg repair doesn't fix "got incorrect hash on read" / "candidate had an ec hash mismatch"

2024-02-23 Thread Kai Stian Olstad

Hi,

Does no one have any comment at all?
I'm not picky, so any speculation or guess, any "I would", "I wouldn't" or 
"should work", would be highly appreciated.



Since 4 out of 6 shards in the EC 4+2 are OK and ceph pg repair doesn't solve 
it, I think the following might work.


pg 404.bc acting [223,297,269,276,136,197]

- Use pgremapper to move all PGs on OSD 223 and 269, except 404.bc, to other 
OSDs (see the upmap sketch below).
- Set min_size to 4: ceph osd pool set default.rgw.buckets.data min_size 4

- Stop osd 223 and 269

What I hope will happen is that Ceph then recreates the 404.bc shards 
s0 (osd.223) and s2 (osd.269), since they are now down, from the remaining 
shards

s1(osd.297), s3(osd.276), s4(osd.136) and s5(osd.197)
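
For reference, the same kind of per-PG move can also be expressed directly 
with upmaps, which is the mechanism pgremapper drives. This is only an 
illustration: PG 404.d1 and osd.300 are made-up examples, and the real PG 
list would come from ceph pg ls-by-osd.

   # list the PGs currently mapped to the OSD
   ceph pg ls-by-osd 223
   # pin one of those PGs (anything except 404.bc) to another OSD instead of 223
   ceph osd pg-upmap-items 404.d1 223 300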


_Any_ comment is highly appreciated.

-
Kai Stian Olstad


On 21.02.2024 13:27, Kai Stian Olstad wrote:

Hi,

Short summary

PG 404.bc is an EC 4+2 where s0 and s2 report a hash mismatch for 698 objects.
Ceph pg repair doesn't fix it: if you run deep-scrub on the PG after the 
repair has finished, it still reports scrub errors.


Why can't ceph pg repair fix this? With 4 of the 6 shards intact it should be 
able to reconstruct the corrupted ones.
Is there a way to fix this? Like deleting the s0 and s2 shard copies so it's 
forced to recreate them?
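
To see exactly which shards the scrub flagged, and with which errors, the 
inconsistency listing from the rados tool is probably the quickest way 
(standard tooling, output trimmed here):

   rados list-inconsistent-obj 404.bc --format=json-pretty
   # each object entry carries per-shard errors (e.g. "ec_hash_error"),
   # which in this case should point at shards 0 and 2, i.e. osd.223 and osd.269

   # As for deleting the bad shard copies: with one OSD stopped,
   # ceph-objectstore-tool can in principle remove an object from that single
   # shard so it gets rebuilt (last resort only, exact invocation is an
   # assumption, and do one OSD at a time so the data stays reconstructible):
   # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-223 \
   #   --pgid 404.bcs0 '<object from --op list>' remove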



Long detailed summary

A short backstory.
* This is the aftermath of the mclock problems described in the post "17.2.7: 
Backfilling deadlock / stall / stuck / standstill" [1].

  - 4 OSDs had a few bad sectors; all 4 were set out and the cluster stopped.
  - The solution was to switch from mclock to wpq and restart all OSDs.
  - When all backfilling was finished, all 4 OSDs were replaced.
  - osd.223 and osd.269 were 2 of the 4 OSDs that were replaced.


PG / pool 404 is EC 4+2 default.rgw.buckets.data

9 days after osd.223 and osd.269 were replaced, a deep-scrub was run and 
reported errors

ceph status
---
HEALTH_ERR 1396 scrub errors; Possible data damage: 1 pg 
inconsistent

[ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 404.bc is active+clean+inconsistent, acting 
[223,297,269,276,136,197]


I then ran repair
ceph pg repair 404.bc

And ceph status showed this
ceph status
---
HEALTH_WARN Too many repaired reads on 2 OSDs
[WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
osd.223 had 698 reads repaired
osd.269 had 698 reads repaired

But osd.223 and osd.269 are new disks, and the disks have no SMART errors 
or any I/O errors in the OS logs.

So I tried to run deep-scrub again on the PG.
ceph pg deep-scrub 404.bc

And got this result.

ceph status
---
HEALTH_ERR 1396 scrub errors; Too many repaired reads on 2 OSDs; 
Possible data damage: 1 pg inconsistent

[ERR] OSD_SCRUB_ERRORS: 1396 scrub errors
[WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
osd.223 had 698 reads repaired
osd.269 had 698 reads repaired
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 404.bc is active+clean+scrubbing+deep+inconsistent+repair, 
acting [223,297,269,276,136,197]


698 + 698 = 1396 so the same amount of errors.

Run repair again on 404.bc and ceph status is

HEALTH_WARN Too many repaired reads on 2 OSDs
[WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
osd.223 had 1396 reads repaired
osd.269 had 1396 reads repaired

So even when the repair finishes it doesn't fix the problem, since the errors 
reappear after a deep-scrub.


The logs for osd.223 and osd.269 contain "got incorrect hash on read" and 
"candidate had an ec hash mismatch" for 698 unique objects.
But I only show the log for 1 of the 698 objects; the log is the same for 
the other 697 objects.
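
(If the full list of affected objects is needed, it can be pulled out of the 
OSD log with something like the sketch below; the journalctl unit name is an 
assumption and depends on the deployment:

   journalctl -u ceph-osd@223 --since 2024-02-20 \
     | grep 'got incorrect hash on read' \
     | grep -o '404:[^ ]*:head' | sort -u | wc -l   # should come out at 698
)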


osd.223 log (only showing 1 of 698 object named 
2021-11-08T19%3a43%3a50,145489260+00%3a00)

---
Feb 20 10:31:00 ceph-hd-003 ceph-osd[3665432]: osd.223 pg_epoch: 
231235 pg[404.bcs0( v 231235'1636919 (231078'1632435,231235'1636919] 
local-lis/les=226263/226264 n=296580 ec=36041/27862 lis/c=226263/226263 
les/c/f=226264/230954/0 sis=226263) [223,297,269,276,136,197]p223(0) 
r=0 lpr=226263 crt=231235'1636919 lcod 231235'1636918 mlcod 
231235'1636918 active+clean+scrubbing+deep+inconsistent+repair [ 
404.bcs0:  REQ_SCRUB ]  MUST_REPAIR MUST_DEEP_SCRUB MUST_SCRUB planned 
REQ_SCRUB] _scan_list  
404:3d001f95:::1f244892-a2e7-406b-aa62-1b13511333a2.625411.3__multipart_2021-11-08T19%3a43%3a50,145489260+00%3a00.2~OoetD5vkh8fyh-2eeR7GF5rZK7d5EVa.1:head 
got incorrect hash on read 0xc5d1dd1b !=  expected 0x7c2f86d7
Feb 20 10:31:01 ceph-hd-003 ceph-osd[3665432]: log_channel(cluster) 
log [ERR] : 404.bc shard 223(0) soid 
404:3d001f95:::1f244892-a2e7-406b-aa62-1b13511333a2.625411.3__multipart_2021-11-08T19%3a43%3a50,145489260+00%3a00.2~OoetD5vkh8fyh-2eeR7GF5rZK7d5EVa.1:head 
: candidate had an ec hash mismatch
Feb 20 10:31:01 ceph-hd-003 ceph-osd[3665432]: log_channel(cluster) 
log [ERR] : 404.bc shard 269(2) soid