[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-13 Thread Frédéric Nass
Hello,

'ceph osd deep-scrub 5' deep-scrubs all PGs for which osd.5 is primary (and 
only those).

You can check that from ceph-osd.5.log by running:
for pg in $(grep 'deep-scrub starts' /var/log/ceph/*/ceph-osd.5.log | awk '{print $8}') ; do
    echo "pg: $pg, primary osd is osd.$(ceph pg $pg query -f json | jq '.info.stats.acting_primary')"
done
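
A quicker cross-check, if your release has it, is 'ceph pg ls-by-primary', which lists the PGs an OSD is acting primary for without parsing logs (a sketch; the jq path assumes the newer JSON layout that nests PG entries under .pg_stats):

# PG ids for which osd.5 is the acting primary
ceph pg ls-by-primary osd.5 -f json | jq -r '.pg_stats[].pgid'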

while

'ceph osd deep-scrub all' instructs all OSDs to start deep-scrubbing all the PGs 
they're primary for, so in the end, all of the cluster's PGs.

So if the data you overwrote on osd.5 with 'dd' was part of a PG for which 
osd.5 was not the primary OSD, then it wasn't deep-scrubbed.
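
If you want to cover every PG that has any replica on osd.5, not just the ones it leads, a sketch (assuming the newer 'ceph pg ls-by-osd' JSON layout that nests entries under .pg_stats) is to drive the deep-scrub per PG; 'ceph pg deep-scrub <pgid>' is always carried out by that PG's primary, wherever it lives:

# deep-scrub every PG that stores data on osd.5, regardless of which OSD is primary
for pg in $(ceph pg ls-by-osd osd.5 -f json | jq -r '.pg_stats[].pgid') ; do
    ceph pg deep-scrub $pg
done

These should still be throttled by osd_max_scrubs on each OSD, so they won't all run at once.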

man ceph 8 could rather say:

   Subcommand deep-scrub initiates deep scrub on all PGs osd <who> is 
primary for.

   Usage:

      ceph osd deep-scrub <who>

Regards,
Frédéric.

- On Jun 10, 2024, at 16:51, Petr Bena petr@bena.rocks wrote:

> Most likely it wasn't; the ceph help and documentation are not very clear about
> this:
> 
> osd deep-scrub <who>
> initiate deep scrub on osd <who>, or use <all|any> to deep scrub all
> 
> It doesn't say anything like "initiate deep scrub of primary PGs on osd".
> 
> I assumed it just runs a scrub of everything on the given OSD.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-10 Thread Anthony D'Atri
Scrubs are of PGs, not OSDs; the lead OSD for a PG orchestrates subops to the 
secondary OSDs. If you can point me to where this is in docs/src I'll clarify 
it; ideally, file a tracker ticket and send me a link.

Scrubbing all PGs on an OSD at once or even in sequence would be impactful.
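
For context, the per-OSD impact is bounded by a handful of throttles; a quick way to see what a cluster is running with (defaults vary by release):

ceph config get osd osd_max_scrubs             # concurrent scrubs allowed per OSD
ceph config get osd osd_scrub_sleep            # delay injected between scrub chunks
ceph config get osd osd_scrub_load_threshold   # skip starting new scrubs above this load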

> On Jun 10, 2024, at 10:51, Petr Bena  wrote:
> 
> Most likely it wasn't; the ceph help and documentation are not very clear about 
> this:
> 
> osd deep-scrub <who>
> initiate deep scrub on osd <who>, or use <all|any> to deep scrub all
> 
> It doesn't say anything like "initiate deep scrub of primary PGs on osd".
> 
> I assumed it just runs a scrub of everything on the given OSD.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-10 Thread Petr Bena
Most likely it wasn't; the ceph help and documentation are not very clear about 
this:

osd deep-scrub <who>
  initiate deep scrub on osd <who>, or use <all|any> to deep scrub all

It doesn't say anything like "initiate deep scrub of primary PGs on osd".

I assumed it just runs a scrub of everything on the given OSD.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-10 Thread Eugen Block
That would have been my next question: did you verify that the  
corrupted OSD was a primary? The default deep-scrub config scrubs all  
PGs within a week, so yes, it can take a week until it's detected. It  
could have been detected sooner if those objects had been in use by  
clients and needed to be updated (rewritten).
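
The weekly cadence comes from osd_deep_scrub_interval (604800 seconds, i.e. one week, by default); you can confirm what your cluster uses with:

ceph config get osd osd_deep_scrub_interval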


Quoting Petr Bena:


Hello,

No, I don't have osd_scrub_auto_repair enabled. Interestingly, about a  
week after I had forgotten about this, an error manifested:


[ERR] OSD_SCRUB_ERRORS: 1 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 4.1d is active+clean+inconsistent, acting [4,2]

which could be repaired as expected, since I damaged only 1 OSD.  
It's interesting that it took a whole week to find it. It seems that  
running deep-scrub on an entire OSD only runs it for the PGs where  
that OSD is the primary, so maybe that's why it wasn't detected when  
I ran it manually?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-10 Thread Petr Bena
Hello,

No, I don't have osd_scrub_auto_repair enabled. Interestingly, about a week after 
I had forgotten about this, an error manifested:

[ERR] OSD_SCRUB_ERRORS: 1 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 4.1d is active+clean+inconsistent, acting [4,2]

which could be repaired as expected, since I damaged only 1 OSD. It's 
interesting that it took a whole week to find it. It seems that running 
deep-scrub on an entire OSD only runs it for the PGs where that OSD is the 
primary, so maybe that's why it wasn't detected when I ran it manually?
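
For anyone reproducing this, the usual manual sequence is along these lines (a sketch using pg 4.1d; adjust the pgid):

rados list-inconsistent-obj 4.1d --format=json-pretty   # show which object/shard the scrub flagged
ceph pg repair 4.1d                                     # have the primary rewrite the bad copy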
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-07 Thread Frédéric Nass
Hello Petr,

- On Jun 4, 2024, at 12:13, Petr Bena petr@bena.rocks wrote:

> Hello,
> 
> I wanted to try out (lab ceph setup) what exactly is going to happen
> when part of the data on an OSD disk gets corrupted. I created a simple test
> where I was going through the block device data until I found something
> that resembled user data (using dd and hexdump) (/dev/sdd is the block
> device used by the OSD)
> 
> INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/sdd bs=32 count=1 skip=33920 | hexdump -C
> 00000000  6e 20 69 64 3d 30 20 65  78 65 3d 22 2f 75 73 72  |n id=0 exe="/usr|
> 00000010  2f 73 62 69 6e 2f 73 73  68 64 22 20 68 6f 73 74  |/sbin/sshd" host|
> 
> Then I deliberately overwrote 32 bytes using random data:
> 
> INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/urandom of=/dev/sdd bs=32 count=1 seek=33920
> 
> INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/sdd bs=32 count=1 skip=33920 | hexdump -C
> 00000000  25 75 af 3e 87 b0 3b 04  78 ba 79 e3 64 fc 76 d2  |%u.>..;.x.y.d.v.|
> 00000010  9e 94 00 c2 45 a5 e1 d2  a8 86 f1 25 fc 18 07 5a  |....E......%...Z|
> 
> At this point I would expect some sort of data corruption. I restarted
> the OSD daemon on this host to make sure it flushes any potentially
> buffered data. It restarted OK without noticing anything, which was
> expected.
> 
> Then I ran
> 
> ceph osd scrub 5
> 
> ceph osd deep-scrub 5
> 
> And waited for all scheduled scrub operations for all PGs to finish.
> 
> No inconsistency was found. No errors were reported, the scrubs just finished OK,
> and the data is still visibly corrupt via hexdump.
> 
> Did I just hit some block of data that WAS used by the OSD but was marked
> deleted and therefore no longer used, or am I missing something?

Possibly, if you deep-scrubbed all PGs. I remember marking bad sectors in the 
past and still getting an fsck success from ceph-bluestore-tool fsck.

To be sure, you could overwrite the very same sector, stop the OSD and then:

$ ceph-bluestore-tool fsck --deep yes --path /var/lib/ceph/osd/ceph-X/

or (in a containerized environment)

$ cephadm shell --name osd.X ceph-bluestore-tool fsck --deep yes --path /var/lib/ceph/osd/ceph-X/

osd.X being the OSD associated with drive /dev/sdd.
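
If you need to confirm which osd.X that is, something like the following on the host shows the device-to-OSD mapping (assuming LVM-based OSDs; 'ceph device ls-by-host <hostname>' gives a similar cluster-side view):

$ ceph-volume lvm list /dev/sdd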

Regards,
Frédéric.


> I would expect CEPH to detect disk corruption and automatically replace the
> invalid data with a valid copy?
> 
> I use only replica pools in this lab setup, for RBD and CephFS.
> 
> Thanks
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-05 Thread Eugen Block

Do you have osd_scrub_auto_repair set to true?
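
If not, and you want scrubs to fix such errors on their own, something along these lines enables it (note that auto-repair is skipped when a PG has more scrub errors than osd_scrub_auto_repair_num_errors):

ceph config get osd osd_scrub_auto_repair
ceph config set osd osd_scrub_auto_repair true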

Quoting Petr Bena:


Hello,

I wanted to try out (lab ceph setup) what exactly is going to happen  
when part of the data on an OSD disk gets corrupted. I created a simple  
test where I was going through the block device data until I found  
something that resembled user data (using dd and hexdump) (/dev/sdd  
is the block device used by the OSD)


INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/sdd bs=32 count=1 skip=33920 | hexdump -C
00000000  6e 20 69 64 3d 30 20 65  78 65 3d 22 2f 75 73 72  |n id=0 exe="/usr|
00000010  2f 73 62 69 6e 2f 73 73  68 64 22 20 68 6f 73 74  |/sbin/sshd" host|

Then I deliberately overwrote 32 bytes using random data:

INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/urandom of=/dev/sdd bs=32 count=1 seek=33920

INFRA [root@ceph-vm-lab5 ~]# dd if=/dev/sdd bs=32 count=1 skip=33920 | hexdump -C
00000000  25 75 af 3e 87 b0 3b 04  78 ba 79 e3 64 fc 76 d2  |%u.>..;.x.y.d.v.|
00000010  9e 94 00 c2 45 a5 e1 d2  a8 86 f1 25 fc 18 07 5a  |....E......%...Z|

At this point I would expect some sort of data corruption. I  
restarted the OSD daemon on this host to make sure it flushes any  
potentially buffered data. It restarted OK without noticing  
anything, which was expected.


Then I ran

ceph osd scrub 5

ceph osd deep-scrub 5

And waited for all scheduled scrub operations for all PGs to finish.

No inconsistency was found. No errors were reported, the scrubs just finished  
OK, and the data is still visibly corrupt via hexdump.


Did I just hit some block of data that WAS used by the OSD but was  
marked deleted and therefore no longer used, or am I missing  
something? I would expect CEPH to detect the disk corruption and  
automatically replace the invalid data with a valid copy?


I use only replica pools in this lab setup, for RBD and CephFS.

Thanks

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io