[ceph-users] Re: PG damaged "failed_repair"

2024-03-25 Thread romain.lebbadi-breteau
Hi,

Sorry for the broken formatting. Here are the outputs again.

ceph osd df:

ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA      OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 3  hdd    1.81879         0      0 B      0 B       0 B      0 B      0 B      0 B      0     0      0    down
12  hdd    1.81879   1.0      1.8 TiB  385 GiB   383 GiB  6.7 MiB  1.4 GiB  1.4 TiB  20.66  1.73    18      up
13  hdd    1.81879   1.0      1.8 TiB  422 GiB   421 GiB  5.8 MiB  1.3 GiB  1.4 TiB  22.67  1.90    17      up
15  hdd    1.81879   1.0      1.8 TiB  264 GiB   263 GiB  4.6 MiB  1.1 GiB  1.6 TiB  14.17  1.19    14      up
16  hdd    9.09520   1.0      9.1 TiB  1.0 TiB  1023 GiB  8.8 MiB  2.6 GiB  8.1 TiB  11.01  0.92    65      up
17  hdd    1.81879   1.0      1.8 TiB  319 GiB   318 GiB  6.1 MiB  1.0 GiB  1.5 TiB  17.13  1.43    15      up
 1  hdd    5.45749   1.0      5.5 TiB  546 GiB   544 GiB  7.8 MiB  1.4 GiB  4.9 TiB   9.76  0.82    29      up
 4  hdd    5.45749   1.0      5.5 TiB  801 GiB   799 GiB  8.3 MiB  2.4 GiB  4.7 TiB  14.34  1.20    44      up
 8  hdd    5.45749   1.0      5.5 TiB  708 GiB   706 GiB  9.7 MiB  2.1 GiB  4.8 TiB  12.67  1.06    36      up
11  hdd    5.45749         0      0 B      0 B       0 B      0 B      0 B      0 B      0     0      0    down
14  hdd    1.81879   1.0      1.8 TiB  200 GiB   198 GiB  3.8 MiB  1.3 GiB  1.6 TiB  10.71  0.90    10      up
 0  hdd    9.09520         0      0 B      0 B       0 B      0 B      0 B      0 B      0     0      0    down
 5  hdd    9.09520   1.0      9.1 TiB  859 GiB   857 GiB   17 MiB  2.1 GiB  8.3 TiB   9.23  0.77    46      up
 9  hdd    9.09520   1.0      9.1 TiB  924 GiB   922 GiB   11 MiB  2.3 GiB  8.2 TiB   9.92  0.83    55      up
                       TOTAL   53 TiB  6.3 TiB   6.3 TiB   90 MiB   19 GiB   46 TiB  11.95
MIN/MAX VAR: 0.77/1.90  STDDEV: 4.74
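(For reference, the VAR column is each OSD's %USE divided by the cluster-wide average of 11.95, e.g. osd.13: 22.67 / 11.95 ≈ 1.90, which is the MAX VAR on the last line, and osd.5: 9.23 / 11.95 ≈ 0.77, the MIN VAR; the down OSDs are excluded.)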

ceph osd pool ls detail:

pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins 
pg_num 1 pgp_num 1 autoscale_mode on last_change 32 flags hashpspool 
stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins 
pg_num 32 pgp_num 32 autoscale_mode on last_change 9327 lfor 0/0/104 flags 
hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 3 'images' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins 
pg_num 32 pgp_num 32 autoscale_mode on last_change 9018 lfor 0/0/104 flags 
hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 4 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins 
pg_num 32 pgp_num 32 autoscale_mode on last_change 9149 lfor 0/0/106 flags 
hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 5 'polyphoto_backup' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 372 lfor 0/0/362 
flags hashpspool,selfmanaged_snaps stripe_width 0 compression_algorithm snappy 
compression_mode aggressive application rbd

The problem seems to come from a software error inside Ceph itself: I see this assertion failure in the logs: "FAILED ceph_assert(clone_overlap.count(clone))".

Thanks,
Romain Lebbadi-Breteau


[ceph-users] Re: PG damaged "failed_repair"

2024-03-10 Thread Romain Lebbadi-Breteau

Hi,

Sorry for the bad formatting. Here are the outputs again.

ceph osd df:

ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA      OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 3  hdd    1.81879         0      0 B      0 B       0 B      0 B      0 B      0 B      0     0      0    down
12  hdd    1.81879   1.0      1.8 TiB  385 GiB   383 GiB  6.7 MiB  1.4 GiB  1.4 TiB  20.66  1.73    18      up
13  hdd    1.81879   1.0      1.8 TiB  422 GiB   421 GiB  5.8 MiB  1.3 GiB  1.4 TiB  22.67  1.90    17      up
15  hdd    1.81879   1.0      1.8 TiB  264 GiB   263 GiB  4.6 MiB  1.1 GiB  1.6 TiB  14.17  1.19    14      up
16  hdd    9.09520   1.0      9.1 TiB  1.0 TiB  1023 GiB  8.8 MiB  2.6 GiB  8.1 TiB  11.01  0.92    65      up
17  hdd    1.81879   1.0      1.8 TiB  319 GiB   318 GiB  6.1 MiB  1.0 GiB  1.5 TiB  17.13  1.43    15      up
 1  hdd    5.45749   1.0      5.5 TiB  546 GiB   544 GiB  7.8 MiB  1.4 GiB  4.9 TiB   9.76  0.82    29      up
 4  hdd    5.45749   1.0      5.5 TiB  801 GiB   799 GiB  8.3 MiB  2.4 GiB  4.7 TiB  14.34  1.20    44      up
 8  hdd    5.45749   1.0      5.5 TiB  708 GiB   706 GiB  9.7 MiB  2.1 GiB  4.8 TiB  12.67  1.06    36      up
11  hdd    5.45749         0      0 B      0 B       0 B      0 B      0 B      0 B      0     0      0    down
14  hdd    1.81879   1.0      1.8 TiB  200 GiB   198 GiB  3.8 MiB  1.3 GiB  1.6 TiB  10.71  0.90    10      up
 0  hdd    9.09520         0      0 B      0 B       0 B      0 B      0 B      0 B      0     0      0    down
 5  hdd    9.09520   1.0      9.1 TiB  859 GiB   857 GiB   17 MiB  2.1 GiB  8.3 TiB   9.23  0.77    46      up
 9  hdd    9.09520   1.0      9.1 TiB  924 GiB   922 GiB   11 MiB  2.3 GiB  8.2 TiB   9.92  0.83    55      up
                       TOTAL   53 TiB  6.3 TiB   6.3 TiB   90 MiB   19 GiB   46 TiB  11.95

MIN/MAX VAR: 0.77/1.90  STDDEV: 4.74

ceph osd pool ls detail:

pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 32 flags 
hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9327 lfor 
0/0/104 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 3 'images' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9018 lfor 
0/0/104 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 4 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9149 lfor 
0/0/106 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 5 'polyphoto_backup' replicated size 3 min_size 2 crush_rule 0 
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 
372 lfor 0/0/362 flags hashpspool,selfmanaged_snaps stripe_width 0 
compression_algorithm snappy compression_mode aggressive application rbd


The problem seems to come from a software error inside Ceph itself: in the logs, I get the message "FAILED ceph_assert(clone_overlap.count(clone))".


Thanks,

Romain Lebbadi-Breteau


[ceph-users] Re: PG damaged "failed_repair"

2024-03-08 Thread Romain Lebbadi-Breteau

Hi,

Yes, we're trying to remove osd.3. Here is the result of `ceph osd df`:

ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 3  hdd    1.81879   1.0      1.8 TiB  443 GiB  441 GiB  6.8 MiB  1.5 GiB  1.4 TiB  23.78  2.37    16      up
 6  hdd    1.81879   1.0      1.8 TiB  114 GiB  114 GiB  981 KiB  343 MiB  1.7 TiB   6.14  0.61     8      up
12  hdd    1.81879   1.0      1.8 TiB  359 GiB  358 GiB  5.8 MiB  1.0 GiB  1.5 TiB  19.27  1.92    15      up
13  hdd    1.81879   1.0      1.8 TiB  331 GiB  330 GiB  3.9 MiB  1.5 GiB  1.5 TiB  17.77  1.77    15      up
15  hdd    1.81879   1.0      1.8 TiB  217 GiB  216 GiB  2.0 MiB  1.1 GiB  1.6 TiB  11.64  1.16    13      up
16  hdd    9.09520   1.0      9.1 TiB  785 GiB  783 GiB  8.8 MiB  1.9 GiB  8.3 TiB   8.43  0.84    51      up
17  hdd    1.81879   1.0      1.8 TiB  204 GiB  203 GiB  2.9 MiB  1.2 GiB  1.6 TiB  10.95  1.09    11      up
 1  hdd    5.45749   1.0      5.5 TiB  428 GiB  427 GiB  4.9 MiB  876 MiB  5.0 TiB   7.66  0.76    24      up
 4  hdd    5.45749   1.0      5.5 TiB  638 GiB  636 GiB  6.8 MiB  2.2 GiB  4.8 TiB  11.42  1.14    36      up
 8  hdd    5.45749   1.0      5.5 TiB  594 GiB  591 GiB  8.7 MiB  2.2 GiB  4.9 TiB  10.62  1.06    30      up
11  hdd    5.45749   1.0      5.5 TiB  567 GiB  565 GiB  7.8 MiB  2.1 GiB  4.9 TiB  10.15  1.01    29      up
14  hdd    1.81879   1.0      1.8 TiB  197 GiB  195 GiB  2.9 MiB  1.2 GiB  1.6 TiB  10.55  1.05    10      up
 0  hdd    9.09520   1.0      9.1 TiB  764 GiB  763 GiB  9.6 MiB  1.8 GiB  8.3 TiB   8.21  0.82    47      up
 5  hdd    9.09520   1.0      9.1 TiB  791 GiB  789 GiB   11 MiB  2.6 GiB  8.3 TiB   8.50  0.85    38      up
 9  hdd    9.09520   1.0      9.1 TiB  858 GiB  856 GiB   11 MiB  2.4 GiB  8.3 TiB   9.21  0.92    44      up
                       TOTAL   71 TiB  7.1 TiB  7.1 TiB   93 MiB   24 GiB   64 TiB  10.03
MIN/MAX VAR: 0.61/2.37  STDDEV: 4.97

And here is `ceph osd pool ls detail` (and yes, our replicated size is 3):

pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 32 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9327 lfor 0/0/104 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 3 'images' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9018 lfor 0/0/104 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 4 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9149 lfor 0/0/106 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 5 'polyphoto_backup' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 372 lfor 0/0/362 flags hashpspool,selfmanaged_snaps stripe_width 0 compression_algorithm snappy compression_mode aggressive application rbd

And we're using Quincy:

romain:step@alpha-cen ~ $ sudo ceph --version
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)

All our physical disks are configured as individual RAID 0 virtual disks on the built-in RAID controller (PERC H730 Mini).
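If it's of any use, SMART data for the physical disks should still be readable through the PERC (it's a MegaRAID-based controller) with something along these lines; the device path and slot number here are guesses based on the iDRAC message and would need adjusting:

sudo smartctl -a -d megaraid,4 /dev/sda   # "4" guessed from "Disk 4 in Backplane 1"; adjust device and slot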


As for the logs, journald has rotated them and we don't have them anymore. I recreated the situation where the three OSDs crash (by shutting down osd.3 and marking it out), and here are the logs:


ceph -w: https://pastebin.com/A7gJ3ss2
osd.0, osd.3 and osd.11: https://gitlab.com/RomainL456/ceph-incident-logs/


I put the full logs (output of `sudo journalctl --since "23:00" -u ceph-9b4b12fe-4dc6-11ed-9ed9-d18a342d7c2b@osd.*`) in a public Git repo, and I also added a separate file with the logs from right before osd.0 crashed. Here is the timeline of events (local time):


23:27: I manually shut down osd.3
23:46: osd.0 crashes
23:46: osd.11 crashes
23:48: I start osd.3; it crashes in less than a minute
23:49: After I mark osd.3 "in" and start it again, it comes back online, with osd.0 and osd.11 following soon after
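For reference, one log file per affected OSD can be produced with a small loop around the same journalctl command, roughly:

for id in 0 3 11; do
  sudo journalctl --since "23:00" -u "ceph-9b4b12fe-4dc6-11ed-9ed9-d18a342d7c2b@osd.${id}" > "osd.${id}.log"
done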


Best regards,

Romain Lebbadi-Breteau

On 2024-03-08 3:17 a.m., Eugen Block wrote:

Hi,

Can you share more details? Which OSD are you trying to get out, the primary osd.3?

Can you also share 'ceph osd df'?
It looks like a replicated pool with size 3; can you confirm with 'ceph osd pool ls detail'?

Do you have logs from the crashing OSDs when you take out osd.3?
Which ceph version is this?
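If it's easier, something like the following should gather the df, pool and version information in one go:

ceph osd df tree
ceph osd pool ls detail
ceph versions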

Thanks,
Eugen

Zitat von Romain Lebbadi-Breteau :


Hi,

We're a student club from Montréal where we host an Openstack cloud 
with a Ceph backend for storage of virtual machines and volumes using 
rbd.


Two weeks ago we received an email from our ceph cluster saying that 
some pages were damaged. We ran "sudo ceph pg repair " but 
then there was an I/O error on the disk during the recovery ("An 
unrecoverable disk media error occurred on Disk 4 in Backplane 1 of 
Integrated RAID Controller 1." and "Bad block medium error is 
detected at block 0x1377e2ad on Virtual Disk 3 on Integrated RAID 
Controller 1." messages on iDRAC).


After that, the PG we tried to repair was in the state 
"active+recovery_unfound+degraded". After a week, we ran the command 
"sudo ceph pg 2.1b mark_unfound_lost revert" to try to recover the 
damaged PG. We tried to boot the virtual machine that had crashed 
because of this incident, but the volume seemed to have been 
completely erased, the "mount" command said there was no filesystem 
on it, so we recreated the VM from a backup.


A few days later, the same PG was once again d

[ceph-users] PG damaged "failed_repair"

2024-03-06 Thread Romain Lebbadi-Breteau

Hi,

We're a student club from Montréal where we host an OpenStack cloud with a Ceph backend for the storage of virtual machines and volumes using RBD.


Two weeks ago we received an email from our Ceph cluster saying that some PGs were damaged. We ran "sudo ceph pg repair " but then there was an I/O error on the disk during the recovery ("An unrecoverable disk media error occurred on Disk 4 in Backplane 1 of Integrated RAID Controller 1." and "Bad block medium error is detected at block 0x1377e2ad on Virtual Disk 3 on Integrated RAID Controller 1." messages on iDRAC).


After that, the PG we tried to repair was in the state "active+recovery_unfound+degraded". After a week, we ran the command "sudo ceph pg 2.1b mark_unfound_lost revert" to try to recover the damaged PG. We tried to boot the virtual machine that had crashed because of this incident, but the volume seemed to have been completely erased; the "mount" command said there was no filesystem on it, so we recreated the VM from a backup.


A few days later, the same PG was once again damaged, and since we knew the physical disk behind the OSD hosting one part of the PG had problems, we tried to "out" that OSD from the cluster. That caused the two other OSDs hosting copies of the problematic PG to go down, which led to timeouts on our virtual machines, so we put the OSD back in.


We then tried to repair the PG again, but that failed and the PG is now "active+clean+inconsistent+failed_repair". Whenever that OSD goes down, two other OSDs from two other hosts go down too after a few minutes, so it's impossible to replace the disk right now, even though we have new ones available.


We have backups for most of our services, but it would be very disruptive to delete the whole cluster, and we don't know what to do with the broken PG and the OSD that can't be shut down.


Any help would be really appreciated. We're not experts with Ceph and OpenStack, and it's likely we handled things wrong at some point, but we really want to get back to a healthy Ceph.


Here is some information about our cluster:

romain:step@alpha-cen ~  $ sudo ceph health detail
HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
[ERR] OSD_SCRUB_ERRORS: 1 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
    pg 2.1b is active+clean+inconsistent+failed_repair, acting [3,11,0]

romain:step@alpha-cen ~  $ sudo ceph osd tree
ID  CLASS  WEIGHT    TYPE NAME   STATUS  REWEIGHT  PRI-AFF
-1 70.94226  root default
-7 20.00792  host alpha-cen
 3    hdd   1.81879  osd.3   up   1.0  1.0
 6    hdd   1.81879  osd.6   up   1.0  1.0
12    hdd   1.81879  osd.12  up   1.0  1.0
13    hdd   1.81879  osd.13  up   1.0  1.0
15    hdd   1.81879  osd.15  up   1.0  1.0
16    hdd   9.09520  osd.16  up   1.0  1.0
17    hdd   1.81879  osd.17  up   1.0  1.0
-5 23.64874  host beta-cen
 1    hdd   5.45749  osd.1   up   1.0  1.0
 4    hdd   5.45749  osd.4   up   1.0  1.0
 8    hdd   5.45749  osd.8   up   1.0  1.0
11    hdd   5.45749  osd.11  up   1.0  1.0
14    hdd   1.81879  osd.14  up   1.0  1.0
-3 27.28560  host gamma-cen
 0    hdd   9.09520  osd.0   up   1.0  1.0
 5    hdd   9.09520  osd.5   up   1.0  1.0
 9    hdd   9.09520  osd.9   up   1.0  1.0

romain:step@alpha-cen ~  $ sudo rados list-inconsistent-obj 2.1b
{"epoch":9787,"inconsistents":[]}

romain:step@alpha-cen ~  $ sudo ceph pg 2.1b query

https://pastebin.com/gsKCPCjr
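If I understand correctly, "rados list-inconsistent-obj" only reports what the most recent scrub of that PG recorded, so the empty list above may simply mean the PG needs another deep scrub before the details show up again, e.g.:

sudo ceph pg deep-scrub 2.1b
sudo rados list-inconsistent-obj 2.1b --format=json-pretty   # re-run once the scrub has finished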

Best regards,

Romain Lebbadi-Breteau