[ceph-users] Re: How to recover from active+clean+inconsistent+failed_repair?

Frank Schilder Mon, 02 Nov 2020 11:31:50 -0800

> But there can be an on-chip disk controller on the motherboard, I'm not sure.


There is always some kind of controller. Could be on-board. Usually, the cache 
settings are accessible when booting into the BIOS set-up.

> If your worry is fsync persistence

No, what I worry about is the volatile write cache, which is usually enabled by 
default. This cache exists on the disk as well as on the controller. To avoid 
losing writes on power failure, the controller needs to be in write-through 
mode and the disk write cache must be disabled. The latter can be done with 
smartctl, the former in the BIOS setup.
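
For the smartctl part, a minimal sketch follows. It assumes a smartmontools 
version that supports the wcache setting; the function name and the /dev/sdX 
device path are placeholders, not something taken from this cluster. On many 
drives the setting may not survive a power cycle, so it might have to be 
reapplied at boot.

```shell
#!/bin/sh
# Sketch: show and disable a drive's volatile write cache with smartctl.
# disable_write_cache is a hypothetical helper name; the device path is
# supplied by the caller (/dev/sdX in the usage line is a placeholder).

disable_write_cache() {
    # Show the current write cache setting
    smartctl -g wcache "$1"
    # Turn the volatile write cache off
    smartctl -s wcache,off "$1"
    # Read the setting back to confirm the change
    smartctl -g wcache "$1"
}

# Run only when at least one device is given, e.g.:
#   sh disable-wcache.sh /dev/sdX
if [ "$#" -gt 0 ]; then
    for device in "$@"; do
        disable_write_cache "$device"
    done
fi
```

The controller side (write-through mode) is not covered here, since that is 
set in the BIOS setup.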

Did you test power failure? If so, how often? On how many hosts simultaneously? 
Pulling network cables will not trigger cache-related problems. The problem 
with write cache is that you rely on a lot of bells and whistles, some of which 
usually fail. With Ceph, this will lead to exactly the problem you are 
observing now.

Your pool configuration looks OK. You need to find out where exactly the scrub 
errors are located. It looks like metadata damage and you might lose some 
data. Be careful to do only read-only admin operations for now.
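
As a sketch of read-only inspection, the commands below only query state and 
do not modify anything. The function name inspect_pg is a placeholder; 3.b is 
the PG from this thread. Note that list-inconsistent-obj reports what the most 
recent scrub recorded, so it may return nothing useful if that information is 
stale.

```shell
#!/bin/sh
# Sketch: read-only look at an inconsistent PG.
# inspect_pg is a hypothetical helper name; pass the PG id as argument.

inspect_pg() {
    # Cluster health summary, including which PGs are flagged inconsistent
    ceph health detail
    # Per-object, per-shard inconsistencies recorded by the last scrub
    rados list-inconsistent-obj "$1" --format=json-pretty
}

# Run only when a PG id is given, e.g.:
#   sh inspect-pg.sh 3.b
if [ "$#" -gt 0 ]; then
    inspect_pg "$1"
fi
```

Neither command triggers a repair; "ceph pg repair" is deliberately left out.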

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Sagara Wijetunga <sagara...@yahoo.com>
Sent: 02 November 2020 16:08:58
To: ceph-users@ceph.io; Frank Schilder
Subject: Re: [ceph-users] Re: How to recover from 
active+clean+inconsistent+failed_repair?

> Hmm, I'm getting a bit confused. Could you also send the output of "ceph osd 
> pool ls detail".

File ceph-osd-pool-ls-detail.txt attached.


> Did you look at the disk/controller cache settings?

I don't have disk controllers on the Ceph machines. The hard disk is directly 
attached to the motherboard via a SATA cable. But there can be an on-chip disk 
controller on the motherboard, I'm not sure.

If your worry is fsync persistence, I have thoroughly tested database fsync 
reliability on Ceph RBD with hundreds of transactions per second, removing the 
network cable, restarting the database machine, etc. while inserts were going 
on, and I did not lose a single transaction. I simulated this many times and 
persistence on my Ceph cluster was perfect (i.e., not a single loss).


> I think you should start a deep-scrub with "ceph pg deep-scrub 3.b" and 
> record the output of "ceph -w | grep '3\.b'" (note the single quotes).

> The error messages you included in one of your first e-mails are only on 1 
> out of 3 scrub errors (3 lines for 1 error). We need to find all 3 errors.

I ran "ceph pg deep-scrub 3.b" again; here is the whole output of ceph -w:


2020-11-02 22:33:48.224392 osd.0 [ERR] 3.b shard 2 soid 
3:d577e975:::1000023675e.00000000:head : candidate had a missing snapset key, 
candidate had a missing info key


2020-11-02 22:33:48.224396 osd.0 [ERR] 3.b soid 
3:d577e975:::1000023675e.00000000:head : failed to pick suitable object info


2020-11-02 22:35:30.087042 osd.0 [ERR] 3.b deep-scrub 3 errors


Btw, I'm very grateful for your perseverance on this.


Best regards

Sagara

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io