This might seem like a stupid suggestion, but: have you tried restarting the OSDs?

I've also encountered some random CRC errors that only showed up when trying to read an object, but not on scrubbing, and that magically disappeared after restarting the OSD.

However, in my case it was clearly related to https://tracker.ceph.com/issues/22464, which doesn't seem to be the issue here.
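On a systemd-based deployment that would be something like the following, one OSD of the acting set at a time (osd ids taken from your "ceph osd map" output below):

systemctl restart ceph-osd@23
systemctl restart ceph-osd@35
systemctl restart ceph-osd@18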
Paul
2018-07-12 13:53 GMT+02:00 Alessandro De Salvo <alessandro.desa...@roma1.infn.it>:
On 12/07/18 11:20, Alessandro De Salvo wrote:
On 12/07/18 10:58, Dan van der Ster wrote:
On Wed, Jul 11, 2018 at 10:25 PM Gregory Farnum <gfar...@redhat.com> wrote:
On Wed, Jul 11, 2018 at 9:23 AM Alessandro De Salvo <alessandro.desa...@roma1.infn.it> wrote:
OK, I found where the object is:

ceph osd map cephfs_metadata 200.00000000
osdmap e632418 pool 'cephfs_metadata' (10) object '200.00000000' -> pg 10.844f3494 (10.14) -> up ([23,35,18], p23) acting ([23,35,18], p23)
So, looking at the logs of osds 23, 35 and 18, in fact I see:

osd.23:
2018-07-11 15:49:14.913771 7efbee672700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.00000000:head

osd.35:
2018-07-11 18:01:19.989345 7f760291a700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.00000000:head

osd.18:
2018-07-11 18:18:06.214933 7fabaf5c1700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.00000000:head
So, basically the same error everywhere.

I'm trying to issue a repair of pg 10.14, but I'm not sure if it will help.

No SMART errors (the fileservers are SANs, in RAID6 + LVM volumes), and no disk problems anywhere. No relevant errors in the syslogs either; the hosts are just fine. I cannot exclude an error on the RAID controllers, but 2 of the OSDs holding 10.14 are on one SAN system and the third is on a different one, so I would tend to exclude that they both had (silent) errors at the same time.
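(The repair I'm issuing is just the standard one, i.e. "ceph pg repair 10.14".)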
That's fairly distressing. At this point I'd probably try extracting the object using ceph-objectstore-tool and seeing if it decodes properly as an mds journal. If it does, you might risk just putting it back in place to overwrite the crc.
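Roughly something like the following, with the OSD stopped and the paths adjusted to your deployment (double-check the exact syntax against the ceph-objectstore-tool man page before running it):

systemctl stop ceph-osd@23
# locate the object and print its JSON spec
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-23 --pgid 10.14 --op list 200.00000000
# extract its bytes using the JSON spec printed above
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-23 --pgid 10.14 '<json-from-list>' get-bytes /tmp/200.00000000.bin
systemctl start ceph-osd@23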
Wouldn't it be easier to scrub repair the PG to fix the crc?
This is what I already instructed the cluster to do (a deep scrub), but I'm not sure it can repair anything in case all replicas are bad, as seems to be the case here.
I finally managed (with the help of Dan) to perform the deep-scrub on pg 10.14, but the deep scrub did not detect anything wrong. Trying to repair 10.14 also has no effect. Still, when trying to access the object I get this in the OSDs:

2018-07-12 13:40:32.711732 7efbee672700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.00000000:head

Was the deep-scrub supposed to detect the wrong crc? If yes, then it sounds like a bug.

Can I force the repair somehow?
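Would something like "rados list-inconsistent-obj 10.14 --format=json-pretty" be expected to show anything here, or does that only report what the last scrub recorded?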
Thanks,
Alessandro
Alessandro, did you already try a deep-scrub on pg 10.14?
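(i.e. "ceph pg deep-scrub 10.14", then wait for the scrub to complete and check the cluster log / "ceph health detail" for inconsistencies)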
I'm waiting for the cluster to do that; I sent the request earlier this morning.
I expect it'll show an inconsistent object. Though, I'm unsure if repair will correct the crc given that in this case *all* replicas have a bad crc.
Exactly, this is what I wonder too.
Cheers,
Alessandro
--Dan
However, I'm also quite curious how it ended up that way, with a checksum mismatch but identical data (and identical checksums!) across the three replicas. Have you previously done some kind of scrub repair on the metadata pool? Did the PG perhaps get backfilled due to cluster changes?
-Greg
Thanks,
Alessandro
On 11/07/18 18:56, John Spray wrote:
On Wed, Jul 11, 2018 at 4:49 PM Alessandro De Salvo <alessandro.desa...@roma1.infn.it> wrote:
Hi John,

in fact I get an I/O error by hand too:

rados get -p cephfs_metadata 200.00000000 200.00000000
error getting cephfs_metadata/200.00000000: (5) Input/output error
Next step would be to go look for corresponding errors in your OSD logs and system logs, and possibly also check things like the SMART counters on your hard drives for possible root causes.
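(For the SMART part, if the OSDs sit on plain disks, something like "smartctl -a /dev/sdX" per drive would show reallocated/pending sector counts and read error rates.)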
John
Can this be recovered somehow?
Thanks,
Alessandro
On 11/07/18 18:33, John Spray wrote:
On Wed, Jul 11, 2018 at 4:10 PM Alessandro De Salvo <alessandro.desa...@roma1.infn.it> wrote:
Hi,

after the upgrade to luminous 12.2.6 today, all our MDSes have been marked as damaged. Trying to restart the instances only results in standby MDSes. We currently have 2 active filesystems with 2 MDSes each.

I found the following error messages in the mon:

mds.0 <node1_IP>:6800/2412911269 down:damaged
mds.1 <node2_IP>:6800/830539001 down:damaged
mds.0 <node3_IP>:6800/4080298733 down:damaged

Whenever I try to force the repaired state with ceph mds repaired <fs_name>:<rank>, I get something like this in the MDS logs:

2018-07-11 13:20:41.597970 7ff7e010e700 0 mds.1.journaler.mdlog(ro) error getting journal off disk
2018-07-11 13:20:41.598173 7ff7df90d700 -1 log_channel(cluster) log [ERR] : Error recovering journal 0x201: (5) Input/output error
An EIO reading the journal header is pretty scary. The MDS itself probably can't tell you much more about this: you need to dig down into the RADOS layer. Try reading the 200.00000000 object (that happens to be the rank 0 journal header; every CephFS filesystem should have one) using the `rados` command line tool.
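Something along the lines of (adjusting the pool name to your metadata pool):

rados -p cephfs_metadata get 200.00000000 /tmp/200.00000000.bin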
John
Any attempt at running the journal export results in errors, like this one:

cephfs-journal-tool --rank=cephfs:0 journal export backup.bin
Error ((5) Input/output error)2018-07-11 17:01:30.631571 7f94354fff00 -1 Header 200.00000000 is unreadable
2018-07-11 17:01:30.631584 7f94354fff00 -1 journal_export: Journal not readable, attempt object-by-object dump with `rados`

The same happens for recover_dentries:

cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
Events by type:2018-07-11 17:04:19.770779 7f05429fef00 -1 Header 200.00000000 is unreadable
Errors: 0
Is there something I could try to do to get the cluster back?

I was able to dump the contents of the metadata pool with rados export -p cephfs_metadata <filename>, and I'm currently trying the procedure described in http://docs.ceph.com/docs/master/cephfs/disaster-recovery-experts/#using-an-alternate-metadata-pool-for-recovery but I'm not sure if it will work, as it's apparently doing nothing at the moment (maybe it's just very slow).

Any help is appreciated, thanks!

Alessandro
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90