Hi,
It turns out there is a problem with the zpool. We think it was corrupted
by a STONITH event that occurred when a disk in another pool began
reporting a predicted failure.
A zpool scrub has been done, and there are 5 files with permanent errors
(zpool status -v):
errors: Permanent errors have been detected in the following files:

        cos8-ost6/ost6:<0xe>
        cos8-ost6/ost6:<0x1a>
        cos8-ost6/ost6:<0x1c>
        cos8-ost6/ost6:/
        cos8-ost6/ost6:<0x193>

The fact that / is corrupted worries me!
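
In case it helps anyone triaging a similar list: the <0x...> entries are
object numbers that zpool status could not resolve to a path, while the
"/" entry is an actual path (the dataset root). Below is a rough Python
sketch that splits the list quoted above into the two kinds and converts
the hex object numbers to decimal, for example to look them up with a
tool such as zdb. The function name parse_permanent_errors and the tuple
layout are made up for illustration, and the sketch assumes the dataset
name itself contains no ":".

```python
import re

# Sample text copied from the `zpool status -v` output quoted above.
STATUS_TEXT = """\
errors: Permanent errors have been detected in the following files:
        cos8-ost6/ost6:<0xe>
        cos8-ost6/ost6:<0x1a>
        cos8-ost6/ost6:<0x1c>
        cos8-ost6/ost6:/
        cos8-ost6/ost6:<0x193>
"""

# Matches "dataset:<0xHEX>" entries, i.e. objects without a resolvable path.
OBJECT_RE = re.compile(r"^(?P<dataset>[^:]+):<0x(?P<obj>[0-9a-fA-F]+)>$")


def parse_permanent_errors(text):
    """Return (dataset, kind, value) tuples from the errors section.

    kind is "object" (value is the decimal object number) or
    "path" (value is the path string after the first colon).
    Hypothetical helper: assumes no ":" inside the dataset name.
    """
    entries = []
    in_errors = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("errors: Permanent errors"):
            in_errors = True
            continue
        if not in_errors or not line:
            continue
        match = OBJECT_RE.match(line)
        if match:
            entries.append((match["dataset"], "object", int(match["obj"], 16)))
        else:
            dataset, _, path = line.partition(":")
            entries.append((dataset, "path", path))
    return entries


if __name__ == "__main__":
    for dataset, kind, value in parse_permanent_errors(STATUS_TEXT):
        print(f"{dataset:15} {kind:6} {value}")
```

For the list above this yields four "object" entries and one "path"
entry for "/", which is the one that matters for mounting.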
If we set the canmount=on property and mount the zpool, then an ls of the
mount point gives an Input/output error.
Does anyone have experience with how to repair this?
There is no hardware problem: all 12 disks in this raidz2 pool are fine.
We think the STONITH must have caused it, though I thought ZFS was
supposed to be immune to that!
Thanks...
On Tue, 30 Nov 2021, Tommi Tervo wrote:
> > Upon attempting to mount a zfs OST, we are getting:
> > Message from syslogd@c8oss01 at Nov 29 18:11:47 ...
> > kernel:LustreError: 58223:0:(lu_object.c:1267:lu_device_fini())
> > ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
> > Message from syslogd@c8oss01 at Nov 29 18:11:47 ...
> > kernel:LustreError: 58223:0:(lu_object.c:1267:lu_device_fini()) LBUG
>
> Hi,
> Looks like LU-12675, time to upgrade to 2.12.7?
> https://jira.whamcloud.com/browse/LU-12675
> HTH,
> -Tommi
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org