Hi Matthew and all,

Thank you for taking action immediately. I really appreciate your
effort.

After investigating the issue further, I have to add that the discard
mount option seems to trigger the issue, too.
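
To see which filesystems on a machine are mounted with that option, something
like the following could be used. It is only a minimal sketch and not part of
the original report: the function name list_discard_mounts is made up for
illustration, and it assumes the usual /proc/mounts layout (mount point in
field 2, comma-separated options in field 4).

```shell
# List mount points whose mount options include "discard".
# $1: file in /proc/mounts format (assumed default: /proc/mounts).
list_discard_mounts() {
    mounts_file="${1:-/proc/mounts}"
    awk '{
        # Field 4 holds the comma-separated mount options.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            # Match "discard" or "discard=<mode>", but not "nodiscard".
            if (opts[i] == "discard" || opts[i] ~ /^discard=/) {
                print $2
                break
            }
        }
    }' "$mounts_file"
}
```

Calling list_discard_mounts without arguments reads /proc/mounts of the running
system. Note that this only covers the mount option, not other ways of issuing
discards.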

@Trent
The general problem here is that RAID10 can balance a single read stream across
all disks. This is probably its major advantage over RAID1, as it effectively
gives you RAID0 read speed, whereas RAID1 needs parallel reads to achieve this.

That said, it is no big surprise that several machines at our site went into
read-only mode after *some time* (probably after reading some
filesystem-relevant data from the "bad disk"). Unfortunately, the "clean first
disk" case only holds if you act immediately; otherwise you might have some
data corruption.
I verified this on one system where the root partition was affected, using the
debsums tool (just run debsums -xa) after fixing the FS errors.

My procedure to recover was:
Assembly of the RAID:
mdadm --assemble /dev/md127 /dev/nvme0n1p2
mdadm --run /dev/md127

Filesystem check on all partitions (note the -f parameter; some filesystems
"think" they are clean and would otherwise be skipped):
fsck.ext4 -f /dev/VolGroup/...

Re-add the second component:
mdadm --zero-superblock /dev/nvme1n1p2
mdadm --add /dev/md127 /dev/nvme1n1p2
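
After the re-add, the array should rebuild onto the second component, and
/proc/mdstat shows whether it is still degraded. The following is a minimal
sketch for checking that, not something from the original procedure: the helper
name md_state is hypothetical, and it assumes the common mdstat layout where
the line after "mdX : ..." carries a status field such as [U_] or [UU].

```shell
# Print "degraded" or "healthy" for one md array, based on mdstat-style text.
# $1: array name as shown in mdstat, e.g. md127
# $2: file in /proc/mdstat format (assumed default: /proc/mdstat)
md_state() {
    awk -v dev="$1" '
        # Header line of the requested array, e.g. "md127 : active raid10 ..."
        $1 == dev && $2 == ":" { found = 1; next }
        found {
            # A blank line ends the block for this array.
            if (NF == 0) exit
            # The status field looks like [UU] (all up) or [U_] (one missing).
            if (match($0, /\[[U_]+\]/)) {
                s = substr($0, RSTART, RLENGTH)
                print (index(s, "_") ? "degraded" : "healthy")
                exit
            }
        }
    ' "${2:-/proc/mdstat}"
}
```

For the array above this would be called as md_state md127. While the rebuild
is running, the array is expected to still report as degraded.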

Best regards

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1907262

Title:
  raid10: discard leads to corrupted file system

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
