Earlier this year I ran into this bug and had to investigate
its cause and remedy.

Let me start with background, rationale and a rant about the
epic failure of incremental assembly:

:::

RAID-1 is for reliability, availability and fault-tolerance.

Consider a configuration where the `/home` partition is on 3 HDDs,
arranged into an mdRAID-1 with a hot spare. Once an HDD dies or
disappears, the (hot) spare is immediately used to replace the
bad/missing disk and restore 200% data redundancy. That is what's
expected from such an array, and it is how things used to work on
Debian a few releases ago.

With the introduction of "incremental assembly" in mdadm, the
*principle of least surprise* is violated: on boot, mdadm adds disks
to the array using a `udev` handler, and when a (failed or
disconnected) disk never appears, the system falls into the emergency
recovery shell instead of recovering the array automatically.
(That's what a hot spare is for!)

This is an epic disaster in mdadm behaviour for a few reasons:

 1) Emergency recovery mode starts early, before the network and SSH
    service. Hence, instead of increased fault tolerance, we have a
    machine that needs console access because it cannot boot
    normally, and even to an experienced admin it is not terribly
    obvious how to repair a partially assembled RAID1 array with a
    missing disk. This is betrayal (of expectations) number one,
    compromising the availability that RAID-1 is meant to protect.

 2) The hot spare is there so that (presumably sensitive) data can be
    immediately replicated to a second device, in order to protect
    valuable data from eventual failure of the only remaining HDD.
    Not using the hot spare automatically is betrayal number two:
    replication that waits for admin interaction is delayed (in
    practice, not done at all), which increases the odds of
    catastrophic data loss.

Frankly, I'm shocked how badly mdadm handles recoverable failures
nowadays, and what a tremendous regression that is from the former
correct behaviour.

The remedy is not even documented, and it is not clear how to revert
to the former behaviour where RAID1 is automatically recovered using
the hot spare HDD without compromising boot...

:::

The essence of this problem, as I understand it, is that incremental
assembly happens (all available disks are discovered) but degraded
arrays are not *run*.
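
To see whether a machine is in that state, one can look for arrays that
the kernel reports as `inactive` in `/proc/mdstat`. The helper below is
only a rough sketch: the function name `list_inactive_arrays` is made up
for illustration, and it assumes that an assembled-but-not-started array
shows up as a line of the form `mdN : inactive ...`.

```sh
#!/bin/sh
# Hypothetical helper: print the names of md arrays that are assembled
# but not started ("inactive").
# Optional argument: a file in /proc/mdstat format (default: /proc/mdstat).
list_inactive_arrays() {
    # Array header lines look like "md0 : inactive sda1[0](S) sdb1[1](S)".
    awk '$2 == ":" && $3 == "inactive" { print $1 }' "${1:-/proc/mdstat}"
}
```

Running `list_inactive_arrays` from the emergency shell (or after boot)
should print nothing when every array has been started.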

As a workaround, I invoke `mdadm -q --run /dev/md?*` from
`/etc/initramfs-tools/scripts/local-premount/mdadm` as follows:


```
#!/bin/sh
PREREQ="mdadm"
prereqs() { echo "$PREREQ"; }
case $1 in prereqs) prereqs; exit 0;; esac

. /scripts/functions

## This is required to auto-rebuild degraded arrays with spares, instead of
## falling to emergency recovery mode.
## See https://bugs.debian.org/1094629 for details.
log_begin_msg "mdadm: Ensure that all arrays are running."
mdadm -q --run /dev/md?* || true

sleep 2
log_end_msg
```

Followed by:
    sudo chown -c root:root /etc/initramfs-tools/scripts/local-premount/mdadm
    sudo chmod -c u+x /etc/initramfs-tools/scripts/local-premount/mdadm
    sudo update-initramfs -u -k all
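
To double-check that the script actually ended up in the rebuilt image,
the image contents can be listed with `lsinitramfs` (shipped with
initramfs-tools). This is a minimal sketch: the function name
`initramfs_has_mdadm_hook` is invented here, and the path pattern assumes
that the script is stored under `scripts/local-premount/` inside the image.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the given initramfs image contains
# the local-premount/mdadm script.
# Assumes `lsinitramfs` (from initramfs-tools) is installed.
initramfs_has_mdadm_hook() {
    lsinitramfs "$1" | grep -q 'scripts/local-premount/mdadm$'
}
```

For example: `initramfs_has_mdadm_hook "/boot/initrd.img-$(uname -r)" && echo present`.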

That fixed the problem nicely for me.

I had to use this workaround on several machines.

I've thoroughly tested that approach on a few PCs by randomly
disconnecting disks and ensuring that the machines remain bootable,
with automatic mdraid recovery from the available spare.
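
For anyone repeating such a test, a quick way to see which arrays are
still degraded (including those currently rebuilding onto the spare) is
to compare the `[configured/working]` device counts in `/proc/mdstat`.
The following is a sketch only: `list_degraded_arrays` is a made-up name,
and the parsing assumes the usual `[2/1] [U_]` status format.

```sh
#!/bin/sh
# Hypothetical helper: print md arrays that have fewer working devices
# than configured, judged by the "[n/m]" field (e.g. "[2/1]").
# Optional argument: a file in /proc/mdstat format (default: /proc/mdstat).
list_degraded_arrays() {
    awk '
        # Remember the array name from header lines such as "md0 : active raid1 ...".
        $2 == ":" && $1 ~ /^md/ { name = $1 }
        # Status lines carry "[n/m]": n devices configured, m working.
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Once the spare has finished syncing, `list_degraded_arrays` should print
nothing; `cat /proc/mdstat` shows the rebuild progress in the meantime.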

I am not aware of any side effects of this approach.

-- 
Regards,
 Dmitry Smirnov
 GPG key : 4096R/52B6BBD953968D1B

---

Politics: a Trojan horse race.
 -- Stanisław Jerzy Lec
