Date:        Thu, 29 Jan 2026 14:18:18 +0100
    From:        Michael van Elst <[email protected]>
    Message-ID:  <[email protected]>

  | On Thu, Jan 29, 2026 at 07:35:17AM -0500, [email protected] wrote:
  | > Can you confirm that a non-autoconfigured RAID set is not dependent on
  | > the order of devices listed in the DISKS section of the "raidN.conf" file?
  |
  | It wouldn't depend on the exact order, but on the devices listed.

Are you sure about that?

Certainly (at least as I read the sources) the order listed in the
config file (when autoconfiguration is not used) is always the order
in which the columns are configured (the first device listed becomes
column 0, the second column 1, etc).   I can believe that which device
is which column might not be all that important to a mirror (raid
level 1), but I would assume it is critical to a ccd type raid (raid
level 0), and I don't understand the internal operations of the more
complex raid levels well enough to know for which of them, if any, it
matters.
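
To make that concrete, here is a minimal sketch of a raidN.conf for a
two component mirror (device names and geometry invented for the
example; check raidctl(8) on your version for the exact array-section
syntax) - the position of each line in the disks section is what
assigns its column:

    START array
    # numCol numSpare   (older raidctl expects "numRow numCol numSpare")
    2 0

    START disks
    # column 0
    /dev/sd0e
    # column 1
    /dev/sd1e

    START layout
    # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
    128 1 1 1

    START queue
    fifo 100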

But raidctl(8) does say:

     Note that it is imperative that the order of
     the components in the configuration file does not change between
     configurations of a RAID device.  Changing the order of the components
     will result in data loss if the set is configured with the -C option.  In
     normal circumstances, the RAID set will not configure if only -c is
     specified, and the components are out-of-order.

which suggests (unless someone is randomly doing raidctl -C on an already
configured raid set) that having the disks listed in the wrong order is
safe enough - the raid set simply won't configure in that case (so a
little trial and error with the ordering in the config file should fix
things).
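
In other words, something like the following is safe to experiment
with, while the forced form is the one to avoid (the device and file
names here are just placeholders):

    # -c refuses to configure the set if the component labels say
    # the order in the config file is wrong
    raidctl -c /etc/raid0.conf raid0

    # -C forces the configuration regardless - on a set holding data,
    # never use this just to "fix" an ordering complaint
    #raidctl -C /etc/raid0.conf raid0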

We should probably have a tool that dumps the (important parts of the)
raidframe component labels of unconfigured component devices, so that
which device belongs to which raid set, and which column number it is,
can be ascertained without resorting to guesswork, but at present
(aside from hexdump (or od) of the device) I don't believe we have one.
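
The best one can do today is something like the following (a guess, not
a real tool - I believe the component label lives somewhere in the
reserved sectors at the start of the component partition, so dump those
and stare at the numbers; rsd0e is a placeholder for the raw device of
one component):

    dd if=/dev/rsd0e bs=512 count=64 2>/dev/null | od -A d -t d4 | less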

What I'd recommend is to turn on RAID autoconfiguration on the old system
before removing the devices (if you like, just immediately before doing
so - no need to reboot and use it, just shut down and remove the drives).
Then, when the drives are first connected to the new system (or to the
current one via different hardware, whichever it is), autoconfiguration
will operate and configure the raidframe set properly (autoconfig doesn't
care which order the components are discovered in; it orders them
according to their raidframe component labels).   At that point, you can
simply "raidctl -s raidN" to find out which device name is associated
with which column of the raid set, use that info to populate the
raidN.conf file, and then disable autoconfiguration again, if that's how
you prefer (or need) to run things.
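
In terms of commands, that is roughly (raid0 standing in for whichever
raidN it really is):

    # on the old system, before pulling the drives
    raidctl -A yes raid0

    # on the new system (or new controller), after autoconfig has
    # assembled the set at boot: see which device is in which column
    raidctl -s raid0

    # copy the components, in column order, into /etc/raid0.conf,
    # then turn autoconfiguration back off if that's how you run things
    raidctl -A no raid0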

  | When constructing the RAID it validates that all disks share the
  | same serial number. If a single disk has a different label, it's
  | treated as a failed component and the raid comes up as degraded.

That wasn't relevant to the question asked, or the situation asked about,
which was taking a (working) raidframe set from one system and plugging
it in again (elsewhere?) using an entirely different connection mechanism,
where the order in which the device names will be assigned isn't clear
(eg: usb sd devices sometimes seem to be given numbers based upon a coin
toss (more likely timing) - boot the system twice, with enough usb drives
connected, without changing the hardware, and the sdN names aren't always
the same from one boot to the next).

  | N.B. raidctl understands wedge names, if the disks are using e.g.
  | a GPT, you could give them unique names to avoid any ambiguity.

This I highly support - but in the situation in question, it might not
be practical.  If the existing drives don't already have a GPT, they
can only be converted if there is sufficient space to do so.  At the
start of each drive there is probably no issue: a minimal GPT needs
only a few sectors (use gpt create/migrate -p 4), and alignment
considerations usually mean there are plenty free - but that same
number of sectors (minus one, as the PMBR is not duplicated) must also
be free at the end of the drive for the backup GPT, and existing
MBR/disklabel based drives don't necessarily have that.

But if GPT can be used, having the labels indicate which raid set, and
which column, each partition belongs to makes all of this trivially easy.
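
For the record, the conversion and labelling would look something like
this (wd1, the partition index, and the label string are just examples -
run "gpt show wd1" first and make sure the free space, and which
partition is the raid component, are what you expect):

    # convert an existing MBR/disklabel drive, keeping the GPT small
    gpt migrate -p 4 wd1
    # (or "gpt create -p 4 wd1" on a drive being set up from scratch)

    # name the raid component partition after its set and column
    gpt label -i 1 -l raid2_c0 wd1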

My config looks like:

raid2:                  L1 NAME=wd1_raid2_1 NAME=ccd1_raid2_2
raid3:                  L1 COMPONENT_FAILED NAME=wd6_raid3_2
raid4:                  L1 NAME=ld0_raid4_1 NAME=ld1_raid4_2
raid5:                  L5 NAME=Raid5_C0 NAME=Raid5_C1 NAME=Raid5_C2
raid15:                 L5 NAME=Raid15-C0 NAME=Raid15-C1 NAME=Raid15-C2

[Aside: that output is from a modified version of mouse's raidstat script,
the large white space area would say "AutoConf" (or similar) in appropriate
cases.]

You can tell I swapped label naming styles, to eliminate use of the
device names ... what is there as wd6, for example, is actually wd4
now, and having the label say wd6 can be confusing (that's a different
drive).   But all of those have the raid set number, and column number,
as part of the label (though again, the older labels use base 1
numbering for the columns, rather than the base 0 that raidframe uses,
which is also confusing, so that changed too).   I should update all
the old labels to the newer style.

[Aside: the COMPONENT_FAILED for raid3 isn't really a failed component,
it is "absent" - raid3 isn't in use yet; the plan is to populate it as
is with copies of existing partitions, and then repurpose the space
where the originals are now to become that absent component, but that's
been waiting to happen for more than a year now...]

Anyway, using wedge labels, and naming like the above, it is almost
impossible to get the order incorrect (accidentally) in the config file.
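
For completeness, the disks section of such a config file just lists the
wedge names in column order (assuming a raidctl recent enough to accept
the NAME= syntax in the config file, which the -s output above suggests;
the names here are the invented ones from the gpt example):

    START disks
    # column 0
    NAME=raid2_c0
    # column 1
    NAME=raid2_c1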

I am also not using autoconfig - I have 2 SATA controllers, and the
(SATA) raid sets are all split between them.  In the early days of this
system, the 2nd controller would sometimes simply "vanish" (act as if
it was not present) until booted with a bizarre firmware config (with
which NetBSD wouldn't actually work), after which it would remain
present until the next power off (ie: it survived resets).  I got tired
of my system booting and autoconfiguring the raids, all the SATA ones
with a missing component, which then needed (lengthy) reconstruction
after the firmware dance was done to make the 2nd SATA controller
reappear.  So I disabled raid autoconfig and added an rc.d script that
runs very early and checks whether the 2nd SATA controller was detected
by the kernel, and if not, simply aborts the boot.  That runs before
rc.d/raidframe, so no raid config ever happens if the 2nd controller is
missing, and the mod counters don't get out of sync because of that.
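
For anyone wanting to do something similar, a sketch of such a script
would be along these lines (untested; "ahcisata1" is just a stand-in for
whatever the 2nd controller attaches as, and the rcorder dependency is
for illustration - check it against your rc.d setup):

    #!/bin/sh
    #
    # PROVIDE: check_sata2
    # BEFORE: raidframe

    $_rc_subr_loaded . /etc/rc.subr

    name="check_sata2"
    start_cmd="${name}_start"
    stop_cmd=":"

    check_sata2_start()
    {
            # If the kernel never attached the 2nd controller, stop the
            # boot here, before rc.d/raidframe has any chance to
            # configure the sets with a missing component.
            if ! /sbin/drvctl -p ahcisata1 > /dev/null 2>&1; then
                    echo "2nd SATA controller not detected, aborting boot."
                    stop_boot
            fi
    }

    load_rc_config $name
    run_rc_command "$1"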

I should undo all that now: that controller has been replaced, and since
then the problem described has never happened again.   However, inertia
rules.

kre

