Hi!

Summary: RAID set reconstruction fails with "error rewriting parity" for sets with non-root autoconfigure enabled, but works when autoconfigure is disabled. It looks like there is a bug when reading the component label.

Details:

I'm running an OpenBSD 3.9 GENERIC kernel with RAID enabled. That is, no changes other than the ones from raid(4):

    pseudo-device raid 4
    option RAID_AUTOCONFIG

I'm running a RAID 1 mirror across the master devices of both IDE channels. After a complete disk failure of wd1, I replaced the faulty drive and rebooted (the system came up in degraded mode on wd0 just fine), then used fdisk/disklabel to match the wd0 layout, which looks like the one in "Auto-configuration and Root on RAID" of raidctl(8):

    wd[01]a: minimal OpenBSD install (/bsd is a RAID-capable kernel)
    wd[01]e: raid0 (raid0a is /)
    wd[01]f: raid1 (raid1b is swap)
    wd[01]g: raid2 (raid2d is /usr, raid2e is /var, ...)

All RAID sets are set to autoconfigure, raid0 as root autoconfigure.

Then I tried to resync using the method from raidctl(8), bottom of "Dealing with Component Failures", i.e.:

    # raidctl -a /dev/wd1e raid0
    # raidctl -F component1 raid0
    # raidctl -a /dev/wd1f raid1
    # raidctl -F component1 raid1
    # raidctl -a /dev/wd1g raid2
    # raidctl -F component1 raid2

Only rebuilding the root-autoconfigured raid0 set succeeded. The non-root sets raid1 and raid2 failed with raidctl: ioctl (RAIDFRAME_GET_COMPONENT_LABEL).

Adding a spare did work:

    # raidctl -a /dev/wd1f raid1
    # raidctl -vs raid1
    raid1 Components:
               /dev/wd0f: optimal
              component1: failed
    Spares:
               /dev/wd1f: spare
    Component label for /dev/wd0f:
       Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
       Version: 2, Serial Number: 298644, Mod Counter: 657
       Clean: No, Status: 0
       sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
       Queue size: 100, blocksize: 512, numBlocks: 1024000
       RAID Level: 1
       Autoconfig: Yes
       Root partition: No
       Last configured as: raid1
    component1 status is: failed. Skipping label.
    /dev/wd1f status is: spare. Skipping label.
    Parity status: DIRTY
    Reconstruction is 100% complete.
    Parity Re-write is 100% complete.
    Copyback is 100% complete.
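
For anyone who wants to check the component state from a script while reproducing this, here is a rough sketch that lists every component not reported as "optimal". The function name list_degraded is made up for illustration (it is not part of raidctl), and it assumes the "Components:" text layout shown in the status output in this report; other raidctl versions may print something different.

```shell
#!/bin/sh
# Hypothetical helper, not part of raidctl: read `raidctl -s` output on stdin
# and print "<component> <status>" for each component that is not "optimal".
# Assumes the "... Components:" section layout shown in this report.
list_degraded() {
    awk '
        # The component section starts at a line ending in "Components:".
        /Components:[ \t]*$/ { in_comp = 1; next }

        # Inside the section, entries look like "<name>: <status>".
        in_comp && NF == 2 && $1 ~ /:$/ {
            if ($2 != "optimal") {
                name = $1
                sub(/:$/, "", name)
                print name, $2
            }
            next
        }

        # Any other line (e.g. "Spares:") ends the component section.
        { in_comp = 0 }
    '
}

# Usage (on the affected machine):
#   raidctl -s raid1 | list_degraded
```

Spares are deliberately ignored, since they are listed in their own section after the components.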

However, failing the component and reconstructing immediately did not work:

    # raidctl -F component1 raid1
    # raidctl -vs raid1
    raid1 Components:
               /dev/wd0f: optimal
              component1: reconstructing
    Spares:
               /dev/wd1f: used_spare
    Component label for /dev/wd0f:
       Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
       Version: 2, Serial Number: 298644, Mod Counter: 658
       Clean: No, Status: 0
       sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
       Queue size: 100, blocksize: 512, numBlocks: 1024000
       RAID Level: 1
       Autoconfig: Yes
       Root partition: No
       Last configured as: raid1
    component1 status is: reconstructing. Skipping label.
    raidctl: ioctl (RAIDFRAME_GET_COMPONENT_LABEL) failed

raidctl -F subsequently fails with "error rewriting parity".

To solve the problem, I disabled autoconfiguration for raid1 and raid2 with

    raidctl -A no raid[12]

and rebooted into the rescue installation (boot -a, selected wd0a). There, raidctl -F component1 raid1 succeeded, and so did the same command for raid2.

After reconstruction/parity rewrite, I did _not_ re-enable autoconfiguration for raid1 and raid2, i.e. only raid0 is configured, as:

    raidctl -A root raid0

The reboot succeeded and all RAID sets are fine, except that only raid0 is autoconfigured now.

Is this the correct behaviour/setup? raidctl(8) says about autoconfiguration: "RAID sets raid0, raid1, and raid2 are all marked as auto-configurable."

I hope this description is reasonably complete. If any additional information is required, I'd be happy to provide it.

Regards,
Walter