I hit this yesterday on a 64-bit system running the amd64 kernel, version
3.2.60-1+deb7u3. I had a RAID-1 array with two SATA disks attached, one of
them nearing failure with a non-zero reallocated sector count. This array
serves as a physical volume for LVM, along with another similar array. On
the LVM VG, there's a root LV with XFS.
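
In case it helps anyone reproduce the layout, a stack like this can be
inspected with the standard tools (device names here are hypothetical):

    cat /proc/mdstat          # array states and member devices
    mdadm --detail /dev/md0   # per-array detail, including sync status
    pvs && vgs && lvs         # the LVM layers sitting on top of the arrays
    smartctl -a /dev/sda      # SMART data, incl. Reallocated_Sector_Ct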

I decided to rebuild onto a new disk by adding it as a third device to the
array via a USB-SATA adapter, planning to replace the failing one after the
rebuild. I thought this would carry less risk of data loss, since the array
would at no point run without redundancy. Then I hit this issue, and some
of the running programs started reporting corrupt data. I panicked and shut
down the machine before further corruption.
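
As far as I remember, what I ran was roughly the following (device names
are from memory, so treat them as hypothetical):

    mdadm /dev/md0 --add /dev/sdc1          # new disk on the USB-SATA adapter, added as a spare
    mdadm --grow /dev/md0 --raid-devices=3  # promote the spare to a third mirror; resync starts

The corruption showed up while the resync onto the USB-attached disk was
running.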

With only the two SATA disks attached, everything seems to work normally.
Corruption was left in files (logs, etc.) that were written while this
issue was happening, but otherwise there doesn't seem to be any corruption,
and the filesystem mounted cleanly.

My googling around suggests this is a kernel bug that is unlikely to be
fixed in the short term. I think some sort of warning about creating
heterogeneous md arrays should be added to mdadm, so that it would be
harder to put the system into a state that corrupts data. Apparently even
this configuration would have been fine if I had done it while the array
wasn't running, but I didn't think such a precaution was necessary.
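
For comparison, the conventional replacement sequence would have been
something like the following (device names hypothetical); it leaves the
array running degraded during the rebuild, which is exactly the window I
was trying to avoid:

    mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1   # drop the failing member first
    mdadm /dev/md0 --add /dev/sdc1                       # rebuild onto the replacement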

-- 
Markus Vuorio
