Hi All, I wanted to post an update with some further research on the topic.
You can start on the thread here, if you wish: https://marc.info/?l=openbsd-bugs&m=175795080000738&w=2

Or, too long; didn't read: I found that softraid RAID 1, with one drive or two in the mirror, practically guarantees FFS corruption when power is lost during heavy writes. I found that if you mount FFS with sync, without RAID 1, it's very reliable. I have not yet been able to get FFS mounted with sync into a state that the automatic fsck cannot repair. However, mounted async, the default, it happens readily under heavy writes. Most of my heavy write testing has involved Monero syncing the blockchain, often with Bitcoin at the same time, dd of small files, dd of large files, bonnie++, and iozone. But even when mounted sync, which is stable without RAID 1, RAID 1 would almost always leave the partition not automatically fsckable, requiring manual intervention, and sometimes so corrupt that kernel panics resulted.

Now, I had been testing with SSDs and HDDs thus far. These were all, to my knowledge, 4K sector drives with 512 byte sector emulation (512e). I had a very far-fetched idea that perhaps the 4K sectors were somehow involved. And this appears to be true from my initial testing.

So, I pulled out some old 512 byte sector drives and tested this. If I mount my partitions with sync, on RAID 1 with two 512 byte sector drives in a healthy mirror, I cannot get FFS corruption to the level where the automatic fsck is incapable of fixing it. This is after pulling the power three times under very heavy writes. These are the same conditions under which bare FFS (no RAID) mounted async, or RAID 1 + FFS mounted sync or async on the 512e drives, shows notable corruption.

Thus, it appears the corruption issue is related to softraid + 4K sector drives. So far it appears specific to 512e; whether 4K native drives are impacted or not, I have no idea. To be clear, both 512 and 512e drives are fine with FFS on its own, provided sync is used.
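
For anyone who wants to try reproducing this, the dd portion of the write load can be sketched roughly as below. This is an illustrative sketch only, not the exact commands used in the tests above: the function name write_load, the 4 KiB small-file size, and the argument layout are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical write-load sketch: many small files plus one large file.
# write_load DIR SMALL_COUNT LARGE_MB
#   DIR          directory on the filesystem under test (created if missing)
#   SMALL_COUNT  number of 4 KiB files to write (size chosen for illustration)
#   LARGE_MB     size of the single large file, in MiB
write_load() {
    dir="$1"
    small="$2"
    large_mb="$3"

    mkdir -p "$dir" || return 1

    # Many small files: one 4 KiB file per iteration.
    i=0
    while [ "$i" -lt "$small" ]; do
        dd if=/dev/zero of="$dir/small.$i" bs=4096 count=1 2>/dev/null || return 1
        i=$((i + 1))
    done

    # One large file, written in 1 MiB blocks.
    dd if=/dev/zero of="$dir/large.bin" bs=1048576 count="$large_mb" 2>/dev/null || return 1
}
```

Calling something like write_load /mnt/test 100000 4096 repeatedly on the filesystem under test, and cutting power mid-run, would approximate the dd part of the conditions described; /mnt/test and the counts are placeholders. It does not cover the Monero, Bitcoin, bonnie++, or iozone load.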
Crystal Kolipe shared some great wisdom in this thread about RAID 1 that still applies, but nevertheless I was curious and wished to test further. Can someone else try to reproduce this? I would like to test the CRYPTO discipline in a similar manner, and perhaps RAID 1C.

Just to note, so far all testing has been done on OpenBSD 7.7.

Thank you for reading.

-Henrich
