I just did a little experiment. I installed OpenBSD 5.0 on the same machine where I had my adventure with NetBSD. This time, I broke up the world into separate filesystems, which OpenBSD facilitates, mounting only /home and /tmp async,noatime. All the others were mounted softdep,noatime. I downloaded ports.tar.gz and un-tarred it into my home directory (I had previously un-tarred it into /usr). I then ran

    rm -rf ports

which takes a while. While that was running, I hit the power button. (I can afford to lose a filesystem containing only my home directory; it's backed up thoroughly, because I rsync it from one machine to another, and there are current copies on several other machines.) The system did a rapid shutdown without sync'ing the filesystems.

On restart, all the softdep-mounted filesystems had no errors in fsck, as one might expect (especially since, unlike /home, they had no intensive write activity going on when I improperly shut the system down). For /home, however, I got an "Unexpected inconsistency" error along with a request for a manual fsck, and the system dropped into single-user mode after the automatic fscks finished. I ran fsck on the filesystem that gets mounted as /home. It found a number of files and directories that were apparently half-deleted and asked me one by one whether I wanted to delete them. I said yes for a few, then used the 'F' option to do so without further interaction (I don't believe the NetBSD fsck gave me that option; it is not documented in the NetBSD fsck man page, while it *is* documented in the OpenBSD fsck man page). The fsck completed and marked the filesystem clean. I rebooted, everything mounted normally, and a check of my home directory shows everything intact, even most of the ports directory, whose deletion I deliberately interrupted.

The async warning in the OpenBSD mount man page reads as follows:

    async   Metadata I/O to the file system should be done
            asynchronously. By default, only regular data is
            read/written asynchronously. This is a dangerous flag to
            set since it does not guarantee to keep a consistent file
            system structure on the disk. You should not use this flag
            unless you are prepared to recreate the file system should
            your system crash. The most common use of this flag is to
            speed up restore(8) where it can give a factor of two speed
            increase.

"does not guarantee to keep a consistent file system structure on the disk" is what I expected from NetBSD. From what I've been told in this discussion, NetBSD pretty much guarantees that if you use async and the system crashes, you *will* lose the filesystem if there's been any writing to it for an arbitrarily long period of time, since apparently metadata for async filesystems doesn't get written as a matter of course. And then there's the matter of NetBSD fsck apparently not really being designed to cope with the mess left on the disk after such a crash. Please correct me if I've misinterpreted what's been said here (there have been a few different stories told, so I'm trying to compute the mean).

I am not telling the OpenBSD story to rub NetBSD people's noses in it. I'm simply pointing out that OpenBSD appears to be an example of ffs doing what I thought it did, and what I know ext2 and journal-less ext4 do: a very good job of putting the world back into operating order after a crash when async is used (without offering an impossible guarantee to do so). I point this out because I have been told that ffs and its fsck were not designed to do this.

The reason I'm beating on this is that I would have liked to use NetBSD for the application I have in mind, but I need the performance improvement that async provides (my tests show this; they also show that NetBSD async is about as fast as Linux and much faster than OpenBSD async, at least for doing a lot of writing, such as un-tarring a large tar file). Using async is practical if the joint probability of the system crashing *and* losing the async filesystem is low.

My one little data point was discouraging. Using a wireless card with a common chipset (Atheros) resulted in losing my network connection, and then the system crashed when I attempted to restart networking. (I had to use the Atheros card because the system didn't pick up the built-in Cisco wireless device, which I think is supposed to be served by the an(4) driver.) The crash took out the filesystem, as we've been discussing.

So I'd love it if my experience encourages someone to improve NetBSD ffs and fsck to make use of async practical, perhaps by drawing on what OpenBSD has done. I also realize that my situation is unusual and that, with resources being scarce, there are a lot more important things to work on that will affect a lot more people. But I'd at least like to get it in the queue.

/Don Allen