Re: UFS2 fsck Question (semantics of -p)
We ran our experiment on top of a very simple RAM disk that has no caches of any kind.

The dmesg log is at:
http://keeda.stanford.edu/dmesg

The resulting images are at:
http://keeda.stanford.edu/ufs-umount-image
http://keeda.stanford.edu/ufs-mount-sync-image

If you run fsck -p on them, fsck will not be able to recover, while fsck without the -p option will.

Can

On Aug 29, 2006, at 9:55 AM, Chuck Swiger wrote:

Can Sar wrote:
[ ... ]
Would you consider it an error if the -p option does not fix inconsistencies caused by a simple power failure, without any hardware or software corruption?

You're asking an interesting question, but the issue of data integrity depends not only on the software which comprises the OS, but also on the hardware being used.

In particular, the system depends upon the hard drives to reliably report when data being written actually has been written. SCSI drives using tagged command queuing, especially in conjunction with a battery backup which ensures the drive stays up long enough to flush its write cache even if system power is removed, will tend to fare pretty well. IDE drives, by contrast, have a bad habit of lying about whether data has actually been written to the disk itself rather than simply making it to the write cache on the drive. (Specifically, such drives ignore the ATA "FLUSH CACHE" command.)

In other words, showing that a filesystem can become inconsistent in a fashion that "fsck -p" cannot correct is interesting and a concern regardless of the circumstances, but showing it in cases where you are using battery-backed drives and/or SCSI rather than IDE is a lot more meaningful. If you are using IDE devices, your testing will be more meaningful if you disable the IDE write cache entirely.

Also, you should put your results somewhere, perhaps on a webpage with links to the filesystem images and a complete dmesg, so that the OS version and hardware being used are well documented.
-- 
-Chuck

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
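For anyone who wants to repeat the comparison described above on the posted images, here is a minimal sketch in Python. It assumes a FreeBSD host with mdconfig(8) and fsck_ffs(8) available and root privileges. The helper names (`attach_cmd`, `fsck_cmd`, `run_fsck_on_copy`, and so on) are made up for illustration and are not part of any tool mentioned in the thread. Each run works on a fresh copy of the image so that a partial repair by one mode cannot influence the other.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def attach_cmd(image_path):
    """mdconfig invocation that attaches a file-backed memory disk."""
    return ["mdconfig", "-a", "-t", "vnode", "-f", image_path]


def detach_cmd(unit_number):
    """mdconfig invocation that detaches the memory disk again."""
    return ["mdconfig", "-d", "-u", unit_number]


def fsck_cmd(device, preen):
    """fsck_ffs in preen mode (-p), or in full mode answering yes (-y)."""
    return ["fsck_ffs", "-p" if preen else "-y", device]


def unit_number(md_name):
    """Turn the name printed by mdconfig (e.g. 'md3') into '3'."""
    return md_name[2:] if md_name.startswith("md") else md_name


def run_fsck_on_copy(image_path, preen):
    """Run one fsck mode on a scratch copy and return its exit status."""
    with tempfile.TemporaryDirectory() as tmp:
        scratch = Path(tmp) / "image"
        shutil.copyfile(image_path, scratch)
        attached = subprocess.run(
            attach_cmd(str(scratch)), check=True,
            capture_output=True, text=True,
        ).stdout.strip()
        try:
            return subprocess.run(fsck_cmd("/dev/" + attached, preen)).returncode
        finally:
            subprocess.run(detach_cmd(unit_number(attached)), check=True)


def compare(image_path):
    """Return (preen_status, full_status) for one image."""
    return (run_fsck_on_copy(image_path, preen=True),
            run_fsck_on_copy(image_path, preen=False))
```

Calling `compare("ufs-umount-image")` on a downloaded image would return the two exit statuses; a non-zero first value next to a zero second value would match the behaviour reported above.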
Re: UFS2 fsck Question (semantics of -p)
Can Sar wrote:
[ ... ]
Would you consider it an error if the -p option does not fix inconsistencies caused by a simple power failure, without any hardware or software corruption?

You're asking an interesting question, but the issue of data integrity depends not only on the software which comprises the OS, but also on the hardware being used.

In particular, the system depends upon the hard drives to reliably report when data being written actually has been written. SCSI drives using tagged command queuing, especially in conjunction with a battery backup which ensures the drive stays up long enough to flush its write cache even if system power is removed, will tend to fare pretty well. IDE drives, by contrast, have a bad habit of lying about whether data has actually been written to the disk itself rather than simply making it to the write cache on the drive. (Specifically, such drives ignore the ATA "FLUSH CACHE" command.)

In other words, showing that a filesystem can become inconsistent in a fashion that "fsck -p" cannot correct is interesting and a concern regardless of the circumstances, but showing it in cases where you are using battery-backed drives and/or SCSI rather than IDE is a lot more meaningful. If you are using IDE devices, your testing will be more meaningful if you disable the IDE write cache entirely.

Also, you should put your results somewhere, perhaps on a webpage with links to the filesystem images and a complete dmesg, so that the OS version and hardware being used are well documented.

-- 
-Chuck
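On FreeBSD releases of this era, the ata(4) driver reads the `hw.ata.wc` loader tunable, and setting it to 0 in /boot/loader.conf turns the IDE write cache off at boot. The sketch below is one way a test harness might confirm that setting before a run. The function name `ata_write_cache_disabled` is hypothetical, and the parser only handles simple `name="value"` lines, which is an assumption about how the file is written.

```python
def ata_write_cache_disabled(loader_conf_text):
    """Return True if the text sets hw.ata.wc to 0 (last assignment wins)."""
    setting = None
    for raw in loader_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == "hw.ata.wc":
            setting = value.strip().strip('"')
    return setting == "0"
```

A harness could read /boot/loader.conf, pass its contents to this function, and refuse to start a crash test on IDE hardware when the result is False.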
UFS2 fsck Question (semantics of -p)
Hi,

I work on a project that automatically finds crash recovery errors in storage systems (by dynamically running the system), and we are beginning to do some preliminary checking of FreeBSD. We found an "error" where, on power failure, the disk can get corrupted even after an operation has returned successfully on a synchronous mount. Fsck run with the -p option cannot fix this error, while fsck run without it does happen to fix it (in this particular test case). However, the fsck man page says the following:

"The kernel takes care that only a restricted class of innocuous file system inconsistencies can happen unless hardware or software failures intervene. These are limited to the following:

      Unreferenced inodes
      Link counts in inodes too large
      Missing blocks in the free map
      Blocks in the free map also in files
      Counts in the super-block wrong

These are the only inconsistencies that fsck_ffs with the -p option will correct; if it encounters other inconsistencies, it exits with an abnormal return status and an automatic reboot will then fail. For each corrected inconsistency one or more lines will be printed identifying the file system on which the correction will take place, and the nature of the correction. After successfully correcting a file system, fsck_ffs will print the number of files on that file system, the number of used and free blocks, and the percentage of fragmentation."

Would you consider it an error if the -p option does not fix inconsistencies caused by a simple power failure, without any hardware or software corruption?

I have two example ufs2 images of such errors that you can download:
http://keeda.stanford.edu/ufs-umount-image
http://keeda.stanford.edu/ufs-mount-sync-image

Thank you very much for your help,
Can Sar
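The rule quoted from the man page can be read as a simple policy: preen mode repairs five classes of inconsistency and gives up on anything else. The sketch below models only that policy in Python. It is not how fsck_ffs is implemented, and the string labels and the `preen_outcome` name are invented for illustration.

```python
# The five classes the man page says fsck_ffs -p will correct.
PREEN_FIXABLE = frozenset({
    "unreferenced inode",
    "link count too large",
    "missing block in free map",
    "free-map block also in file",
    "wrong super-block count",
})


def preen_outcome(found):
    """Return (ok, unexpected) for the inconsistencies found on one file system.

    ok is True when every item is in the fixable set; unexpected lists the
    items that would make preen mode exit with an abnormal status.
    """
    unexpected = sorted(set(found) - PREEN_FIXABLE)
    return (not unexpected, unexpected)
```

Under this model, the images in question would be cases where a plain power failure produces at least one item outside `PREEN_FIXABLE`, which is what the man page says should not happen without a hardware or software failure.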