On Sun, Sep 07, 2003 at 07:24:27PM +0200, Arnt Karlsen wrote:
> > What happens on error conditions can be set through tune2fs or as a
> > mount option. Having it remount read-only is probably better than
> > panicking the kernel.
>
> ..yeah, except in /var/log, /var/spool et al, I also lean towards
> panic in /home.

I tend to use the remount read-only feature on desktops, where it's
useful for me to be able to save my work on some other filesystem
before I reboot my system. But for an unattended server, most of the
time it's probably better to force the system to reboot so you can
restore service ASAP.

> > When it happens a reboot may be a good idea, in which case a fsck to
> > fix the problem should occur automatically.
>
> ..should, agrrrRRRRRRRrrreed. IME (RH73 - RH9 and woody) it does not.
>
> ..what happens is the journaling dies, leaving a good fs intact,
> on rebooting, the dead journal will "repair" the fs wiping good
> data off the fs.

I'm not sure what you mean by this. When there is a filesystem error
detected, all writes to the filesystem are immediately aborted, which
means the filesystem on disk is left in an unstable state. (It may
look consistent while the system is still running, but there is a lot
of uncommitted data which has not been written out to disk.) So in
general, not running the journal will leave you in a worse state after
rebooting, compared to running the journal.

An alternative course of action, which we don't currently support,
would be to attempt to write everything to disk and quiesce the
filesystem before remounting it read-only. The problem is that trying
to flush everything out to disk might leave things in a worse state
than just freezing all writes.

The real problem is that in the face of filesystem corruption, by the
time the filesystem notices that something is wrong, there may be
significant damage that has already taken place. Some of it may
already have been written to the journal, in which case not replaying
the journal might leave you with more data to recover; on the other
hand, not replaying the journal could also risk leaving your
filesystem very badly corrupted, with data which the mail server had
promised it had accepted never actually getting saved by the
filesystem.

A human could make a read/write snapshot of the filesystem and try it
both ways, but if you want automatic recovery, it's probably better to
run the journal than not to run it.

> ..the errors=remount,ro fstab option remounts the fs ro but fails
> to tell the system, so the system merrily "logs" data and "accepts"
> mail etc 'till Dooms Day, and especially on raid-1 disks I sort of
> expected redundancy, like in "autofeather the bad prop and trim out
> the yaw" and "autopatch that holed fuel tank", and "auto-sync the
> props", I mean, this was done _60_years_ ago in aviation to help
> win WWII, and ext3 on raid-1 floats around USS Yorktown-style???

If the system merrily logs data and accepts it, even after the
filesystem is remounted read-only, that implies that the MTA is
horribly buggy, not doing the most basic of error return code checks.
If the filesystem is remounted read-only, then writes to the
filesystem *will* return an error. If the application doesn't notice,
then it's the application which is at fault, not ext3.

That being said, my preference for servers is to panic immediately on
the first sign of trouble, and let the system fsck and come back
again. Even if your MTA is non-criminally-negligent, and checks error
codes, the best it can do is return an SMTP temporary failure, which
still doesn't keep the mail flowing. You're probably best off
rebooting the machine and restoring service.

						- Ted
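
To make the point about error return codes concrete, here is a minimal
sketch in Python of what "checking" means for an application that
spools data to disk. It is not taken from any real MTA: the names
store_message and smtp_reply, the spool layout, and the exact reply
strings are all assumptions made for illustration. The idea is only
that every step (open, write, fsync, close) can fail, for example with
EROFS or EIO once the filesystem has gone read-only, and that the
caller should turn any such failure into a temporary failure rather
than claim the message was accepted.

```python
import errno
import os


def store_message(spool_dir, name, data):
    """Write data (bytes) to spool_dir/name and force it to disk.

    Hypothetical helper: returns 0 on success, or the errno value of
    the first step that failed (for example errno.EROFS when the
    filesystem is mounted read-only).
    """
    path = os.path.join(spool_dir, name)
    try:
        # O_EXCL: refuse to overwrite an existing spool file.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        try:
            written = 0
            # os.write may write fewer bytes than requested, so loop.
            while written < len(data):
                written += os.write(fd, data[written:])
            # Do not report success until the data has been flushed.
            os.fsync(fd)
        finally:
            os.close(fd)
    except OSError as e:
        return e.errno
    return 0


def smtp_reply(err):
    """Map the result of store_message to an illustrative SMTP reply."""
    if err == 0:
        return "250 OK"
    # Any local storage error becomes a temporary (4xx) failure, so the
    # sending side keeps the message and retries later.
    name = errno.errorcode.get(err, str(err))
    return "451 Requested action aborted: local error in processing (" + name + ")"
```

A caller would do something like smtp_reply(store_message(spool, msgid,
body)) and send the result back to the client. As noted above, this
only stops the application from lying about what it has stored; it
does nothing to restore service.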