On Wed, Sep 10, 2003 at 01:36:32AM +0200, Arnt Karlsen wrote:
> > But for an unattended server, most of the time it's probably better to
> > force the system to reboot so you can restore service ASAP.
>
> ..even for raid-1 disks??? _Is_ there a combination of raid-1 and
> journalling fs'es for linux that's ready for carrier grade service?
I'm not sure what you're referring to here. As far as I'm concerned, if the filesystem is inconsistent, panic'ing and letting the system get back to a known state is always the right answer. RAID-1 shouldn't be an issue here, unless you're talking about *software* RAID-1 under Linux and the fact that you have to rebuild the mirror after an unclean shutdown; but that's arguably a defect in the software RAID-1 implementation. On other systems, such as AIX's software RAID-1, the mirror is implemented with a journal, so there is no need to rebuild it after an unclean shutdown. Alternatively, you could use a hardware RAID-1 solution, which also wouldn't have a problem with unclean shutdowns. In any case, the speed hit for doing a panic with the current Linux MD implementation is a performance issue, and in my book reliability takes precedence over performance.

So yes, even for RAID-1, and no matter what filesystem: if there's a problem, you should reboot. If you don't like the resulting performance hit after the panic, get a hardware RAID controller.

> > I'm not sure what you mean by this. When there is a filesystem error
>
> ..add an "healthy" dose of irony to repair in "repair". ;-)
>
> > detected, all writes to the filesystem are immediately aborted, which
>
> ...precludes reporting the error?

No; if you are using a networked syslog daemon, it certainly does not preclude reporting the error. If you mean the case where there is a filesystem error on the partition where /var/log resides, then yes, we consider it better to abort writes to the filesystem than to attempt to write the log message to a compromised filesystem.

> .._exactly_, but it is not reported to any of the system users.
> A system reboot _is_ reported usefully to the system users, all
> tty users get the news.

The message that a filesystem has been remounted read-only is logged as a KERN_CRIT message.
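Since the remount-read-only report goes out at KERN_CRIT, a syslog.conf fragment along these lines would both warn all logged-in users and forward the message off-box. This is only a sketch in classic sysklogd syntax, and the loghost name is illustrative:

```
# /etc/syslog.conf fragment (sysklogd syntax)
# Write kernel messages of priority crit and above to all logged-in users...
kern.crit	*
# ...and also forward them to a central loghost, so the report survives
# even if the local /var/log filesystem has been remounted read-only.
kern.crit	@loghost.example.com
```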
If you wish, you can configure your syslog.conf so that all tty users are notified of kern.crit level errors. That's probably a good thing, although it's not clear that a typical user will understand what to do when they are told that a filesystem has been remounted read-only. Certainly it is trivial to configure sysklogd to grab that message and do whatever you would like with it, if you were to so choose. If you want to "honk the big horn", it is certainly within your power to make the system do that. If you believe that Red Hat should configure their syslog.conf files to do this by default, feel free to submit a bug report / suggestion with Red Hat.

> > of uncommitted data which has not been written out to disk.) So in
> > general, not running the journal will leave you in a worse state after
> > rebooting, compared to running the journal.
>
> ..it appears my experience disagrees with your expertize here.
> With more data, I would have been able to advice intelligently
> on when to and when not to run the journal, I believe we agree
> not running the journal is adviceable if the system has been
> left limping like this for a few hours.

How long the system has been left limping doesn't really matter. The real issue is that there may be critical data that was written to the journal but not to the filesystem before the journal was aborted and the filesystem left in a read-only state. This might, for example, include a user's thesis or several years' worth of research. (Why such work might not be backed up is a question I will leave for another day, and falls into the "criminally negligent system administrator" category....)

In general, you're better off running the journal after a journal abort. You may think you have experiences to the contrary, but are you sure? Unless you snapshot the entire filesystem and try it both ways, you can't really know for sure.
There are classes of errors where the filesystem has been completely trashed, and whether or not you run the journal won't make a bit of difference. The much more important question is to figure out why the filesystem got trashed in the first place. Do you have marginal memory? Marginal hard drives? Are you running a beta-test kernel that might be buggy? Fixing the proximate cause is always the most important thing to do, since no matter how clever the filesystem, if you have buggy hardware or buggy device drivers, in the end you *will* be screwed. A filesystem can't compensate for those sorts of shortcomings.

> ..and, on a raid-1 disk set, a failure oughtta cut off the one bad
> fs and not shoot down the entire raid set because that one fs fails.

I agree. When is that not happening?

> ..sparse_super is IMNTHOAIME _not_ worth the saved disk space,
> and should _not_ be the default setup option.

Interesting assertion. I disagree; if you'd like to back up this assertion with some arguments, I'll be happy to discuss it. I will note that a 128 meg filesystem still has half a dozen backup superblocks, which should be more than enough to recover from a disk error. For truly large filesystems without sparse_super, the disk space consumed is O(n**2) in the number of block groups, which means that for a filesystem which is 64GB and using 1k blocks (say because it is used for storing Usenet articles), 50% of the space --- 32GB out of the 64GB --- will be consumed by backup copies of the superblock and block group descriptors. There is a very good reason why sparse_super is turned on by default.

> ..180 days is IMNTHOAIME _much_ too long between fsck's. Reboots
> defeats the point with /usr/bin/uptime and cause downtime, too.

This is configurable, and ultimately is up to each individual system administrator. Many people complain bitterly about the forced fsck checks. I will note that much depends on your hardware.
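For reference, the forced-check schedule is adjustable per filesystem with e2fsprogs' tune2fs. A sketch (the device name is illustrative):

```
# Check at most every 25 mounts or every 30 days, whichever comes first.
# -c sets the maximum mount count, -i the time interval (d/w/m suffixes).
tune2fs -c 25 -i 30d /dev/sda1

# Or disable the time-based check entirely and rely on the mount count alone.
tune2fs -i 0 /dev/sda1
```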
If you have quality hardware, and you're running a good, stable, well-tested production kernel, in practice fsck should never turn up any errors, and how often you run it is simply a function of how paranoid you're feeling. You should not be depending on fsck to find problems; if you are, then there's something desperately wrong with your system, and you should find and fix it.

If you are using EVMS or some other system where you can take read-only snapshots, something which you *can* do is to periodically (Tuesday morning at 3am, for example) have a cron script take a read-only snapshot, run e2fsck on the snapshot, and then discard the snapshot. If the e2fsck returns no errors, you can use tune2fs to set the last-checked time on the mounted filesystem.

- Ted
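As an aside, that snapshot-and-check procedure might be scripted roughly as follows. This is a sketch only: it assumes LVM2 tooling as a stand-in for EVMS, and the volume group, logical volume, and snapshot names are all hypothetical.

```shell
#!/bin/sh
# Sketch: periodic background fsck via a read-only snapshot (cron-driven).
# Assumes LVM2; the VG/LV names below are placeholders.
set -e

VG=vg0
LV=home
SNAP=${LV}-check

# Take a small copy-on-write snapshot of the mounted filesystem.
lvcreate --snapshot --size 1G --name "$SNAP" "/dev/$VG/$LV"

# Check the snapshot without modifying it; -f forces a full check,
# -n opens the filesystem read-only and answers "no" to all questions.
if e2fsck -f -n "/dev/$VG/$SNAP"; then
    # Clean: record a fresh "last checked" time on the real filesystem.
    tune2fs -T now "/dev/$VG/$LV"
else
    logger -p kern.crit "background e2fsck of /dev/$VG/$LV found errors"
fi

# Discard the snapshot either way.
lvremove -f "/dev/$VG/$SNAP"
```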