>>
>>
>> It's pretty rare to corrupt a Linux filesystem by pulling
>> the plug on it. There's a mount option for synchronous
>
> My observation has been that it is very rare, but not impossible. We
> routinely cut the power of a large number of Linux boxes. In three
> years of this, I have twice seen a situation where the default RedHat
> startup fsck sequence couldn't recover things and dumped the system
> into a single-user shell. The systems were recovered without problems
> from there. The script can be tweaked to do an 'fsck -y' no matter
> what, if you prefer.
>
> Just remember to do sync (three times for the superstitious) if you
> have a situation where you are modifying a lot of files and then expect
> to have a power shutdown directly after this.
Or use the 'sync' mount option.
Also I would hope that anyone building a turnkey or
embedded Linux system would roll their own distribution (or
at least go over the rc* scripts with the proverbial
fine-toothed comb).
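For reference, synchronous writes can be forced per-filesystem
from /etc/fstab; the device name, mount point, and fs type in
this fragment are just examples:

```shell
# /etc/fstab fragment -- mount the data partition with synchronous
# I/O; the device and mount point here are illustrative
/dev/hda3   /data   ext2   sync   1   2
```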
>> I/O that reduces the chances even more (and exacts a
>> corresponding cost in fs performance).
>>
>> It's also pretty easy to construct an initrd and have a
>> script/program that automatically rebuilds a filesystem if
>> it does get corrupted. (You can store a 'tar.gz' file on an
>> unused partition and restore whole filesystems from there,
>> among many options).
>
> Exactly. You can fit enough utilities on a single-floppy system
> (see bootdisk howto) to mke2fs the active partitions, re-initialize
> swap, and untar the images from the spare partition. Remember to
> set the umask to 000 before you untar.
>
> One note. I couldn't get fdisk to run from a script.
Don't --- store a copy of the MBR/partition table somewhere
with a command like:
dd if=/dev/hda of=/somewhere/somefile bs=512 count=1
... and, when you need to restore it use:
dd of=/dev/hda if=/somewhere/somefile bs=512 count=1
(Then run /sbin/lilo later, after your mkfs loop and
tar extractions).
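Putting those pieces together, a single-floppy restore script
might look something like this sketch.  All device names and
archive paths are examples; with DOIT unset it only prints what
it would do:

```shell
#!/bin/sh
# Sketch of a floppy-based auto-restore; device names and paths
# are examples.  Set DOIT=yes to run the destructive commands.
run() { if [ "$DOIT" = yes ]; then "$@"; else echo "would run: $*"; fi; }

# put back the saved MBR/partition table first:
run dd of=/dev/hda if=/spare/mbr.img bs=512 count=1
run mke2fs /dev/hda1                # rebuild the root filesystem
run mkswap /dev/hda2                # re-initialize swap
run mount /dev/hda1 /mnt
umask 000                           # as noted above: set before untarring
run sh -c 'cd /mnt && tar xzpf /spare/rootfs.tar.gz'
run chroot /mnt /sbin/lilo          # rebuild the boot maps last
```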
'fdisk' is just a fancy front-end for creating the partition
table (the last "mumble" bytes of the MBR on each drive) and
/sbin/lilo is a tool for building and installing boot
records (the first "mumble mumble" bytes of the MBR on the
primary boot drive). /sbin/lilo also builds some mappings
which it stores in a second stage bootloader, somewhere.
In one of my recent Linux Gazette Answer Guy columns I
described an "AutoRecover" configuration. This involves
having two bootable (root) Linux filesystems, configuring
one as an auto-recovery system and the other as your
production mode system. The LILO default boot stanza would
point at the recovery system. A normal shutdown uses the
-R switch to /sbin/lilo to impose a "one boot over-ride" of
that default.
The idea is that normal shutdowns boot into the production
system, while any power cycle, panic (use the panic= kernel
directive to set a finite timeout on those), or watchdog
reset will result in an abnormal shutdown (will *fail* to
call '/sbin/lilo -R production') which will then boot on the
AutoRecover partition.
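A minimal lilo.conf for such a setup might look like this
sketch; the labels, kernel paths, and partitions are all
illustrative:

```shell
# /etc/lilo.conf (sketch) -- labels, kernel paths, and partitions
# are examples.  'default=recovery' makes any unplanned reboot
# land in the auto-recovery system.
boot=/dev/hda
default=recovery

image=/boot/vmlinuz-recovery
        label=recovery
        root=/dev/hda2          # auto-recovery root fs
        read-only

image=/boot/vmlinuz
        label=production
        root=/dev/hda1          # production root fs
        read-only
```

A normal shutdown script would then run '/sbin/lilo -R production'
just before halting, so only a clean shutdown gets the one-shot
override.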
(The exact details of how the autorecover scripts attempt
to diagnose and/or recover the other fs are left to the
implementer --- I'm just describing the concept here. The
simplest model is: we blow everything away and restore
from our {extra partitions, drives, network recovery file
server, CD, or whatever}).
After autorecovery we boot into production mode.
It should be possible for a cron job in production mode to
periodically verify the integrity of the autorecovery
partitions (pipe whole partitions through md5sum if you
have to --- that's a *READ-ONLY* test). Other production
mode cron jobs could update your backups. (You could have a
multi-generation scheme where a "commit" phase writes a
checksum file to a common place. Create a "common"
filesystem for use by production and auto-recover subsystem
to exchange these sorts of "states").
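The read-only test could be as simple as this sketch; the device
and checksum-file paths in the cron example are assumptions:

```shell
#!/bin/sh
# Read-only integrity test (sketch): pipe a whole partition
# through md5sum and compare against the checksum saved at
# "commit" time.  Nothing here writes to the device.
check_part() {   # usage: check_part <device> <stored-md5sum-file>
    now=$(dd if="$1" bs=64k 2>/dev/null | md5sum | awk '{print $1}')
    [ "$now" = "$(awk '{print $1}' "$2")" ]
}

# From a production-mode crontab, something like (paths are examples):
#   0 4 * * * check_part /dev/hda2 /common/recovery.md5 || \
#             echo "recovery fs checksum mismatch" | mail -s alert root
```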
You could use this "common" fs for your actual .tar files
--- though they'd be constrained to the 2GB limit that the
32-bit Linux kernels currently impose on individual file
sizes. If you use a raw partition, you can exceed that
limit (since the Linux kernels support "long long seek"
calls, which mkfs uses to create *filesystems*).
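The raw-partition backup is just a pipe.  On a real box RAW
would be a spare partition (e.g. /dev/hdb1); in this sketch it
defaults to a scratch file so it is harmless to run as-is:

```shell
#!/bin/sh
# Sketch: stream the backup straight onto a raw partition to
# dodge the 2GB per-file limit.  RAW would really be a spare
# partition (e.g. /dev/hdb1); a scratch file stands in here.
RAW=${RAW:-/tmp/raw-demo.img}

mkdir -p /tmp/raw-demo.src
echo "some payload" > /tmp/raw-demo.src/file

# backup: tar stream -> raw device (no intermediate file, no 2GB cap)
tar czf - -C /tmp raw-demo.src | dd of="$RAW" bs=64k 2>/dev/null

# restore: raw device -> tar stream
mkdir -p /tmp/raw-demo.out
dd if="$RAW" bs=64k 2>/dev/null | tar xzf - -C /tmp/raw-demo.out
```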
The big limitation of my AutoRecovery configuration is that
the common PC BIOS (and most workstation firmware PROMs)
is *STUPID*. Rather than storing just the parameters of
my disks and floppies it *should* store a list of boot
device/block address pairs with a list of corresponding
*checksums*. The firmware bootloader code should "look
before it leaps" and check the checksums (I'm paranoid, so I
say it does *dual* checksum checks --- independent 16 and 32
bit CRCs). Then the firmware could boot from the (nth?)
"boot entry" on its list for which the checksum passes.
The advantages to such a system are obvious. First we don't
have to throw away a whole hard disk just because one sector
on track zero is bad. Second it's much more robust against
boot sector viruses (although the security of the NVRAM or
CMOS registers is a different stupidity that should also be
addressed). Third it allows for some arbitrary number of
alternative boots in case of boot record/partition table
corruption. (So, firmware authors --- take note: make
one of these for a few existing PC's and I'll buy some).
I described this on the Linux-kernel list and added that
we should also then extend the idea --- to cascade our
checksums. So the boot record stores lists and checksums
of second stage bootloader(s), which store lists and
checksums of valid kernels, which store lists and checksums
of valid copies of 'init' (and/or alternatives thereof). A
modified 'init' could look for an /etc/inittab.sums file
and check each of the binaries referred to by its entries
before executing them.
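No stock 'init' does any of this, but the check itself is
trivial.  A sketch of the proposed scheme --- the sums-file
location defaults to a scratch path here, and the sulogin
fallback is illustrative:

```shell
#!/bin/sh
# Sketch of the proposed /etc/inittab.sums check -- this is not
# a feature of any stock init; file locations are examples.
SUMS=${SUMS:-/tmp/inittab.sums}

# At "commit" time, record checksums of the binaries that the
# inittab entries refer to (just /bin/sh in this demo):
md5sum /bin/sh > "$SUMS"

# A checking 'init' would verify each binary before exec'ing it:
if md5sum -c "$SUMS" >/dev/null 2>&1; then
    echo "inittab binaries verified"
else
    echo "CHECKSUM FAILURE"     # e.g. exec /sbin/sulogin instead
fi
```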
As you go down the chain you can use more sophisticated MD5
and SHA-1 checksums. You could also design it so that a
failure at any stage can invoke a reboot on the next
NVRAM/CMOS boot entry. You can end up with an extremely
robust and secure trusted path from firmware to 'login' ---
with only minimal changes to the overall code and
architectures. (This idea is basically portable --- it
could be used on Apple PowerMacs, Sun SPARC workstations,
Compaq/DEC Alphas, etc., just about as easily as PC's).
>> It's also pretty easy to implement and use custom
>> filesystems for Linux. You have plenty of examples to work
>> from.
>
> For you, perhaps... Still there are ROMFS type things.
> -- cary
Actually I'm not a programmer --- so my claim that it is
easy is based purely on the observation that many of them
exist for Linux --- and most have been done by individuals
and small groups in pursuit of their hobby.
It would not be easy for *me* --- but it obviously is easy
enough that a number of people have done it for fun. The
fact that several filesystems are sitting around under BSD
and GPL licenses also suggests that the task can't be too
difficult, if your requisite filesystem can be derived from
an existing one.
--
Jim Dennis (800) 938-4078 [EMAIL PROTECTED]
Proprietor, Starshine Technical Services: http://www.starshine.org