>>  
>> 
>>      It's pretty rare to corrupt a Linux filesystem by pulling
>>      the plug on it.  There's a mount option for synchronous 
> 
> My observation has been that it is very rare, but not impossible.  We
> routinely cut the power of a large number of Linux boxes.  In three
> years of this, I have twice seen a situation where the default RedHat
> startup fsck sequence couldn't recover things and dumped the system
> into a single-user shell.  The systems were recovered without problems
> from there.  The script can be tweaked to do a fsck -y no matter
> what if you prefer.
> 
> Just remember to run sync (three times for the superstitious) if
> you have just modified a lot of files and expect the power to be
> cut directly afterwards.

        Or use the 'sync' mount option.
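
        For reference, a minimal /etc/fstab line using that option
        might look like this (the device name is just an example):

           # synchronous writes: slower, but far less likely to be
           # caught with dirty buffers when the plug is pulled
           /dev/hda1   /    ext2    defaults,sync    1 1

        ... or, to switch a live filesystem over:

           mount -o remount,sync /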

        Also I would hope that anyone building a turnkey or
        embedded Linux system would roll their own distribution (or
        at least go over the rc* scripts with the proverbial
        fine-toothed comb).

>>      I/O that reduces the chances even more (and exacts a
>>      corresponding cost in fs performance). 
>> 
>>      It's also pretty easy to construct an initrd and have a
>>      script/program that automatically rebuilds a filesystem if
>>      it does get corrupted.  (You can store a 'tar.gz' file on an
>>      unused partition and restore whole filesystems from there,
>>      among many options).
> 
> Exactly. You can fit enough utilities on a single-floppy system
> (see the Bootdisk HOWTO) to mke2fs the active partitions, re-initialize 
> swap, and untar the images from the spare partition.  Remember to
> set the umask to 000 before you untar.
> 
> One note.  I couldn't get fdisk to run from a script.

        Don't --- store a copy of the MBR/partition table somewhere
        with a command like:

           dd if=/dev/hda  of=/somewhere/somefile  bs=512 count=1

        ... and, when you need to restore it, use:

           dd of=/dev/hda  if=/somewhere/somefile  bs=512 count=1

        (Then run /sbin/lilo later, after your mkfs loop and 
        tar extractions).

        'fdisk' is just a fancy front-end for creating the partition
        table (the last "mumble" bytes of the MBR on each drive) and 
        /sbin/lilo is a tool for building and installing boot
        records (the first "mumble mumble" bytes of the MBR on the 
        primary boot drive).  /sbin/lilo also builds a map of the
        kernel's block locations, which its second stage bootloader
        reads at boot time.

        In one of my recent Linux Gazette Answer Guy columns I
        described an "AutoRecover" configuration.  This involves
        having two bootable (root) Linux filesystems, configuring
        one as an auto-recovery system and the other as your
        production mode system.  The LILO default boot stanza would
        point at the recovery system.  A normal shutdown uses the 
        -R switch to /sbin/lilo to impose a "one boot over-ride" of
        that default.  

        The idea is that normal shutdowns boot into the production 
        system, while any power cycle, panic (use the panic= kernel
        directive to set a finite timeout on those), or watchdog
        reset will result in an abnormal shutdown (one that *fails*
        to call '/sbin/lilo -R production') and will therefore boot
        into the AutoRecover partition.

        (The exact details of how the autorecover scripts attempt
        to diagnose and/or recover the other fs are left to the
        implementer --- I'm just describing the concept here.  The
        simplest model is:  we blow everything away and restore
        from our {extra partitions, drives, network recovery file
        server, CD, or whatever}).

        After autorecovery we boot into production mode.
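
        The LILO side of this might look something like the
        following (labels, devices, and the panic timeout are
        examples only):

           # /etc/lilo.conf (fragment)
           default=recovery          # failsafe unless overridden
           append="panic=30"         # reboot 30 sec after any panic

           image=/boot/vmlinuz-recover
              label=recovery
              root=/dev/hda3

           image=/boot/vmlinuz
              label=production
              root=/dev/hda1

        ... with the last step of every *clean* shutdown being:

           /sbin/lilo -R production   # one-shot override of the default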

        It should be possible for a cron job in production mode to 
        periodically verify the integrity of the autorecovery
        partitions (pipe whole partitions through md5sum if you
        have to --- that's a *READ-ONLY* test).  Other production
        mode cron jobs could update your backups.  (You could have a 
        multi-generation scheme where a "commit" phase writes a
        checksum file to a common place.  Create a "common"
        filesystem for use by production and auto-recover subsystem
        to exchange these sorts of "states").
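
        A sketch of such a read-only test (the device, mount point,
        and file names are assumptions):

           # run from a production-mode crontab: reads the recovery
           # partition, never writes to it
           dd if=/dev/hda3 bs=1024k 2>/dev/null | md5sum > /tmp/recover.sum
           cmp -s /tmp/recover.sum /common/recover.sum ||
              mail -s 'recovery fs checksum mismatch' root < /tmp/recover.sum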

        You could use this "common" fs for your actual .tar files
        --- though they'd be constrained by the 2Gb limit that the
        32-bit Linux kernels currently impose on individual file
        sizes.  If you use a raw partition, you can exceed that
        limit (since the Linux kernels support "long long" seek
        calls --- which are what mkfs uses to create *filesystems*
        larger than that).
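
        For example (assuming /dev/hda4 is an unused partition
        large enough to hold the archive):

           # write the archive straight to the raw partition ---
           # no filesystem, so no 2Gb file size limit
           tar czf - /etc /home /usr/local | dd of=/dev/hda4 bs=1024k

           # ... and to read it back later (gzip may grumble about
           # trailing garbage past the end of the archive; the
           # extraction itself is fine):
           dd if=/dev/hda4 bs=1024k | tar xzpf -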

        The big limitation of my AutoRecovery configuration is that
        the common PC BIOS (and most workstation firmware PROMs)
        is *STUPID*.  Rather than storing just the parameters of
        my disks and floppies, it *should* store a list of boot
        device/block address pairs with a list of corresponding
        *checksums*.   The firmware bootloader code should "look
        before it leaps" and check the checksums (I'm paranoid, so I 
        say it does *dual* checksum checks --- independent 16 and 32 
        bit CRC's).  Then the firmware could boot from the (nth?)
        "boot entry" on it's list for which the checksum passes.

        The advantages to such a system are obvious.  First we don't 
        have to throw away a whole hard disk just because one sector 
        on track zero is bad.  Second it's much more robust against
        boot sector viruses (although the security of the NVRAM or
        CMOS registers is a different stupidity that should also be
        addressed).  Third it allows for some arbitrary number of 
        alternative boots in case of boot record/partition table 
        corruption.  (So, firmware authors --- take note:  make 
        one of these for a few existing PC's and I'll buy some).

        I described this on the Linux-kernel list and added that
        we should then extend the idea --- cascade the checksums.
        So the BR stores lists and checksums of the second stage
        bootloader(s), which store lists and checksums of valid
        kernels, which in turn store lists and checksums of valid
        copies of 'init' (and/or alternatives thereof).  A modified
        'init' could look for an /etc/inittab.sums file and check
        each of the binaries referred to by its entries before
        executing them.
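
        The userspace end of that chain could be prototyped today
        with stock tools (the /etc/inittab.sums file is the
        hypothetical one named above):

           # verify every binary listed in /etc/inittab.sums
           # (standard "md5sum  /path/to/binary" lines) and drop to
           # a single-user shell on any mismatch
           md5sum -c /etc/inittab.sums || exec /sbin/sulogin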

        As you go down the chain you can use more sophisticated MD5
        and SHA-1 checksums.  You could also design it so that a
        failure at any stage can invoke a reboot on the next
        NVRAM/CMOS boot entry.  You can end up with an extremely
        robust and secure trusted path from firmware to 'login' ---
        with only minimal changes to the overall code and
        architectures.  (This idea is basically portable --- it
        could be used on Apple PowerMacs, Sun SPARC workstations, 
        Compaq/DEC Alpha's, etc, just about as easily as PC's).

>>      It's also pretty easy to implement and use custom
>>      filesystems for Linux.  You have plenty of examples to work
>>      from.
> 
> For you, perhaps...  Still there are ROMFS type things.
> -- cary

        Actually I'm not a programmer --- so my claim that it is
        easy is based purely on the observation that many of them
        exist for Linux --- and most have been done by individuals
        and small groups in pursuit of their hobby.

        It would not be easy for *me* --- but it obviously is easy
        enough that a number of people have done it for fun.  The
        fact that several filesystems are sitting around under BSD
        and GPL licenses also suggests that the task can't be too
        difficult, at least if your requisite filesystem can be
        derived from an existing one.

--
Jim Dennis  (800) 938-4078              [EMAIL PROTECTED]
Proprietor, Starshine Technical Services:  http://www.starshine.org
