On Fri, Sep 22, 2000 at 10:38:25PM +0200, [EMAIL PROTECTED] wrote:
>
> Problem about reading for a couple days is that this implies user's
> job is knowing everything about system administration. This is
> possible if either user is a consultant or user is a system administrator
> in a big company so there are hundred people around user going with
> the task of making money for the company. If company is three or four
> persons or if user is a private individual this kind of "learning
> overhead" is unacceptable (no time left for real work).

There is no perfect solution to this problem, and there never will be.
Imagining that filesystem-stability nirvana exists will only cause
frustration.

No matter what filesystem you have, no matter what hardware you have,
and no matter how well-put-together the distribution, unless you've
got a contract with the universe ensuring that nothing untoward will
happen in the vicinity of your machine, there will always be
possibilities for types of filesystem corruption for which the
standard tools will be insufficient and for types of filesystem
corruption for which even the best of gurus will have little if
any success in recovering data.

Some problems can be avoided by good system design, some problems can
be automatically fixed given a well-designed distribution, some
problems will require manual but easier-to-understand intervention,
some problems will require the intervention of gurus, and some
problems could require tens or hundreds of thousands of dollars for
clean-room data recovery.

To approach a problem that is inherently not perfectly solvable and
simply complain about that fact does no good to anyone. A better
way of looking at things is to see how each player can improve his
situation and the situation of others. For instance:

Filesystem writers and writers of filesystem recovery tools:
Make the filesystem better able to deal with types of
corruption that are reported. Improve error messages
in the recovery tools, perhaps incorporating some amount
of documentation into the tool itself.
(A tool I find useful in my job supporting another Unix
is one that makes a copy of all the filesystem metadata,
so that a filesystem can be ftp'd to a guru and the damage
understood and bugs fixed, without revealing any private
data other than directory structure and file names and
permissions. That might be a useful tool to have for
ext[23] and Reiserfs and it would provide a more direct
way for users to present known problems to the programmers.)
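As a very rough illustration of the privacy idea only: the sketch below records names, permissions and sizes for a directory tree without ever opening a file. It is an assumption-laden stand-in, not the tool described above, which would copy the on-disk filesystem structures themselves. The function name snapshot_metadata is hypothetical.

```python
import os
import stat


def snapshot_metadata(root):
    """List (relative path, permission string, size) for everything under root.

    Only lstat() is used, so file contents are never read.
    """
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            info = os.lstat(full)  # does not follow symlinks
            entries.append(
                (os.path.relpath(full, root), stat.filemode(info.st_mode), info.st_size)
            )
    return sorted(entries)
```

A listing like this could be mailed to someone for a first look, but it says nothing about inodes, block maps or journals, which is what a guru would actually need to diagnose corruption.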

Distribution makers:
Emphasise stability over speed when making suggestions
for filesystem types during install.
Include references to documentation in the root filesystem (!)
when startup scripts drop an admin into a shell for running
fsck manually at boot.
Use the most conservative hdparm settings. Offer the user
a tool to set and test other settings. (Perhaps something
close to a filesystem regression test that can be run in a
temporarily created partition on each hard drive. I for
one wouldn't mind leaving my machine on overnight as these
tests were done.)
Realize that untrained users will be acting as system
administrators. Be sure common failure modes are structured
such that these users, even if they don't fully understand
what's going on, can be at least somewhat informed as to
the basics and how they can get help.
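The overnight test idea for drive settings could be sketched roughly as follows, assuming a scratch directory on the partition under test. The function name write_and_verify and the file name burnin.tmp are made up for illustration. One known weakness: the read-back will likely be served from the kernel's page cache rather than the disk, so a real tool would need to get around the cache and run far longer.

```python
import hashlib
import os
import random


def write_and_verify(scratch_dir, blocks=64, block_size=1 << 20, seed=0):
    """Write pseudo-random blocks to a scratch file, read them back,
    and return True if the data read matches the data written."""
    path = os.path.join(scratch_dir, "burnin.tmp")  # hypothetical file name
    rng = random.Random(seed)
    written = hashlib.sha256()
    try:
        with open(path, "wb") as handle:
            for _ in range(blocks):
                block = rng.getrandbits(8 * block_size).to_bytes(block_size, "little")
                written.update(block)
                handle.write(block)
            handle.flush()
            os.fsync(handle.fileno())  # ask the kernel to push data to the device
        read_back = hashlib.sha256()
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(block_size), b""):
                read_back.update(chunk)
        return written.digest() == read_back.digest()
    finally:
        if os.path.exists(path):
            os.remove(path)
```

An installer could loop over something like this for each candidate setting and keep only the settings that never produce a mismatch.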

System administrators:
Keep good backups. Verify backup integrity. Be sure you
have a recovery plan that can work. Verify backup integrity.
Be sure you have all the information and media required to
quickly recover if need be.
Send a running transaction log to another system if a
day-old restore of your system isn't good enough. (I don't
know very much about those issues, but if you know in advance
that system downtime would mean your company would lose data
and order information, then that's smoking-gun proof that
you need to fix your backup and recovery strategy.)
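A minimal sketch of what "verify backup integrity" might look like, assuming the backup is a plain directory copy of the source tree (not a tar or dump archive): hash every file on both sides and report what is missing or different. The names build_manifest and compare_backup are hypothetical, not part of any existing backup tool.

```python
import hashlib
import os


def build_manifest(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks and special files in this sketch
            digest = hashlib.sha256()
            with open(full, "rb") as handle:
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def compare_backup(source_root, backup_root):
    """Return (missing, changed): files absent from the backup,
    and files present in both whose contents differ."""
    source = build_manifest(source_root)
    backup = build_manifest(backup_root)
    missing = sorted(p for p in source if p not in backup)
    changed = sorted(p for p in source if p in backup and source[p] != backup[p])
    return missing, changed
```

Two empty lists only show that the copy matches the source at the moment of the check. It is no substitute for actually test-restoring from the backup media.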

Businesses:
Realize that downtime is extremely expensive. Be willing
to allocate time and money to investigating and verifying
disaster recovery procedures.

Users:
Realize that you're a system administrator even if you aren't
interested in being one. Reading up on the workings of the
system will improve your chances of recovering it if things
go wrong. If there's little that's important to you on the
system, then it's not such a big deal. If you're not really
into computers but yet have the only copy of your life's work
on the machine and you don't know anyone who could help you
if things go horribly wrong, you may want to find and print
out a list of people to call or companies to go to if the
worst happens.
Don't be fooled by pretty interfaces: eye candy on the surface
does not imply a stable filesystem underneath. Even Macs can
have filesystem problems that require expert help to recover from.
(Okay, the "user" part is the hardest to write, and it's the
most unfair one of all, as it implies more responsibility
than even I think home users should be expected to shoulder
very often, even if they're really responsible in fact.)

I would think dividing up the responsibilities in such a way
as the above when thinking through the issue is a better way
to approach the problem than simply throwing one's hands up in
the air. It will also increase the likelihood of helpful
suggestions (or patches) being sent to the best group.

-Mark Shewmaker
[EMAIL PROTECTED]
_______________________________________________
Redhat-devel-list mailing list
[EMAIL PROTECTED]
https://listman.redhat.com/mailman/listinfo/redhat-devel-list