> From: Richard Elling [mailto:richard.ell...@gmail.com]
> 
> On Apr 11, 2010, at 5:36 AM, Edward Ned Harvey wrote:
> >
> > In the event a pool is faulted, I wish you didn't have to power cycle
> the
> > machine.  Let all the zfs filesystems that are in that pool simply
> > disappear, and when somebody does "zpool status" you can see why.
> 
> In general, I agree. How would you propose handling nested mounts?

Not sure.  What's the present behavior?  I bet at present, if you have good
zfs filesystems mounted as subdirs of a pool which fails ... you're still
forced to power cycle, and afterward I bet they're not mounted anyway.
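
To make the nested-mount case concrete, here's roughly the situation I
mean (pool, dataset, and device names are all made up for illustration):

  # "usbpool" lives on a removable disk; "tank" is healthy and redundant.
  zpool create usbpool c5t0d0                # mounts at /usbpool
  zfs create tank/good
  zfs set mountpoint=/usbpool/good tank/good
  # If usbpool faults, the perfectly good tank/good filesystem is now
  # stranded underneath a dead mountpoint.  What should happen to it?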

The behavior of a system after a pool is faulted is OK to suck, but
hopefully it could suck less than forcing the power cycle.  So if the OS
forcibly unmounts all those filesystems after a pool fault, without forcing
the power cycle, that's an improvement.  Likewise for processes running
binaries that live within those unmounted filesystems, or that merely have
open file handles in the faulted areas: even if you kill all those
processes with -KILL so they die ungracefully, that's *still* an
improvement, because it's better than power cycling.
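
In other words, the kernel would be doing for you roughly what an admin
would try by hand today, if only these commands didn't hang against a
faulted pool.  A sketch, continuing the made-up names from above:

  # Kill anything holding files open in the faulted pool's filesystems:
  fuser -c -k /usbpool/good
  fuser -c -k /usbpool
  # Then force the unmounts, deepest nested filesystems first:
  zfs unmount -f tank/good
  zfs unmount -f usbpool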

Heck, if the faulted pool spontaneously sent the server into an ungraceful
reboot, even *that* would be an improvement.

At least root would be able to (a) log in and (b) do a "zpool status."
Those are two useful things you can't presently do.  Heck, even "reboot" or
"init 6" would be nice additions that are not presently possible.

None of the above matters much as long as you have redundant, reliable
pools, because then faulted pools are rare.  But if you put ZFS onto an
external removable disk ... then it's really easy for that disk to
disappear by accident.  Somebody bumps the power cord, or the USB cable,
etc.  At present, one of my backup strategies is to "zfs send | zfs
receive" onto an external disk (roughly the commands sketched below), and
it's annoyingly common for this to flake out for one reason or another and
force me to power cycle the machine.  I can't do that remotely, because I
can't ssh into the machine (can't even get the login prompt).  It still
responds to ping, and if I happen to have an ssh or vnc session already
open, I can type in some commands, as long as I don't run "zpool" or "zfs"
or "df" or anything else that attempts to access the faulted pool ... but
"reboot" and "init" both fail.
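
For reference, the backup in question is nothing exotic; something along
these lines (snapshot names invented for the example):

  # Snapshot the source pool and replicate it to the external disk:
  zfs snapshot -r tank@backup-20100411
  zfs send -R tank@backup-20100411 | zfs receive -Fd usbpool
  # Bump the USB cable mid-stream and usbpool faults; from then on, any
  # command that touches it ("zpool", "zfs", "df") hangs until power cycle.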
