[...]
> Well, minimize. Thing is, without a recovery/rollback/checkpointing
> mechanism, you can't really know whether you've lost something, and/or if
> you've lost something critical. It's like returning from holiday and
> finding your front door broken. You look inside and nothing _seems_ amiss.
> But then, do you remember where Granny left her money jar?
>
> I'd think saying "rely on sync" is the wrong word. It's more like uttering
> a prayer - calms the soul, and won't do no harm, and there are believers
> who will strongly claim it did them good. You don't really _know_, though.
>
> But there's nothing wrong with a good belief, mind you :)

I'd say that if it flushes some buffers that wouldn't otherwise have been flushed, then one is indeed losing less than otherwise. Which is not to say (absent ufs logging, or a zfs intent log that has not been disabled) that there's any confidence that ufs will even be consistent, let alone that any particular state of transactions will be consistent; and forget about any sort of logical consistency at all from the application's point of view. I've got all that. Still, a best effort is not worse than simply abandoning any data that may be in unflushed buffers. It may or may not do a darn bit of good, but in the case in question there was a good chance that it would (and indeed it did, since it looked like some buffers did get flushed, and everything appeared to be OK afterward).

> My experience there is rather that if the 'syncing filesystems ...' part
> works, then the dump will not hang either. They tend to go through the
> same I/O drivers/devices. In fact, the 'syncing ...' part accesses more
> I/O devs than the 'dumping ...' part does (the former goes for everything
> unflushed, while the latter only attempts to get at the dump device). We
> do have some service documents explaining how to get a dump if the box
> hangs during 'syncing filesystems ...' - but none to my knowledge that do
> the opposite.
> The time the dump takes, though, is known to be "high".

Right, it wasn't a question of sync or dump hanging; they both ran, except that (a) the dump took a good 10 minutes, and under the circumstances wasn't needed anyway since the problem was more or less understood, and (b) since VM was exhausted (and RAM was probably larger than swap), the dump was incomplete anyway.

The problem is that one can't know in advance whether, if a situation arises where one can't log in even on the console, a dump would be worth taking or not. Therefore, the system is probably set up to default to dumping to a primary swap partition, since it might be useful.
Only at the time one has to force a "panic 0" does one know whether or not the dump would actually be worth the time it takes.

Another alternative would be if dumps were always interruptible with another L1-A (or break). That might be simpler than a sync option plus kernel support for it, and would probably help in that situation on x86 as well. But if anything, it's been a while since I've run into a case where they _were_ interruptible: years, or perhaps just since I last used a box with a Sun non-USB keyboard. I don't know whether it's an OS version or a hardware configuration that distinguishes the systems where dumps can be interrupted from those where they can't.

This message posted from opensolaris.org