On Fri, 28 Jul 2006, Daniel Richard G. wrote:
> On Fri, 2006 Jul 28 16:12:35 -0300, Henrique de Moraes Holschuh wrote:
> > There is no tradeoff without the hack, and the hack is only needed in
> > hardware unsuitable for UPS management.  Thus, it must be optional.  It is
> > dangerous to data and the hardware, so it should not be the default.
> 
> Define "(un)suitable for UPS management." Does this definition include
> most people's desktop systems?

Suitable for UPS management:
        Load:
                Powers up when AC returns
                Can be told by the UPS that it must shut down
                        (through NUT).
        UPS:
                Does a delayed load shutdown upon a shutdown command.
                Does not power up the load before it has enough charge
                        to do a delayed shutdown, plus a safety margin.
                Always power-cycles the load after a shutdown command is
                        ACK'ed to the controlling host, even if AC
                        returns and it doesn't need to shut down anymore.
                Notifies the host when battery charge is below a
                        certain threshold, so that it can shut down safely.
                Powers up the load if the batteries have enough charge,
                        and an AC cycle happens while the load is offline.
                Powers up the load after a timer expires, if no AC cycles
                        happen AND the load was brought offline by an explicit
                        delayed shutdown command.

Anything else is unsuitable.  Any PC97 desktop should be suitable for proper
UPS management.  And just FYI, PC97 requires WoL on all ethernet devices,
not that you need WoL for a proper UPS setup, but you somehow got the idea
that WoL was a server-grade feature...
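
For illustration only, the checklist above can be written down as a small
predicate.  This is a rough Python sketch; the class and field names are made
up for this example and are not part of NUT or of any driver.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class UpsSetup:
    # Load (the computer); all names here are hypothetical
    load_powers_up_on_ac_return: bool
    load_can_be_told_to_shut_down: bool
    # UPS
    ups_delayed_load_shutdown: bool
    ups_waits_for_charge_before_power_up: bool
    ups_always_power_cycles_after_ack: bool
    ups_reports_low_battery: bool
    ups_powers_up_on_ac_cycle_when_charged: bool
    ups_powers_up_after_timer: bool


def unmet_requirements(setup):
    """Return the names of the checklist items this setup fails."""
    return [f.name for f in fields(setup) if not getattr(setup, f.name)]


def is_suitable(setup):
    """A setup is suitable only if every single item holds."""
    return not unmet_requirements(setup)


# A setup that meets everything, and one whose BIOS cannot
# power on when AC returns.
good = UpsSetup(*([True] * 8))
bad = replace(good, load_powers_up_on_ac_return=False)
print(is_suitable(good), unmet_requirements(bad))
```

The point of the "all items" rule is that one missing property is enough to
make the setup unsuitable.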

> > You have transient responses to power cuts.  Watch on an oscilloscope:
> > computer hardware is not a resistive load.
> 
> No, but any decent power supply will present a load pretty close to it, 

Only ones with PFC. 

> production server-room environments.) If someone's got a rack setup where a 
> UPS power cutoff will fry everything, they've got a much bigger problem 
> than what we're discussing here.

Yes.

> number of machines connected, but large numbers of machines connected are 
> not exactly a typical scenario.

No, but your hard-drive doing emergency unloads is a typical scenario, and
desktop HDs don't like those unloads *at* *all*.  Do not do it (and as I
already said, the only proper way to know the HD heads are unloaded requires
kernel cooperation, and it is NOT done by userspace currently). 

I know you were under the mistaken impression that we could guarantee all
HD heads were unloaded in userspace, and before halt runs.  We not only
cannot do it, we also do not *attempt* to do it.  The only thing in Debian
initscripts that really tries to take care of HD head unloads is the halt
command.

You can, of course, try to make sure hdparm was run and actually unloaded all
heads for your particular configuration, but it is not an acceptable
default, because we cannot get it right every time.  So implement it as an
admin-enabled, admin-configured option by all means.  But *not* as a
default.
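
A minimal sketch of what "admin-enabled, admin-configured" could mean in
practice, in Python for illustration.  The key names
SPIN_DOWN_BEFORE_POWEROFF and SPIN_DOWN_DISKS are hypothetical, as is the
idea of reading them from a KEY=value defaults file; nothing here is an
existing NUT or Debian setting.

```python
def parse_defaults(text):
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def spin_down_enabled(settings):
    # Hypothetical key.  Anything other than an explicit "yes" keeps it off,
    # so a missing or mistyped setting falls back to the safe behaviour.
    return settings.get("SPIN_DOWN_BEFORE_POWEROFF", "no").lower() == "yes"


def disks_to_spin_down(settings):
    """Return the admin-listed disks, or nothing if the option is off."""
    if not spin_down_enabled(settings):
        return []
    return settings.get("SPIN_DOWN_DISKS", "").split()


example = 'SPIN_DOWN_BEFORE_POWEROFF=yes\nSPIN_DOWN_DISKS="/dev/hda /dev/hdb"\n'
print(disks_to_spin_down(parse_defaults(example)))
```

The admin both turns the option on and names the disks, so nothing is
guessed for configurations where hdparm would not work.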

> > > All of which can be done (and already is, I believe). The only thing that 
> > > the system is doing while waiting for poweroff is "sleep 15m; 
> > > reboot"---no 
> > > disks need to be spinning for that.
> > 
> > If you did not call halt, plus told the kernel to shutdown the devices, no,
> > it was *not* done.
> > 
> > And the kernel is the *only* thing that really knows how to properly
> > powerdown the devices.  Currently, we cannot ask it to do so from userspace
> > easily, and if we did, we could not access the disks anymore for example.
> 
> We have "hdparm -Y". We can't access the disk after that, but we shouldn't 
> need to. What more shutdown magic do you need on a hard disk that is not 
> spinning?

None, if the disk spun down.  But hdparm doesn't work for all disks, and we
cannot reliably spin down all disks and unload heads from userspace for all
possible configurations.  Thus, anything that relies on this cannot be made
a default.

> If you're talking about a flaky hardware RAID array where you can't stop 

SCSI plus all software RAID arrays.

> > The issue is how the initscript behaves if the NUT shutdown command doesn't
> > kill everything to kingdom come in 5 seconds.  In fact, a proper UPS is
> > going to be programmed to actually *delay* the powerdown load command for
> > enough time to allow the load to try to powerdown for real by itself.
> 
> Assuming things are as I had in my patch, the idea is to have all machines 
> connected to a given UPS configured with a similar wait-until-poweroff- 
> else-reboot time (if they don't shutdown straightaway).

The bad thing in your patch is that the maintainer made it non-optional, and
the default.  I understand it will not be a default anymore, which is enough
for me.
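
On the timing point quoted above, an admin who does enable the
wait-then-reboot behaviour might sanity-check the numbers roughly like this.
It is only a sketch: the names are invented, the 30 second margin is an
arbitrary assumption, and the second check (the reboot fallback should not
fire before the UPS cuts power) is my reading of the intent rather than
something NUT enforces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShutdownTiming:
    host_shutdown_s: int  # time the host needs to shut down cleanly
    ups_off_delay_s: int  # delay between shutdown command and load power-off
    reboot_wait_s: int    # host-side "wait, then reboot" fallback


def timing_problems(t, margin_s=30):
    """Return a list of human-readable problems; empty means it looks sane."""
    problems = []
    if t.ups_off_delay_s < t.host_shutdown_s + margin_s:
        problems.append("UPS cuts power before the host can finish shutting down")
    if t.reboot_wait_s <= t.ups_off_delay_s:
        problems.append("host would reboot before the UPS cuts power")
    return problems


# 60 s to shut down, UPS waits 120 s, fallback reboot after 15 minutes.
print(timing_problems(ShutdownTiming(60, 120, 900)))
```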

> Anyway, the disagreement comes down to this:
> 
> Me: Keep the system minimally running, so that it powers off when the UPS 
> cuts the power, so that it will turn on again when the power returns, given 
> the default behavior and limitations of PC hardware. Do sensible steps to 
> avoid data loss (stop the disks, etc.). Have this be the default, as PC 
> users are the common case.
> 
> You: Do a normal system shutdown. Rely on server-grade features (e.g. WOL 

No.  Me:  make the whole behaviour you want *optional*, and not the default,
because it is dangerous and we don't have a lick of a chance of making it
safe for all setups.

> packet from a networked UPS) to resume operation, or an "On/Off state: ON"

No.  Rely on the standard PC97 ACPI desktop BIOS option "always power on on AC
return", which is the correct way to deal with machines that need to restart
when a UPS powers them up again.

> BIOS setting (despite the problems associated with that). Have this be the 
> default, as the risk of data loss from fragile storage media trumps that of 
> system unavailability after an extended outage.

No.  This is a local decision made by a local admin.  It cannot be a default
setting for Debian.  The Debian default must be the *safest* choice we have.

> Mr. Quette will have to decide this, but I don't think you've made a strong 
> case for a power-cut being significantly detrimental to data or hardware. 

I have not seen you make a case at all for a *default* behaviour.  You don't
need it to be default, you just need it to exist.

> I'm getting the impression that "hardware that cannot do it properly," as 
> you mean it, includes most PCs and non-server machines. Your view carries 
> the day if NUT's userbase is not mostly these.

It has been at least five years since I last saw a desktop PC that is
incapable of "always power on", with the exception of some laptops.  I am
not buying your assertion that most desktop PCs cannot do it properly, but
even if this were true, it would still be a dangerous, unacceptable default
behaviour for NUT to do what you proposed.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

