Re: regression (crash) in sysmon/acpiacad

2010-02-07 Thread Paul Goyette

On Sun, 7 Feb 2010, Jukka Ruohonen wrote:


On Sun, Feb 07, 2010 at 02:24:37PM +0100, Joerg Sonnenberger wrote:

The point is that they have specific meaning for the hardware and as I
said, are generally not changeable. For example, on my laptop, the power
LED turns orange when warning cap is reached and shuts down hard on
critical.


You can already alter the low and warning capacity, as documented in
envsys.conf(5). I am not arguing that it would be a good idea, quite the
contrary.


You can't necessarily change the values that the hardware monitors, but 
you can set the limits that are used to control the sending of event 
notifications to powerd.  (A sensor can also set the ENVSYS_MONNOTSUPP 
flag to disable changing those limits.)



If sme_set_limits() is omitted but sme_get_limits() is introduced, the
visible change would merely imply that from

$ envstat -d acpibat0
 Current  CritMax  WarnMax  WarnMin  CritMin Unit
 warn cap: 2.511 Wh ( 5.00%)
  low cap: 0.200 Wh ( 0.40%)
   charge:50.230 Wh (100.00%)

we would move to

$ envstat -d acpibat0 | grep cap
   Current  CritMax  WarnMax  WarnMin  CritMin Unit
...
 charge:50.230   2.5110.200   Wh (100.00%)


At least currently, sensors with the ENVSYS_FPERCENT flag display all of 
their limit values as percentages.  So the output would be more like


   charge:50.230 5.000%   0.400%   Wh (100.00%)


Similar thing is already done in acpitz(4), sdtemp(4), and ipmi(4). The
argumentation in favor of the percentages goes in similar vein.

As I understand it, these callbacks were introduced specifically for those
sensors that are capable of checking the limits in hardware. Supposedly
this should be the case here.


Yes, the primary rationale for these callbacks is to provide a means for 
userland to obtain (and possibly modify) the values used by the hardware 
device to raise alarms.  At the time I was writing the sdtemp(4) driver, 
it seemed silly to always compare current-vs-limit in software when the 
chip was more than happy to do that task for us, if we would only set a 
threshold value for it to use!



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-06 Thread Jukka Ruohonen
On Sun, Feb 07, 2010 at 08:30:27AM +0100, Joerg Sonnenberger wrote:
 On Sun, Feb 07, 2010 at 09:04:54AM +0200, Jukka Ruohonen wrote:
  * The following sensors should be removed: technology,
low capacity, and warning capacity. These are not really
something that should be sensed.
 
 Technology ok. I'm not too sure about low and warning, given that they
 normally can't be modified.

The idea here would be to use the sme_get_limits() and possibly
sme_set_limits(). This is exactly the rationale behind those callbacks.

This would also result a nicer output in envstat(8).

 
  * The design capacity should be the maximum of the last known full
charge capacity, which is the maximum of the present capacity. 
This is useful for checking the overall health of deteriorating
(lithium-ion) batteries.
 
 I disagree. Both batteries for my laptop had initially a higher capacity
 than designed for -- e.g. last full and design cap don't necessarily
 agree with each other.

I noticed the same thing with voltages. Yet, what is wrong with envstat(8)
or some other tool reporting last full charge capacity is 123 % of the
design capacity?

  * Sensors that have a maximum should report also percentages in
relation to these maximums. From the usability point of view, this
is probably almost always the right choice.
 
 That should be a task for userland, not the kernel.

It already is; in acpibat(4) this just implies setting the ENVSYS_FPERCENT
flag, nothing more.

- Jukka.


Re: regression (crash) in sysmon/acpiacad

2010-02-05 Thread Paul Goyette

On Fri, 5 Feb 2010, Matthias Drochner wrote:



p...@whooppee.com said:

Regarding the earlier comment about your battery charge level not
changing...  Is this still true?


Just checked again and found that the charge level is
updated now, but only every 30s. (The code in acpibat_refresh()
suggests that cached values are kept only for 5s, and I
believe that is how it did behave formerly.)


The 30-second timer is the default poll interval for sysmon_envsys, so 
we're calling the refresh routine on that basis.  I'm not totally up to 
speed on the acpibat internal stuff, so I don't know if it should be 
getting updated more frequently.



I definitely watched the battery state for a couple of
minutes yesterday, so something else must have happened.

Looking at acpibat_refresh()... the 5-seconds check uses
microtime() which is not necessarily monotonic. And, iirc,
I had booted Windows before, which sets the RTC to GMT+1,
so I had to force the clock back by a manual ntpdate.
This might have killed status updating.
Well, just a hypothesis, but I'd suggest to use
getmicrouptime() for these checks.


Yeah, good point.


-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-04 Thread Paul Goyette

On Thu, 4 Feb 2010, Matthias Drochner wrote:



Hi -

my laptop (running -current) now crashes as soon as the
battery reaches warning level.
This is due to a jump to NULL in sysmon_envsys_events.c,
line 891 - an indirect call to sme-sme_refresh.
This is indeed 0 for acpiacad now, since the refresh
method was removed recently.
How is this intended to work?


Ooops.

The refresh routine was removed because we're now relying on the notify 
code to actually work.  But unfortunately we forgot to set the flag to 
avoid the call-back.


Either Jukka or I will get this fixed up shortly.


-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-04 Thread Paul Goyette

On Thu, 4 Feb 2010, Paul Goyette wrote:


On Thu, 4 Feb 2010, Matthias Drochner wrote:

my laptop (running -current) now crashes as soon as the
battery reaches warning level.
This is due to a jump to NULL in sysmon_envsys_events.c,
line 891 - an indirect call to sme-sme_refresh.
This is indeed 0 for acpiacad now, since the refresh
method was removed recently.
How is this intended to work?


Ooops.

The refresh routine was removed because we're now relying on the notify code 
to actually work.  But unfortunately we forgot to set the flag to avoid the 
call-back.


Either Jukka or I will get this fixed up shortly.


Actually, Jukka _did_ set the SME_DISABLE_REFRESH flag.  Unfortunately, 
it was not being checked in one of the call-backs.  I have just added 
the missing check.


Please update to rev 1.78 of src/sys/dev/sysmon/sysmon_envsys_events.c 
and try again.  Thanks for finding this, and please accept my apologies 
for the breakage.




-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-04 Thread Paul Goyette

On Thu, 4 Feb 2010, Matthias Drochner wrote:



p...@whooppee.com said:

Please update to rev 1.78 of src/sys/dev/sysmon/sysmon_envsys_events.c
 and try again.


Thanks -- I just managed to run low on battery with a new kernel.
I can confirm that the problem is fixed, the box runs well until
graceful shutdown by the sensor_battery script.

But -- since my battery was low anyway I did another test, whether
it behaves well if the userland powerd is not running. It does not.
I'd expect it to hit the cpu_reboot() call in sysmon_penvsys_event()
eventually, but it did not. It didn't even update the charge
value visible in envstat(8) -- this stayed at 6.00% (this is where
I killed powerd) even when the red battery LED on my (Dell)
laptop started to light permanently, which happens below 3% (the
low cap limit).
This used to work, so there is another regression.


OK, we'll look into it.  Since the charge value was not updating, it 
might be that the ACPI Notify isn't working here.  Perhaps we should not 
rely on this, and put the refresh() routine back...  Since I don't have 
any battery-powered devices, I'll defer to Jukka to see what we can do.



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-04 Thread Matthias Drochner

p...@whooppee.com said:
 Since the charge value was not updating, it  might be that the ACPI
 Notify isn't working here.

For the critical shutdown, a call to _BTP might help.
But anyway, from my limited experience with process control
(SCADA) systems, it makes sense to maintain a timestamp
for the last data value read (or delivered by asynchronous
notification) and force a fresh read if it is older than
a limit defined by the provider (and possibly overridden
by the consumer).

best regards
Matthias





Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzende des Aufsichtsrats: MinDir'in Baerbel Brumme-Bothe
Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender),
Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
Prof. Dr. Sebastian M. Schmidt




Re: regression (crash) in sysmon/acpiacad

2010-02-04 Thread Jukka Ruohonen
On Thu, Feb 04, 2010 at 10:15:03PM +0100, Matthias Drochner wrote:
 
 p...@whooppee.com said:
  Since the charge value was not updating, it  might be that the ACPI
  Notify isn't working here.

Since this involved running on battery power, I doubt it is about the
removal of the refresh routine in acpiacad(4). If the sensor value changes
when one plugs/unplugs the AC, it is easily verified to be working.

 For the critical shutdown, a call to _BTP might help.

The _BTP is just a custom warning trip-point that triggers a Notify once
reached. It is probably there to provide user space applications some
control over the limits, and to possibly avoid polling of the values.

Note though that nothing has changed in acpibat(4) with regards to the
refresh routine or the sensors generally.

 But anyway, from my limited experience with process control
 (SCADA) systems, it makes sense to maintain a timestamp
 for the last data value read (or delivered by asynchronous
 notification) and force a fresh read if it is older than
 a limit defined by the provider (and possibly overridden
 by the consumer).

Something like is already done in acpibat(4).

- Jukka.