Re: regression (crash) in sysmon/acpiacad
On Sun, 7 Feb 2010, Jukka Ruohonen wrote: On Sun, Feb 07, 2010 at 02:24:37PM +0100, Joerg Sonnenberger wrote: The point is that they have specific meaning for the hardware and as I said, are generally not changeable. For example, on my laptop, the power LED turns orange when warning cap is reached and shuts down hard on critical. You can already alter the low and warning capacity, as documented in envsys.conf(5). I am not arguing that it would be a good idea, quite the contrary. You can't necessarily change the values that the hardware monitors, but you can set the limits that are used to control the sending of event notifications to powerd. (A sensor can also set the ENVSYS_MONNOTSUPP flag to disable changing those limits.) If sme_set_limits() is omitted but sme_get_limits() is introduced, the visible change would merely imply that from $ envstat -d acpibat0 Current CritMax WarnMax WarnMin CritMin Unit warn cap: 2.511 Wh ( 5.00%) low cap: 0.200 Wh ( 0.40%) charge:50.230 Wh (100.00%) we would move to $ envstat -d acpibat0 | grep cap Current CritMax WarnMax WarnMin CritMin Unit ... charge:50.230 2.5110.200 Wh (100.00%) At least currently, sensors with the ENVSYS_FPERCENT flag display all of their limit values as percentages. So the output would be more like charge:50.230 5.000% 0.400% Wh (100.00%) Similar thing is already done in acpitz(4), sdtemp(4), and ipmi(4). The argumentation in favor of the percentages goes in similar vein. As I understand it, these callbacks were introduced specifically for those sensors that are capable of checking the limits in hardware. Supposedly this should be the case here. Yes, the primary rationale for these callbacks is to provide a means for userland to obtain (and possibly modify) the values used by the hardware device to raise alarms. At the time I was writing the sdtemp(4) driver, it seemed silly to always compare current-vs-limit in software when the chip was more than happy to do that task for us, if we would only set a threshold value for it to use! - | Paul Goyette | PGP DSS Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com | | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: regression (crash) in sysmon/acpiacad
On Sun, Feb 07, 2010 at 08:30:27AM +0100, Joerg Sonnenberger wrote: On Sun, Feb 07, 2010 at 09:04:54AM +0200, Jukka Ruohonen wrote: * The following sensors should be removed: technology, low capacity, and warning capacity. These are not really something that should be sensed. Technology ok. I'm not too sure about low and warning, given that they normally can't be modified. The idea here would be to use the sme_get_limits() and possibly sme_set_limits(). This is exactly the rationale behind those callbacks. This would also result a nicer output in envstat(8). * The design capacity should be the maximum of the last known full charge capacity, which is the maximum of the present capacity. This is useful for checking the overall health of deteriorating (lithium-ion) batteries. I disagree. Both batteries for my laptop had initially a higher capacity than designed for -- e.g. last full and design cap don't necessarily agree with each other. I noticed the same thing with voltages. Yet, what is wrong with envstat(8) or some other tool reporting last full charge capacity is 123 % of the design capacity? * Sensors that have a maximum should report also percentages in relation to these maximums. From the usability point of view, this is probably almost always the right choice. That should be a task for userland, not the kernel. It already is; in acpibat(4) this just implies setting the ENVSYS_FPERCENT flag, nothing more. - Jukka.
Re: regression (crash) in sysmon/acpiacad
On Fri, 5 Feb 2010, Matthias Drochner wrote: p...@whooppee.com said: Regarding the earlier comment about your battery charge level not changing... Is this still true? Just checked again and found that the charge level is updated now, but only every 30s. (The code in acpibat_refresh() suggests that cached values are kept only for 5s, and I believe that is how it did behave formerly.) The 30-second timer is the default poll interval for sysmon_envsys, so we're calling the refresh routine on that basis. I'm not totally up to speed on the acpibat internal stuff, so I don't know if it should be getting updated more frequently. I definitely watched the battery state for a couple of minutes yesterday, so something else must have happened. Looking at acpibat_refresh()... the 5-seconds check uses microtime() which is not necessarily monotonic. And, iirc, I had booted Windows before, which sets the RTC to GMT+1, so I had to force the clock back by a manual ntpdate. This might have killed status updating. Well, just a hypothesis, but I'd suggest to use getmicrouptime() for these checks. Yeah, good point. - | Paul Goyette | PGP DSS Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com | | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: regression (crash) in sysmon/acpiacad
On Thu, 4 Feb 2010, Matthias Drochner wrote: Hi - my laptop (running -current) now crashes as soon as the battery reaches warning level. This is due to a jump to NULL in sysmon_envsys_events.c, line 891 - an indirect call to sme-sme_refresh. This is indeed 0 for acpiacad now, since the refresh method was removed recently. How is this intended to work? Ooops. The refresh routine was removed because we're now relying on the notify code to actually work. But unfortunately we forgot to set the flag to avoid the call-back. Either Jukka or I will get this fixed up shortly. - | Paul Goyette | PGP DSS Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com | | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: regression (crash) in sysmon/acpiacad
On Thu, 4 Feb 2010, Paul Goyette wrote: On Thu, 4 Feb 2010, Matthias Drochner wrote: my laptop (running -current) now crashes as soon as the battery reaches warning level. This is due to a jump to NULL in sysmon_envsys_events.c, line 891 - an indirect call to sme-sme_refresh. This is indeed 0 for acpiacad now, since the refresh method was removed recently. How is this intended to work? Ooops. The refresh routine was removed because we're now relying on the notify code to actually work. But unfortunately we forgot to set the flag to avoid the call-back. Either Jukka or I will get this fixed up shortly. Actually, Jukka _did_ set the SME_DISABLE_REFRESH flag. Unfortunately, it was not being checked in one of the call-backs. I have just added the missing check. Please update to rev 1.78 of src/sys/dev/sysmon/sysmon_envsys_events.c and try again. Thanks for finding this, and please accept my apologies for the breakage. - | Paul Goyette | PGP DSS Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com | | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: regression (crash) in sysmon/acpiacad
On Thu, 4 Feb 2010, Matthias Drochner wrote: p...@whooppee.com said: Please update to rev 1.78 of src/sys/dev/sysmon/sysmon_envsys_events.c and try again. Thanks -- I just managed to run low on battery with a new kernel. I can confirm that the problem is fixed, the box runs well until graceful shutdown by the sensor_battery script. But -- since my battery was low anyway I did another test, whether it behaves well if the userland powerd is not running. It does not. I'd expect it to hit the cpu_reboot() call in sysmon_penvsys_event() eventually, but it did not. It didn't even update the charge value visible in envstat(8) -- this stayed at 6.00% (this is where I killed powerd) even when the red battery LED on my (Dell) laptop started to light permanently, which happens below 3% (the low cap limit). This used to work, so there is another regression. OK, we'll look into it. Since the charge value was not updating, it might be that the ACPI Notify isn't working here. Perhaps we should not rely on this, and put the refresh() routine back... Since I don't have any battery-powered devices, I'll defer to Jukka to see what we can do. - | Paul Goyette | PGP DSS Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com | | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: regression (crash) in sysmon/acpiacad
p...@whooppee.com said: Since the charge value was not updating, it might be that the ACPI Notify isn't working here. For the critical shutdown, a call to _BTP might help. But anyway, from my limited experience with process control (SCADA) systems, it makes sense to maintain a timestamp for the last data value read (or delivered by asynchronous notification) and force a fresh read if it is older than a limit defined by the provider (and possibly overridden by the consumer). best regards Matthias Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzende des Aufsichtsrats: MinDir'in Baerbel Brumme-Bothe Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender), Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt
Re: regression (crash) in sysmon/acpiacad
On Thu, Feb 04, 2010 at 10:15:03PM +0100, Matthias Drochner wrote: p...@whooppee.com said: Since the charge value was not updating, it might be that the ACPI Notify isn't working here. Since this involved running on battery power, I doubt it is about the removal of the refresh routine in acpiacad(4). If the sensor value changes when one plugs/unplugs the AC, it is easily verified to be working. For the critical shutdown, a call to _BTP might help. The _BTP is just a custom warning trip-point that triggers a Notify once reached. It is probably there to provide user space applications some control over the limits, and to possibly avoid polling of the values. Note though that nothing has changed in acpibat(4) with regards to the refresh routine or the sensors generally. But anyway, from my limited experience with process control (SCADA) systems, it makes sense to maintain a timestamp for the last data value read (or delivered by asynchronous notification) and force a fresh read if it is older than a limit defined by the provider (and possibly overridden by the consumer). Something like is already done in acpibat(4). - Jukka.