On Thu, 29 Jun 2023, Robert Elz wrote:
The second issue (the one I started investigating) is that, with the CPU frequency at 3401 (which enables turbo mode and, I assume, actual frequencies up to 5500MHz), the recorded temperatures start creeping upwards while the system is mostly idle and nothing is really changing at all. What's more, the rise seems to follow an exponential curve (positive feedback, perhaps). That is, going from (reported values of) the mid 30s to around 40 as the "resting" state can take many hours, then from 40 to 50 or so takes less time, and once it gets beyond 50 and is approaching 60, it might be just minutes until it reaches Tjmax and powerd (or perhaps the CPU itself) decides to shut things down.

When powerd does it, I sometimes see its broadcast message (but I often don't have a login terminal visible, so often not), and once or twice X has actually shut down and I've seen at least some of the normal shutdown sequence happening on the console. Usually, however, the power is (or seems to be) simply cut abruptly, and everything stops instantly: working and doing things (like typing an e-mail, or whatever) one second, and no power the next.

(And no, it is not an external power issue: the system has a UPS, and in any case, if it lost external power, it would reboot as soon as that returned. This does not do that; it behaves just like "poweroff", but seemingly without the file system unmounting, ... that would normally happen.)
You can set a lower "critical-max" property on the CPU temperature sensors in /etc/envsys.conf to make powerd trigger a shutdown at a lower temperature. Say, 75C?

-RVP
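
A minimal sketch of such an entry, assuming the CPU temperature sensor shows up as sensor0 on a device called coretemp0 (both names are assumptions; run envstat to see the actual device and sensor names on the machine, and repeat the block for each core's device if there are several):

```
# /etc/envsys.conf
# Device and sensor names below are assumed; check envstat output.
coretemp0 {
	sensor0 {
		# Raise a critical event at 75C instead of the default limit.
		critical-max = 75C;
	}
}
```

The file can then be loaded with "envstat -c /etc/envsys.conf". powerd has to be running for the critical event to result in a shutdown.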