Re: [Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
On 07/03/14 10:23, James Hunt wrote:
> Hi Colin - thanks, yes running thermald has improved the situation
> immensely! I do still very occasionally see overheats, although they are
> extremely rare now and I suspect may be more related to my fans needing
> a clean :-)
>

One can tweak the default thermald config to start throttling back at a lower temperature if required. So it may need some machine-specific modification if it keeps on occurring.

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/955287

Title:
  Ubuntu should handle "hot" CPUs by taking preemptive action and warning users

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/955287/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
Hi Colin - thanks, yes running thermald has improved the situation immensely! I do still very occasionally see overheats, although they are extremely rare now and I suspect may be more related to my fans needing a clean :-)
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
thermald is the de facto solution to this in Trusty+. I've added a wiki page describing how to install and configure this daemon:

https://wiki.ubuntu.com/Kernel/PowerManagement/ThermalIssues

I think this addresses the bug, so I'm going to close it.

** Changed in: linux (Ubuntu)
       Status: Incomplete => Fix Released
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
One can use a thermald helper script to set the CPU max temperature on your machine. The script is attached; one has to specify the max temp in millidegrees C, so for 80 degrees C, enter 80000. Example:

  sudo ./thermald_set_pref.sh
  [sudo] password for king:
  thermald preference
  0 : DEFAULT
  1 : PERFORMANCE
  2 : ENERGY_CONSERVE
  3 : DISABLED
  4 : CALIBRATE
  5 : SET USER DEFINED CPU MAX temp
  6 : TERMINATE
  Enter thermald preference [1..6]: 5
  Enter valid max temp in mill degree celsius 80000

** Attachment added: "thermald management script"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/955287/+attachment/3968543/+files/thermald_set_pref.sh
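For anyone unsure what to type at that prompt: millidegrees Celsius are simply degrees Celsius multiplied by 1000. A minimal Python sketch of the conversion follows; the function name is made up for illustration and is not part of thermald or the attached script.

```python
def celsius_to_millidegrees(celsius):
    """Convert degrees Celsius to millidegrees Celsius (1 C = 1000 mC)."""
    # Hypothetical helper for illustration only; round to the nearest integer
    # because the prompt expects a whole number.
    return int(round(celsius * 1000))


if __name__ == "__main__":
    # Value to enter for a user-defined CPU max temperature of 80 C.
    print(celsius_to_millidegrees(80))
```

The same factor of 1000 applies to the temperatures the kernel reports under /sys/class/thermal.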
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
Hmm, I'm not sure why you need to unload the thinkpad acpi driver. Can you force fan control mode by putting

  options thinkpad_acpi fan_control=1

in a .conf file under /etc/modprobe.d/? This will configure the driver at boot time.

I also suggest enabling the intel-pstate driver. Ubuntu currently carries a patch that turns it off by default; it can be enabled by editing /etc/default/grub and setting:

  GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_pstate=enable"

then run:

  sudo update-grub

and reboot.
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
Hi Colin - thanks, I'm now running thermald but cannot really stress my system due to o/s bug 1268906 (kvm always makes it overheat unless fans are disengaged). That is currently not possible on my T410 btw due to bug 1268880 - will that impact thermald's abilities?
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
@James, I've now packaged up thermald for Trusty, which will do auto throttling if the CPU is too hot. Perhaps you can give that a spin.
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
** Changed in: linux (Ubuntu) Status: In Progress => Incomplete
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
@James, if you can email me the logs you get with tp-thermstat I can still analyze them and spot any dubious looking fan control characteristics.
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
Hi Colin, Since running tp-thermstat, I just cannot make my machine overheat no matter what workloads I give it (with fans set to "auto"). Will keep trying. I have used the disengaged mode and certainly couldn't force a shutdown with that either, although it did sound like I was working in a machine shop when running in that mode. Will try thermal.tzp=10 over the w/e and report back.
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
"However, the kernel seems to emit an ACPI event when it detects the CPU(s) are merely "hot". I suggest we consider adding an acpi hook to attempt to avoid a "critical" scenario." In fact, I suspect the kernel *will* emit a critical thermal event, but the temperature zips past this threshold and the firmware shuts the machine down before the thermal zone handler emits the event. This normally happens if the ACPI _TZP (Thermal Zone Polling) interval is not defined in the firmware, so the kernel seems to fallback to 300 centiseconds, which is way too long a polling interval to spot the over- run and do anything about it. I suggest setting the thermal zone polling interval to say 10 centiseconds using: kernel parameter: thermal.tzp=1 Please try this and see if it does a graceful shutdown with this. As a side note, I'm analyzing the thermal characteristics of some ThinkPads and looking at a way to enable the fan at high speed if we detect high temperatures. Also, it may be worth changing the default "auto" mode to "disengaged" mode. Apparently in disengaged mode the embedded controller does not monitor the fan speed and instead uses an open-loop control function that can ramp the fan up to full speed. The downside is that the fan speed is not stable but it often runs faster than the auto mode. To try this do: sudo modprobe -r thinkpad_acpi modprobe thinkpad_acpi fan_control=1 echo level disengaged | sudo tee /proc/acpi/ibm/fan Please let me know if this helps. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/955287 Title: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/955287/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
oops, made a typo, make the thermal parameter: thermal.tzp=10
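After rebooting, it may be worth confirming that the parameter actually reached the kernel. The sketch below is only an illustration: it parses /proc/cmdline in Python, the helper name kernel_param is made up, and treating the last occurrence as the effective one is an assumption.

```python
from pathlib import Path


def kernel_param(cmdline, name):
    """Return the value of a name=value token on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == name:
            # Assumption: if the parameter is repeated, the last one is used.
            value = val
    return value


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        print("thermal.tzp =", kernel_param(proc_cmdline.read_text(), "thermal.tzp"))
```

If this prints None, the parameter was not passed at boot (for example because update-grub was not run after editing /etc/default/grub).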
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
Assigning to cking, as he has already been investigating numerous heat issues with a number of thinkpad machines. ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Colin King (colin-king) ** Changed in: linux (Ubuntu) Status: Confirmed => In Progress
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
apport information

** Tags added: apport-collected staging

** Description changed:

  If the kernel detects your CPU(s) is/are too hot, currently (see bug 751689) the kernel calls /sbin/poweroff. This will provide a "graceful" system shutdown. If /sbin/poweroff fails, the kernel just forcibly shuts the system down. However, both strategies are last resorts and are called when the system temperature has reached a critical level.

  However, the kernel seems to emit an ACPI event when it detects the CPU(s) are merely "hot". I suggest we consider adding an acpi hook to attempt to avoid a "critical" scenario.

  Currently, the user experience when "critical" gets hit is not good - the system just shuts down with no warning whatsoever. This is alarming in the extreme to users.

  Ideas:

  - proactively attempt to kill off power hogging processes (use powertop?)
  - ramp fans to maximum and present the user with a warning window explaining what is happening.
  - present the user with a window of high-power processes and ask *them* to select the processes they'd like to kill off in an effort to avoid a system shutdown.

  Problems:

  - it is unclear (to me at least) how close (in terms of degrees centigrade) "hot" and "critical" are (is it different for all CPUs?) As such, it is unclear how long (time) it might take for a system that is hot to go critical and just shut down.

  ProblemType: Bug
  DistroRelease: Ubuntu 12.04
  Package: acpi-support 0.140
  ProcVersionSignature: Ubuntu 3.2.0-18.29-generic-pae 3.2.9
  Uname: Linux 3.2.0-18-generic-pae i686
  NonfreeKernelModules: nvidia
  ApportVersion: 1.94.1-0ubuntu2
  Architecture: i386
  Date: Wed Mar 14 17:22:09 2012
  InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release i386 (20101007)
  ProcEnviron:
   TERM=xterm
   PATH=(custom, user)
   LANG=fr_CA.UTF8
   SHELL=/bin/bash
  SourcePackage: acpi-support
  UpgradeStatus: Upgraded to precise on 2012-01-12 (62 days ago)
+ ---
+ AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
+ ApportVersion: 1.94.1-0ubuntu2
+ Architecture: i386
+ ArecordDevices:
+  List of CAPTURE Hardware Devices
+  card 0: Intel [HDA Intel], device 0: CONEXANT Analog [CONEXANT Analog]
+    Subdevices: 1/1
+    Subdevice #0: subdevice #0
+ AudioDevicesInUse:
+  USER PID ACCESS COMMAND
+  /dev/snd/controlC1: james 3788 F pulseaudio
+  /dev/snd/controlC0: james 3788 F pulseaudio
+  james 10267 F alsamixer
+ Card0.Amixer.info:
+  Card hw:0 'Intel'/'HDA Intel at 0xf242 irq 45'
+    Mixer name : 'Conexant CX20585'
+    Components : 'HDA:14f15069,17aa214c,00100302 HDA:14f12c06,17aa2122,0010'
+    Controls : 9
+    Simple ctrls : 6
+ Card1.Amixer.info:
+  Card hw:1 'NVidia'/'HDA NVidia at 0xcdefc000 irq 16'
+    Mixer name : 'Nvidia GPU 0b HDMI/DP'
+    Components : 'HDA:10de000b,10de0101,00100100'
+    Controls : 24
+    Simple ctrls : 4
+ Card29.Amixer.info:
+  Card hw:29 'ThinkPadEC'/'ThinkPad Console Audio Control at EC reg 0x30, fw 6IHT37WW-1.12'
+    Mixer name : 'ThinkPad EC 6IHT37WW-1.12'
+    Components : ''
+    Controls : 1
+    Simple ctrls : 1
+ Card29.Amixer.values:
+  Simple mixer control 'Console',0
+    Capabilities: pswitch pswitch-joined penum
+    Playback channels: Mono
+    Mono: Playback [on]
+ DistroRelease: Ubuntu 12.04
+ HibernationDevice: RESUME=UUID=67e3cd44-242b-4bbf-918b-28fff81e0312
+ InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release i386 (20101007)
+ MachineType: LENOVO 2516CTO
+ NonfreeKernelModules: nvidia
+ Package: linux (not installed)
+ ProcEnviron:
+  TERM=xterm
+  PATH=(custom, user)
+  LANG=fr_CA.UTF8
+  SHELL=/bin/bash
+ ProcFB: 0 VESA VGA
+ ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-18-generic-pae root=UUID=7ad192e9-7b26-49d1-8e1c-fefc7dc495cb ro acpi_sleep=nonvs console=ttyUSB0,115200n8r console=tty quiet splash vt.handoff=7
+ ProcVersionSignature: Ubuntu 3.2.0-18.29-generic-pae 3.2.9
+ RelatedPackageVersions:
+  linux-restricted-modules-3.2.0-18-generic-pae N/A
+  linux-backports-modules-3.2.0-18-generic-pae N/A
+  linux-firmware 1.71
+ StagingDrivers: mei
+ Tags: precise staging
+ Uname: Linux 3.2.0-18-generic-pae i686
+ UpgradeStatus: Upgraded to precise on 2012-01-12 (62 days ago)
+ UserGroups: adm admin cdrom dialout libvirtd lpadmin plugdev sambashare sbuild
+ dmi.bios.date: 08/27/2010
+ dmi.bios.vendor: LENOVO
+ dmi.bios.version: 6IET72WW (1.32 )
+ dmi.board.name: 2516CTO
+ dmi.board.vendor: LENOVO
+ dmi.board.version: Not Available
+ dmi.chassis.asset.tag: No Asset Information
+ dmi.chassis.type: 10
+ dmi.chassis.vendor: LENOVO
+ dmi.chassis.version: Not Available
+ dmi.modalias: dmi:bvnLENOVO:bvr6IET72WW(1.32):bd08/27/2010:svnLENOVO:pn2516CTO:pvrThinkPadT410:rvnLENOVO:rn2516CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
+ dmi.product.name: 2516CTO
+ dmi.product.version: Think
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
It might also be worthwhile opening an upstream bug report[0] at bugzilla.kernel.org.

[0] https://wiki.ubuntu.com/Bugs/Upstream/kernel

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

** Tags added: kernel-da-key
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
On my tp, we have:

$ cat /sys/class/thermal/thermal_zone0/trip_point_*
10
critical
95500
passive
$

** Package changed: acpi-support (Ubuntu) => linux (Ubuntu)
[Bug 955287] Re: Ubuntu should handle "hot" CPUs by taking preemptive action and warning users
This probably should not be handled in the acpi-support package, since acpi-support is considered deprecated and we're trying (without much success) to phase it out. I think notifications would be better handled through upowerd.

However, even without desktop notifications, the kernel is supposed to have built-in support for stepping down the CPU automatically when the system gets too hot. Perhaps the thresholds are wrong in your ACPI tables? For instance, my ACPI is buggy and has a "passive" threshold that's at a higher temperature than the "critical" threshold:

$ cat /sys/class/thermal/thermal_zone0/trip_point_*
10
critical
127500
passive
$

My understanding is that when these are correct, the kernel will take care of things automatically; and when they're wrong, as above, I'm not sure there's much point to working around it in userspace.
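To check other machines for the same kind of inverted thresholds, something along these lines might help. It is a rough Python sketch, not a tested tool: the helper names are made up, and it assumes the sysfs layout trip_point_N_temp / trip_point_N_type with temperatures in millidegrees C.

```python
from pathlib import Path


def read_trip_points(zone_dir):
    """Return {trip type: temperature in millidegrees C} for one thermal zone."""
    trips = {}
    for temp_file in sorted(Path(zone_dir).glob("trip_point_*_temp")):
        type_file = temp_file.with_name(temp_file.name.replace("_temp", "_type"))
        if not type_file.exists():
            continue
        try:
            # If a type occurs more than once (e.g. several "active" trips),
            # only the last one read is kept; enough for this check.
            trips[type_file.read_text().strip()] = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            continue
    return trips


def passive_above_critical(trips):
    """True if the passive trip is at or above the critical trip."""
    passive = trips.get("passive")
    critical = trips.get("critical")
    return passive is not None and critical is not None and passive >= critical


if __name__ == "__main__":
    for zone in sorted(Path("/sys/class/thermal").glob("thermal_zone*")):
        trips = read_trip_points(zone)
        note = "  <-- passive is not below critical" if passive_above_critical(trips) else ""
        print(zone.name, trips, note)
```

A zone flagged by this check would never get passive throttling before the critical trip fires, which matches the behaviour described above.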