Public bug reported:

I am running Ubuntu 23.10, and using an ASUS WRX90E-SAGE motherboard.

My CPU (AMD 7965WX) is not thermally throttling properly: it exceeds the
maximum operating temperature of 95C (it has gotten as high as 98C before
I shut it off).

If I run CPU-intensive processes that max out all cores (such as
stress-ng), the CPU exceeds the maximum temperature in less than a minute,
and Ubuntu does nothing to throttle it. I'm worried that my CPU is going
to get damaged.

When I look at
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors there
are only "powersave" and "performance" available ("powersave" is what it
is currently set to).
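
For anyone reproducing this, a minimal sketch of reading the same cpufreq
files from Python is below. The function name and the `sysfs_root`
parameter are made up for illustration; only the sysfs file names come
from the kernel. It also reads `scaling_driver`, because a governor list
limited to "powersave" and "performance" is typical of drivers such as
intel_pstate or amd-pstate in active mode, so the driver name may be
useful to include in the report.

```python
from pathlib import Path


def read_cpufreq_policy(cpu: int = 0, sysfs_root: str = "/sys") -> dict:
    """Return the cpufreq driver, active governor and available governors for one CPU."""
    base = Path(sysfs_root) / "devices/system/cpu" / f"cpu{cpu}" / "cpufreq"

    def read(name: str) -> str:
        path = base / name
        # Not every driver exposes every file, so a missing file maps to "".
        return path.read_text().strip() if path.exists() else ""

    return {
        "driver": read("scaling_driver"),
        "governor": read("scaling_governor"),
        "available": read("scaling_available_governors").split(),
    }


if __name__ == "__main__":
    print(read_cpufreq_policy(0))
```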

If I try to set the "performance" governor in /etc/init.d/cpufrequtils
and reboot, it has no effect; the governor stays at "powersave". (I
don't know whether this would help with throttling anyway, but I don't
know what else to try.)

When I run `systemctl status --lines=50 thermald` I get the following
output:

========

○ thermald.service - Thermal Daemon Service
     Loaded: loaded (/lib/systemd/system/thermald.service; enabled; preset: enabled)
     Active: inactive (dead) since Mon 2024-04-22 14:34:09 PDT; 26min ago
    Process: 1878 ExecStart=/usr/sbin/thermald --systemd --dbus-enable --adaptive (code=exited, status=0/SUCCESS)
   Main PID: 1878 (code=exited, status=0/SUCCESS)
        CPU: 11ms

Apr 22 14:34:09 ML-tower systemd[1]: Starting thermald.service - Thermal Daemon Service...
Apr 22 14:34:09 ML-tower thermald[1878]: Unsupported cpu model or platform
Apr 22 14:34:09 ML-tower systemd[1]: thermald.service: Deactivated successfully.
Apr 22 14:34:09 ML-tower systemd[1]: Started thermald.service - Thermal Daemon Service.

=======

Note that it says "Unsupported cpu model or platform". The AMD 7965WX is
definitely supported by this motherboard, so I'm not sure what is going
on.
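
One possible explanation, stated here as an assumption to verify rather
than a confirmed cause: thermald is developed for Intel platforms, so the
message may refer to the CPU vendor and not to what the motherboard
supports. A small sketch for checking the vendor string the kernel
reports is below (the function name is illustrative; `vendor_id` is the
field name used in /proc/cpuinfo on x86).

```python
def cpu_vendor(cpuinfo_path: str = "/proc/cpuinfo") -> str:
    """Return the first vendor_id value found in cpuinfo, or "" if none is present."""
    with open(cpuinfo_path) as f:
        for line in f:
            key, _, value = line.partition(":")
            if key.strip() == "vendor_id":
                return value.strip()
    return ""


if __name__ == "__main__":
    # AMD parts report "AuthenticAMD"; Intel parts report "GenuineIntel".
    print(cpu_vendor())
```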

It does seem to be able to read the CPU thermal sensor, because when I
run `sensors` I get the following output:

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +46.9°C
Tccd1:        +40.4°C
Tccd2:        +40.6°C
Tccd3:        +40.1°C
Tccd4:        +39.6°C

... but I don't know how accurate the readings are, and I am concerned,
especially since nothing appears to happen when the maximum temperature is
exceeded.
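
The values that `sensors` prints can also be read directly from the hwmon
sysfs interface, which may help when logging temperatures during a stress
run. The sketch below assumes the k10temp driver is bound under
/sys/class/hwmon and uses illustrative function names and an illustrative
`hwmon_root` parameter; it does not say anything about how accurate the
sensor itself is.

```python
from pathlib import Path


def read_hwmon_temps(chip: str = "k10temp",
                     hwmon_root: str = "/sys/class/hwmon") -> dict:
    """Map sensor labels (e.g. "Tctl") to degrees Celsius for the named hwmon chip."""
    temps = {}
    for dev in sorted(Path(hwmon_root).glob("hwmon*")):
        name_file = dev / "name"
        if not name_file.exists() or name_file.read_text().strip() != chip:
            continue
        for input_file in sorted(dev.glob("temp*_input")):
            label_file = dev / input_file.name.replace("_input", "_label")
            # Fall back to the file name when the driver provides no label.
            label = (label_file.read_text().strip()
                     if label_file.exists() else input_file.name)
            # hwmon reports temperatures in millidegrees Celsius.
            temps[label] = int(input_file.read_text().strip()) / 1000.0
    return temps


def over_limit(temps: dict, limit_c: float) -> list:
    """Return the labels of all sensors reading above limit_c."""
    return sorted(label for label, value in temps.items() if value > limit_c)


if __name__ == "__main__":
    readings = read_hwmon_temps()
    print(readings, over_limit(readings, 95.0))
```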

How can I ensure that my CPU will not exceed a specified temperature
threshold (ideally ~93C or less)?
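
To make the question concrete, the kind of behaviour I am after could be
sketched in userspace as below: lower the cpufreq ceiling in steps while
the temperature is at or above the limit, and raise it again once the
temperature has dropped by some margin. This is only a rough stopgap
idea, not a substitute for hardware or firmware protection. The function
names, the 200 MHz step and the 5C hysteresis are all assumptions chosen
for illustration; writing `scaling_max_freq` requires root.

```python
from pathlib import Path


def next_freq_cap(temp_c: float, cap_khz: int, min_khz: int, max_khz: int,
                  limit_c: float = 93.0, hysteresis_c: float = 5.0,
                  step_khz: int = 200_000) -> int:
    """Return the next frequency ceiling in kHz for the current temperature."""
    if temp_c >= limit_c:
        # Too hot: step the ceiling down, but never below the minimum.
        return max(min_khz, cap_khz - step_khz)
    if temp_c <= limit_c - hysteresis_c:
        # Comfortably cool: step the ceiling back up towards the maximum.
        return min(max_khz, cap_khz + step_khz)
    # Inside the hysteresis band: leave the ceiling unchanged.
    return cap_khz


def apply_cap(cap_khz: int,
              cpufreq_root: str = "/sys/devices/system/cpu/cpufreq") -> int:
    """Write cap_khz to scaling_max_freq of every policy; return the policy count."""
    written = 0
    for policy in sorted(Path(cpufreq_root).glob("policy*")):
        (policy / "scaling_max_freq").write_text(str(cap_khz))
        written += 1
    return written
```

A polling loop would read Tctl every second or so, call `next_freq_cap`
with the current ceiling, and pass the result to `apply_cap`.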

Please let me know if there is any other information that I could
provide to help debug this issue.

ProblemType: Bug
DistroRelease: Ubuntu 23.10
Package: thermald 2.5.4-2
ProcVersionSignature: Ubuntu 6.5.0-28.29-generic 6.5.13
Uname: Linux 6.5.0-28-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.27.0-0ubuntu5
Architecture: amd64
CasperMD5CheckResult: unknown
CurrentDesktop: KDE
Date: Mon Apr 22 17:36:51 2024
InstallationDate: Installed on 2024-04-08 (15 days ago)
InstallationMedia: Kubuntu 23.10 "Mantic Minotaur" - Release amd64 (20231010)
SourcePackage: thermald
UpgradeStatus: No upgrade log present (probably fresh install)
modified.conffile..etc.init.thermald.conf: [deleted]
mtime.conffile..etc.thermald.thermal-cpu-cdev-order.xml: 2023-08-25T03:29:11

** Affects: thermald (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug mantic


-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2063165

Title:
  CPU not thermal throttling at max temp (AMD 7965WX on WRX90E-SAGE
  motherboard)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/thermald/+bug/2063165/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
