(In reply to raulvior.bcn from comment #635)
ASUSTeK COMPUTER INC. CROSSHAIR VI HERO 7403 08/20/2019
AMD Ryzen 7 1800X Eight-Core Processor
16410MB
2560x1440 pixels
Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 5.0.0-25-generic, LLVM 8.0.0)
Linux 5.0.0-25-generic (x86_64) #26-Ubuntu SMP Thu Aug 1 12:04:58 UTC 2019
GNU C library / (Ubuntu GLIBC 2.29-0ubuntu2) 2.29
Ubuntu 19.04
BOOT_IMAGE=/vmlinuz-5.0.0-25-generic 
root=UUID=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX ro quiet splash vt.handoff=1


With the latest BIOS (which provides AGESA 1.0.0.3ABB), the Power Supply Idle
option did work. This option increases core voltage. Knowing that increasing
core voltage stabilized the system for more than 48 hours, I suspected the
"performance" governor could be problematic: although it made the CPU operate
at higher frequencies, the voltage was not increased. Values of 0.35, 0.5 or
0.85 V still showed up even though the minimum frequency was 3 GHz instead of
2 GHz.

I have disabled "power supply idle" UEFI option and started to use the
"ondemand" governor, which has a lower minimum frequency of 2 GHz.

The system continues to be stable. This is the longest time I have had
the system without reboots.
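
To confirm which governor and frequency limits are actually in effect after
such a change, the cpufreq sysfs files can be read directly. This is only a
minimal sketch: read_cpufreq is a hypothetical helper name, and it assumes the
standard cpufreq sysfs layout under /sys/devices/system/cpu. It does not
report core voltage, which still has to come from a sensor driver.

```python
from pathlib import Path


def read_cpufreq(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return the governor and frequency limits (in kHz) for one CPU.

    Assumes the usual cpufreq sysfs layout; sysfs_root is a parameter only
    so the function can be pointed at a copy of the tree.
    """
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq"

    def read(name):
        return (base / name).read_text().strip()

    return {
        "governor": read("scaling_governor"),
        "min_khz": int(read("scaling_min_freq")),
        "max_khz": int(read("scaling_max_freq")),
    }
```

Calling read_cpufreq(0) on the affected machine should show "ondemand" as the
governor if the switch took effect.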

Linux reports the following firmware bugs:
[    0.065489] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
[    0.586638] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.586749] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.586830] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.586920] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.586983] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587042] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587110] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587163] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587235] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587309] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587378] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587448] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587522] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587590] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587646] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.587699] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)

This effectively turns off MWAIT. Output of cpupower idle-info:

CPUidle driver: acpi_idle
CPUidle governor: menu
analyzing CPU 0:

Number of idle states: 3
Available idle states: POLL C1 C2
POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 42208
Duration: 98245
C1:
Flags/Description: ACPI HLT
Latency: 1
Usage: 2147693
Duration: 605512928
C2:
Flags/Description: ACPI IOPORT 0x414
Latency: 400
Usage: 4338245
Duration: 71750833391
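
The Duration counters above give a rough picture of where CPU 0 spends its
idle time. The sketch below is illustrative only: idle_residency is a
hypothetical helper, the sample text is copied from the output above, and it
assumes the Duration fields are cumulative times in a common unit.

```python
import re

IDLE_INFO = """\
POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 42208
Duration: 98245
C1:
Flags/Description: ACPI HLT
Latency: 1
Usage: 2147693
Duration: 605512928
C2:
Flags/Description: ACPI IOPORT 0x414
Latency: 400
Usage: 4338245
Duration: 71750833391
"""


def idle_residency(text):
    """Map each idle-state name to its share of the summed Duration values."""
    durations = {}
    state = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"([A-Za-z0-9_-]+):", line)
        if header:
            # A bare "NAME:" line starts a new idle-state section.
            state = header.group(1)
        elif line.startswith("Duration:") and state is not None:
            durations[state] = int(line.split(":", 1)[1])
    total = sum(durations.values())
    return {name: value / total for name, value in durations.items()}


shares = idle_residency(IDLE_INFO)
for name, share in shares.items():
    print(f"{name}: {share:.4%}")
```

With these numbers, C2 accounts for over 99% of the recorded idle time, so the
deeper ACPI state is clearly being used even without MWAIT.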


For my problem, which might not be the same as for the rest of the users here,
not having MWAIT did not stop the system from rebooting randomly. And despite
the kernel not detecting MWAIT, the processor still boosts up to 4.1 GHz as
intended. C6 states are enabled. The CPU might enter them after entering C1
with HLT. Zenstates.py output:

P0 - Enabled - FID = 90 - DID = 8 - VID = 20 - Ratio = 36.00 - vCore = 1.35000
P1 - Enabled - FID = 80 - DID = 8 - VID = 2C - Ratio = 32.00 - vCore = 1.27500
P2 - Enabled - FID = 84 - DID = C - VID = 68 - Ratio = 22.00 - vCore = 0.90000
P3 - Disabled
P4 - Disabled
P5 - Disabled
P6 - Disabled
P7 - Disabled
C6 State - Package - Enabled
C6 State - Core - Enabled
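
The FID, DID and VID fields above are hexadecimal. The Ratio and vCore columns
can be reproduced with the two formulas in the sketch below; they match all
three printed P-states, and I am assuming they are what Zenstates.py applies.
decode_pstate is a hypothetical helper name, not part of that script.

```python
def decode_pstate(fid_hex, did_hex, vid_hex):
    """Return (ratio, vcore) for one P-state from its hex FID/DID/VID fields.

    Assumed formulas (consistent with the output above):
      ratio = FID / DID * 2          (multiplier of the 100 MHz reference)
      vcore = 1.55 V - VID * 6.25 mV
    """
    fid = int(fid_hex, 16)
    did = int(did_hex, 16)
    vid = int(vid_hex, 16)
    ratio = fid / did * 2
    vcore = 1.55 - 0.00625 * vid
    return ratio, vcore


# P-states as printed above: (name, FID, DID, VID)
for name, fid, did, vid in [("P0", "90", "8", "20"),
                            ("P1", "80", "8", "2C"),
                            ("P2", "84", "C", "68")]:
    ratio, vcore = decode_pstate(fid, did, vid)
    print(f"{name}: ratio={ratio:.2f} vcore={vcore:.5f}")
```

Under these formulas, P2 corresponds to 2.2 GHz at 0.9 V, so the 0.35 V and
0.5 V readings mentioned earlier are below the lowest defined P-state voltage
and presumably come from idle states rather than from P-state changes.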

I'll report back on whether I have another reboot. I'm monitoring the system
with the Vitals GNOME extension
(https://extensions.gnome.org/extension/1460/vitals/). The kernel module
asus-wmi-sensors provides the actual data
(https://github.com/electrified/asus-wmi-sensors).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1690085

Title:
  Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks

Status in Linux:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hi,

  
  We are getting various kernel crashes on a fairly new configuration.
  We're using a Ryzen 1800X CPU with an X370 Gaming Pro Carbon motherboard
  (7A32V1) with the latest BIOS available (1.52).

  We are running Ubuntu 17.04 (amd64). We've tried different kernel versions:
  the native one and releases from
  http://kernel.ubuntu.com/~kernel-ppa/mainline/ too.
  Tested kernel versions:

  native 17.04 kernel
  4.10.15

  The issue is the same on all of them: the machine freezes randomly.

  Here is the kern.log entry from when it happens:

  May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks:
  May 10 22:41:56 dev2 kernel: [24366.187618]     0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449
  May 10 22:41:56 dev2 kernel: [24366.188977]     (detected by 12, t=1860207 jiffies, g=10001, c=10000, q=4656)
  May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0:
  May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0       R  running task        0     0      0 0x00000008
  May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace:
  May 10 22:41:56 dev2 kernel: [24366.190354]  ? native_safe_halt+0x6/0x10
  May 10 22:41:56 dev2 kernel: [24366.190355]  ? default_idle+0x20/0xd0
  May 10 22:41:56 dev2 kernel: [24366.190358]  ? arch_cpu_idle+0xf/0x20
  May 10 22:41:56 dev2 kernel: [24366.190360]  ? default_idle_call+0x23/0x30
  May 10 22:41:56 dev2 kernel: [24366.190362]  ? do_idle+0x16f/0x200
  May 10 22:41:56 dev2 kernel: [24366.190364]  ? cpu_startup_entry+0x71/0x80
  May 10 22:41:56 dev2 kernel: [24366.190366]  ? rest_init+0x77/0x80
  May 10 22:41:56 dev2 kernel: [24366.190368]  ? start_kernel+0x464/0x485
  May 10 22:41:56 dev2 kernel: [24366.190369]  ? early_idt_handler_array+0x120/0x120
  May 10 22:41:56 dev2 kernel: [24366.190371]  ? x86_64_start_reservations+0x24/0x26
  May 10 22:41:56 dev2 kernel: [24366.190372]  ? x86_64_start_kernel+0x14d/0x170
  May 10 22:41:56 dev2 kernel: [24366.190373]  ? start_cpu+0x14/0x14
  May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks:
  May 10 22:44:56 dev2 kernel: [24546.189461]     0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027
  May 10 22:44:56 dev2 kernel: [24546.190823]     (detected by 14, t=1905212 jiffies, g=10001, c=10000, q=4740)
  May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0:
  May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0       R  running task        0     0      0 0x00000008
  May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace:
  May 10 22:44:56 dev2 kernel: [24546.192199]  ? native_safe_halt+0x6/0x10
  May 10 22:44:56 dev2 kernel: [24546.192201]  ? default_idle+0x20/0xd0
  May 10 22:44:56 dev2 kernel: [24546.192203]  ? arch_cpu_idle+0xf/0x20
  May 10 22:44:56 dev2 kernel: [24546.192204]  ? default_idle_call+0x23/0x30
  May 10 22:44:56 dev2 kernel: [24546.192206]  ? do_idle+0x16f/0x200
  May 10 22:44:56 dev2 kernel: [24546.192208]  ? cpu_startup_entry+0x71/0x80
  May 10 22:44:56 dev2 kernel: [24546.192210]  ? rest_init+0x77/0x80
  May 10 22:44:56 dev2 kernel: [24546.192211]  ? start_kernel+0x464/0x485
  May 10 22:44:56 dev2 kernel: [24546.192213]  ? early_idt_handler_array+0x120/0x120
  May 10 22:44:56 dev2 kernel: [24546.192214]  ? x86_64_start_reservations+0x24/0x26
  May 10 22:44:56 dev2 kernel: [24546.192215]  ? x86_64_start_kernel+0x14d/0x170
  May 10 22:44:56 dev2 kernel: [24546.192217]  ? start_cpu+0x14/0x14
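
The two stall reports above carry enough information to estimate how long
CPU 0 had already been stuck. The sketch below is a rough calculation, not
something stated in the report: it assumes that t= counts jiffies since the
stalled grace period began, infers the tick rate (HZ) from the difference
between the two reports, and uses hypothetical helper names.

```python
import re

# The two "(detected by ...)" lines from the log above.
STALL_LINES = [
    "May 10 22:41:56 dev2 kernel: [24366.188977]     (detected by 12, "
    "t=1860207 jiffies, g=10001, c=10000, q=4656)",
    "May 10 22:44:56 dev2 kernel: [24546.190823]     (detected by 14, "
    "t=1905212 jiffies, g=10001, c=10000, q=4740)",
]

PATTERN = re.compile(r"\[\s*(\d+\.\d+)\]\s+\(detected by \d+, t=(\d+) jiffies")


def parse_stalls(lines):
    """Return (kernel_timestamp_seconds, jiffies) for each stall report."""
    samples = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            samples.append((float(match.group(1)), int(match.group(2))))
    return samples


def estimate_hz(samples):
    """Estimate jiffies per second from the first and last stall report."""
    (t0, j0), (t1, j1) = samples[0], samples[-1]
    return (j1 - j0) / (t1 - t0)


samples = parse_stalls(STALL_LINES)
hz = estimate_hz(samples)
# Assumption: t= is the age of the stalled grace period in jiffies.
stalled_hours = samples[0][1] / round(hz) / 3600
print(f"estimated HZ: {hz:.1f}")
print(f"stalled for about {stalled_hours:.2f} h at the first report shown")
```

The estimate comes out very close to 250 ticks per second, and under that
assumption the first report shown was logged roughly two hours into the
stall, so the freeze began well before these two entries.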

  Depending on the kernel version, we also get NMI watchdog errors about a
  stuck CPU (the reported CPU core ID is random).
  The crash happens randomly, but generally after a few hours (3-4 h).

  This morning we installed kernel 4.11.0-041100-generic #201705041534 and
  are now waiting for a crash...
  For now, the machine is not "used"; at least, it's not under CPU stress...

  
  Thanks
  --- 
  ApportVersion: 2.20.4-0ubuntu4
  Architecture: amd64
  DistroRelease: Ubuntu 17.04
  InstallationDate: Installed on 2017-05-09 (1 days ago)
  InstallationMedia: Ubuntu-Server 17.04 "Zesty Zapus" - Release amd64 (20170412)
  Package: linux (not installed)
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=fr_FR.UTF-8
   SHELL=/bin/bash
  Tags:  zesty
  Uname: Linux 4.11.0-041100-generic x86_64
  UnreportableReason: The running kernel is not an Ubuntu kernel
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True
