[Kernel-packages] [Bug 1690085]
I have a couple of Threadrippers (1950X) that also hang under very specific workloads with a Asus X-399-A motherboard. In my case when the freeze happens the only way to get the machine back is to unplug it from the wall and plug it back. Not even reset works. After many tweaks, I found out that I can avoid that but setting either the "overclocking enhancement" on in the BIOS (this is specific to Asus boards) our manually setting the Vcore to something like 1.315V (which should work on all motherboards). My guess is than when in idle the motherboard drops the voltage too low and the CPU hangs. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks Status in Linux: Confirmed Status in linux package in Ubuntu: Confirmed Bug description: Hi, We aregetting various kernel crash on a pretty new config. We're using Ryzen 1800X CPU with X370 Gaming Pro Carbon MB (7A32V1) using latest BIOS available (1.52) We are running Ubuntu 17.04 (amd64), we've tried different kernel version, native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too. Tested kernel version: native 17.04 kernel 4.10.15 Issues are the same, we're getting random freeze on the machine. Here is kern.log entry when happening : May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:41:56 dev2 kernel: [24366.187618] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449 May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=1, q=4656) May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0: May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0 R running task 0 0 0 0x0008 May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace: May 10 22:41:56 dev2 kernel: [24366.190354] ? native_safe_halt+0x6/0x10 May 10 22:41:56 dev2 kernel: [24366.190355] ? default_idle+0x20/0xd0 May 10 22:41:56 dev2 kernel: [24366.190358] ? arch_cpu_idle+0xf/0x20 May 10 22:41:56 dev2 kernel: [24366.190360] ? default_idle_call+0x23/0x30 May 10 22:41:56 dev2 kernel: [24366.190362] ? do_idle+0x16f/0x200 May 10 22:41:56 dev2 kernel: [24366.190364] ? cpu_startup_entry+0x71/0x80 May 10 22:41:56 dev2 kernel: [24366.190366] ? rest_init+0x77/0x80 May 10 22:41:56 dev2 kernel: [24366.190368] ? start_kernel+0x464/0x485 May 10 22:41:56 dev2 kernel: [24366.190369] ? early_idt_handler_array+0x120/0x120 May 10 22:41:56 dev2 kernel: [24366.190371] ? x86_64_start_reservations+0x24/0x26 May 10 22:41:56 dev2 kernel: [24366.190372] ? x86_64_start_kernel+0x14d/0x170 May 10 22:41:56 dev2 kernel: [24366.190373] ? start_cpu+0x14/0x14 May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:44:56 dev2 kernel: [24546.189461] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027 May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=1, q=4740) May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0: May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0 R running task 0 0 0 0x0008 May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace: May 10 22:44:56 dev2 kernel: [24546.192199] ? native_safe_halt+0x6/0x10 May 10 22:44:56 dev2 kernel: [24546.192201] ? default_idle+0x20/0xd0 May 10 22:44:56 dev2 kernel: [24546.192203] ? arch_cpu_idle+0xf/0x20 May 10 22:44:56 dev2 kernel: [24546.192204] ? default_idle_call+0x23/0x30 May 10 22:44:56 dev2 kernel: [24546.192206] ? do_idle+0x16f/0x200 May 10 22:44:56 dev2 kernel: [24546.192208] ? cpu_startup_entry+0x71/0x80 May 10 22:44:56 dev2 kernel: [24546.192210] ? rest_init+0x77/0x80 May 10 22:44:56 dev2 kernel: [24546.192211] ? start_kernel+0x464/0x485 May 10 22:44:56 dev2 kernel: [24546.192213] ? early_idt_handler_array+0x120/0x120 May 10 22:44:56 dev2 kernel: [24546.192214] ? x86_64_start_reservations+0x24/0x26 May 10 22:44:56 dev2 kernel: [24546.192215] ? x86_64_start_kernel+0x14d/0x170 May 10 22:44:56 dev2 kernel: [24546.192217] ? start_cpu+0x14/0x14 Depending on the kernel version, we've got NMI watchdog errors related to CPU stuck (mentioning the CPU core id, which is random). Crash is happening randomly, but in general after some hours (3-4h). Now, we've installed kernel 4.11.0-041100-generic #201705041534 this morning and waiting for crash... For now, the machine is not "used", at least, it's not CPU stressed... Thanks --- ApportVersion: 2.20.4-0ubuntu4 Architecture: amd64 DistroRelease: Ubuntu 17.04 InstallationDate: Installed on 2017-05-09 (1 days ago) InstallationMedia: Ubuntu-Server 17.04 "Zesty Zapus" - Release amd64 (201
[Kernel-packages] [Bug 1690085]
>From my experience (a Ryzen 1700X on a MSI B350 motherboard) the freezes on IDLE, like the ones described from Trevor Cordes, went way once I changed the "Idle: typical" setting on the BIOS. I had the bug before the BIOS setting appeared, and tried many things with more or less success. But the best solution by far is just to change the BIOS setting. I have many months of uptime without any lock up since then. *It should be the first thing to test if your computer freezes by itself on idle*. On the other hand I have a Threadripper that freezes when doing hard computations (that alternate between high CPU usage and low, because it needs to write temporary results to disk). This can not be solved using the "Idle: typical" BIOS option. This other freeze happens specially when using Matlab. I am starting to think that the problem is that Matlab (or the MKL Blas libraryit uses) might use heavily SSE3 instructions that have the MWAIT instruction as cited by Francesco Buscani above. I should try the kernel option "idle=mnowait" to see if I can get rid of these freezes. Unfortunately I am travelling until the end of the month and away from the Threadrippers to test it. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks Status in Linux: Confirmed Status in linux package in Ubuntu: Confirmed Bug description: Hi, We aregetting various kernel crash on a pretty new config. We're using Ryzen 1800X CPU with X370 Gaming Pro Carbon MB (7A32V1) using latest BIOS available (1.52) We are running Ubuntu 17.04 (amd64), we've tried different kernel version, native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too. Tested kernel version: native 17.04 kernel 4.10.15 Issues are the same, we're getting random freeze on the machine. Here is kern.log entry when happening : May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:41:56 dev2 kernel: [24366.187618] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449 May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=1, q=4656) May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0: May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0 R running task 0 0 0 0x0008 May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace: May 10 22:41:56 dev2 kernel: [24366.190354] ? native_safe_halt+0x6/0x10 May 10 22:41:56 dev2 kernel: [24366.190355] ? default_idle+0x20/0xd0 May 10 22:41:56 dev2 kernel: [24366.190358] ? arch_cpu_idle+0xf/0x20 May 10 22:41:56 dev2 kernel: [24366.190360] ? default_idle_call+0x23/0x30 May 10 22:41:56 dev2 kernel: [24366.190362] ? do_idle+0x16f/0x200 May 10 22:41:56 dev2 kernel: [24366.190364] ? cpu_startup_entry+0x71/0x80 May 10 22:41:56 dev2 kernel: [24366.190366] ? rest_init+0x77/0x80 May 10 22:41:56 dev2 kernel: [24366.190368] ? start_kernel+0x464/0x485 May 10 22:41:56 dev2 kernel: [24366.190369] ? early_idt_handler_array+0x120/0x120 May 10 22:41:56 dev2 kernel: [24366.190371] ? x86_64_start_reservations+0x24/0x26 May 10 22:41:56 dev2 kernel: [24366.190372] ? x86_64_start_kernel+0x14d/0x170 May 10 22:41:56 dev2 kernel: [24366.190373] ? start_cpu+0x14/0x14 May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:44:56 dev2 kernel: [24546.189461] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027 May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=1, q=4740) May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0: May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0 R running task 0 0 0 0x0008 May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace: May 10 22:44:56 dev2 kernel: [24546.192199] ? native_safe_halt+0x6/0x10 May 10 22:44:56 dev2 kernel: [24546.192201] ? default_idle+0x20/0xd0 May 10 22:44:56 dev2 kernel: [24546.192203] ? arch_cpu_idle+0xf/0x20 May 10 22:44:56 dev2 kernel: [24546.192204] ? default_idle_call+0x23/0x30 May 10 22:44:56 dev2 kernel: [24546.192206] ? do_idle+0x16f/0x200 May 10 22:44:56 dev2 kernel: [24546.192208] ? cpu_startup_entry+0x71/0x80 May 10 22:44:56 dev2 kernel: [24546.192210] ? rest_init+0x77/0x80 May 10 22:44:56 dev2 kernel: [24546.192211] ? start_kernel+0x464/0x485 May 10 22:44:56 dev2 kernel: [24546.192213] ? early_idt_handler_array+0x120/0x120 May 10 22:44:56 dev2 kernel: [24546.192214] ? x86_64_start_reservations+0x24/0x26 May 10 22:44:56 dev2 kernel: [24546.192215] ? x86_64_start_kernel+0x14d/0x170 May 10 22:44:56 dev2 kernel: [24546.192217] ? start_cpu+0x14/0x14 Depending on the kernel version, we've
[Kernel-packages] [Bug 1690085]
Hi, a follow up. My two Threadripper machines are working flawlessly since Friday. I tried both, running a test that alternates between high load and idle that would normally trigger the but, and leave the machine idle for many hours. It seems like a possible workaround for someone with a ASUS X399-A Prime motherboard is to set the BIOS option "Overclocking Enhancement". Beware that this option has a weird side-effect of skewing the temperature to report a lower value. Therefore you should have a good cooling solution in place. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks Status in Linux: Confirmed Status in linux package in Ubuntu: Confirmed Bug description: Hi, We aregetting various kernel crash on a pretty new config. We're using Ryzen 1800X CPU with X370 Gaming Pro Carbon MB (7A32V1) using latest BIOS available (1.52) We are running Ubuntu 17.04 (amd64), we've tried different kernel version, native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too. Tested kernel version: native 17.04 kernel 4.10.15 Issues are the same, we're getting random freeze on the machine. Here is kern.log entry when happening : May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:41:56 dev2 kernel: [24366.187618] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449 May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=1, q=4656) May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0: May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0 R running task 0 0 0 0x0008 May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace: May 10 22:41:56 dev2 kernel: [24366.190354] ? native_safe_halt+0x6/0x10 May 10 22:41:56 dev2 kernel: [24366.190355] ? default_idle+0x20/0xd0 May 10 22:41:56 dev2 kernel: [24366.190358] ? arch_cpu_idle+0xf/0x20 May 10 22:41:56 dev2 kernel: [24366.190360] ? default_idle_call+0x23/0x30 May 10 22:41:56 dev2 kernel: [24366.190362] ? do_idle+0x16f/0x200 May 10 22:41:56 dev2 kernel: [24366.190364] ? cpu_startup_entry+0x71/0x80 May 10 22:41:56 dev2 kernel: [24366.190366] ? rest_init+0x77/0x80 May 10 22:41:56 dev2 kernel: [24366.190368] ? start_kernel+0x464/0x485 May 10 22:41:56 dev2 kernel: [24366.190369] ? early_idt_handler_array+0x120/0x120 May 10 22:41:56 dev2 kernel: [24366.190371] ? x86_64_start_reservations+0x24/0x26 May 10 22:41:56 dev2 kernel: [24366.190372] ? x86_64_start_kernel+0x14d/0x170 May 10 22:41:56 dev2 kernel: [24366.190373] ? start_cpu+0x14/0x14 May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:44:56 dev2 kernel: [24546.189461] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027 May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=1, q=4740) May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0: May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0 R running task 0 0 0 0x0008 May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace: May 10 22:44:56 dev2 kernel: [24546.192199] ? native_safe_halt+0x6/0x10 May 10 22:44:56 dev2 kernel: [24546.192201] ? default_idle+0x20/0xd0 May 10 22:44:56 dev2 kernel: [24546.192203] ? arch_cpu_idle+0xf/0x20 May 10 22:44:56 dev2 kernel: [24546.192204] ? default_idle_call+0x23/0x30 May 10 22:44:56 dev2 kernel: [24546.192206] ? do_idle+0x16f/0x200 May 10 22:44:56 dev2 kernel: [24546.192208] ? cpu_startup_entry+0x71/0x80 May 10 22:44:56 dev2 kernel: [24546.192210] ? rest_init+0x77/0x80 May 10 22:44:56 dev2 kernel: [24546.192211] ? start_kernel+0x464/0x485 May 10 22:44:56 dev2 kernel: [24546.192213] ? early_idt_handler_array+0x120/0x120 May 10 22:44:56 dev2 kernel: [24546.192214] ? x86_64_start_reservations+0x24/0x26 May 10 22:44:56 dev2 kernel: [24546.192215] ? x86_64_start_kernel+0x14d/0x170 May 10 22:44:56 dev2 kernel: [24546.192217] ? start_cpu+0x14/0x14 Depending on the kernel version, we've got NMI watchdog errors related to CPU stuck (mentioning the CPU core id, which is random). Crash is happening randomly, but in general after some hours (3-4h). Now, we've installed kernel 4.11.0-041100-generic #201705041534 this morning and waiting for crash... For now, the machine is not "used", at least, it's not CPU stressed... Thanks --- ApportVersion: 2.20.4-0ubuntu4 Architecture: amd64 DistroRelease: Ubuntu 17.04 InstallationDate: Installed on 2017-05-09 (1 days ago) InstallationMedia: Ubuntu-Server 17.04 "Zesty Zapus" - Release amd64 (20170412) Package: linux (not installed) ProcEnviron: TERM=xter
[Kernel-packages] [Bug 1690085]
Oh, and I forgot to mention. Both machines are using the "Overclocking Enhancement" options and I have explicitly set c6 state to enabled using zenstates.py. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks Status in Linux: Confirmed Status in linux package in Ubuntu: Confirmed Bug description: Hi, We aregetting various kernel crash on a pretty new config. We're using Ryzen 1800X CPU with X370 Gaming Pro Carbon MB (7A32V1) using latest BIOS available (1.52) We are running Ubuntu 17.04 (amd64), we've tried different kernel version, native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too. Tested kernel version: native 17.04 kernel 4.10.15 Issues are the same, we're getting random freeze on the machine. Here is kern.log entry when happening : May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:41:56 dev2 kernel: [24366.187618] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449 May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=1, q=4656) May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0: May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0 R running task 0 0 0 0x0008 May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace: May 10 22:41:56 dev2 kernel: [24366.190354] ? native_safe_halt+0x6/0x10 May 10 22:41:56 dev2 kernel: [24366.190355] ? default_idle+0x20/0xd0 May 10 22:41:56 dev2 kernel: [24366.190358] ? arch_cpu_idle+0xf/0x20 May 10 22:41:56 dev2 kernel: [24366.190360] ? default_idle_call+0x23/0x30 May 10 22:41:56 dev2 kernel: [24366.190362] ? do_idle+0x16f/0x200 May 10 22:41:56 dev2 kernel: [24366.190364] ? cpu_startup_entry+0x71/0x80 May 10 22:41:56 dev2 kernel: [24366.190366] ? rest_init+0x77/0x80 May 10 22:41:56 dev2 kernel: [24366.190368] ? start_kernel+0x464/0x485 May 10 22:41:56 dev2 kernel: [24366.190369] ? early_idt_handler_array+0x120/0x120 May 10 22:41:56 dev2 kernel: [24366.190371] ? x86_64_start_reservations+0x24/0x26 May 10 22:41:56 dev2 kernel: [24366.190372] ? x86_64_start_kernel+0x14d/0x170 May 10 22:41:56 dev2 kernel: [24366.190373] ? start_cpu+0x14/0x14 May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:44:56 dev2 kernel: [24546.189461] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027 May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=1, q=4740) May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0: May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0 R running task 0 0 0 0x0008 May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace: May 10 22:44:56 dev2 kernel: [24546.192199] ? native_safe_halt+0x6/0x10 May 10 22:44:56 dev2 kernel: [24546.192201] ? default_idle+0x20/0xd0 May 10 22:44:56 dev2 kernel: [24546.192203] ? arch_cpu_idle+0xf/0x20 May 10 22:44:56 dev2 kernel: [24546.192204] ? default_idle_call+0x23/0x30 May 10 22:44:56 dev2 kernel: [24546.192206] ? do_idle+0x16f/0x200 May 10 22:44:56 dev2 kernel: [24546.192208] ? cpu_startup_entry+0x71/0x80 May 10 22:44:56 dev2 kernel: [24546.192210] ? rest_init+0x77/0x80 May 10 22:44:56 dev2 kernel: [24546.192211] ? start_kernel+0x464/0x485 May 10 22:44:56 dev2 kernel: [24546.192213] ? early_idt_handler_array+0x120/0x120 May 10 22:44:56 dev2 kernel: [24546.192214] ? x86_64_start_reservations+0x24/0x26 May 10 22:44:56 dev2 kernel: [24546.192215] ? x86_64_start_kernel+0x14d/0x170 May 10 22:44:56 dev2 kernel: [24546.192217] ? start_cpu+0x14/0x14 Depending on the kernel version, we've got NMI watchdog errors related to CPU stuck (mentioning the CPU core id, which is random). Crash is happening randomly, but in general after some hours (3-4h). Now, we've installed kernel 4.11.0-041100-generic #201705041534 this morning and waiting for crash... For now, the machine is not "used", at least, it's not CPU stressed... Thanks --- ApportVersion: 2.20.4-0ubuntu4 Architecture: amd64 DistroRelease: Ubuntu 17.04 InstallationDate: Installed on 2017-05-09 (1 days ago) InstallationMedia: Ubuntu-Server 17.04 "Zesty Zapus" - Release amd64 (20170412) Package: linux (not installed) ProcEnviron: TERM=xterm-256color PATH=(custom, no user) XDG_RUNTIME_DIR= LANG=fr_FR.UTF-8 SHELL=/bin/bash Tags: zesty Uname: Linux 4.11.0-041100-generic x86_64 UnreportableReason: The running kernel is not an Ubuntu kernel UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: _MarkForUpload: True To manage notifications about this bug go to: http
[Kernel-packages] [Bug 1690085]
@Eduardo. Yes I've found that option before. It is disabled by default (so the default is not to try the power saving). I even tried to enable hoping that the power saving system would be smart and avoid the lock up but it didn't work. What I am trying right now is to set the option "Overclocking Enhancement" that is supposed to do "It enables SenseMi Skew (artificially reading lower temperatures in order to trick XFR to boost longer and higher), Performance Bias and tweaks VRM settings to improve overclockability". I am specially interested on the tweaked VRM settings. (https://rog.asus.com/forum/showthread.php?97680-Overclocking- Enhancement) Interesting enough, if I only set this to Enable (no overclock, just default setting + this option enabled) it seems to make things better. As I said, I have a code that consistently triggers a lockup if I run in a loop followed by one hour of inactivity. I have been running this test in one of my system with the "Overclocking Enhancement" set to Enable and it did not trigger the bug for the last 24hs. I have just changed the other machine to test this as well. I'll keep the test running today and the whole weekend on both machines and report back. If anyone else is having the problem with a Asus X399 motherboard it may be worth a try and, please, report back too. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks Status in Linux: Confirmed Status in linux package in Ubuntu: Confirmed Bug description: Hi, We aregetting various kernel crash on a pretty new config. We're using Ryzen 1800X CPU with X370 Gaming Pro Carbon MB (7A32V1) using latest BIOS available (1.52) We are running Ubuntu 17.04 (amd64), we've tried different kernel version, native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too. Tested kernel version: native 17.04 kernel 4.10.15 Issues are the same, we're getting random freeze on the machine. Here is kern.log entry when happening : May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:41:56 dev2 kernel: [24366.187618] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449 May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=1, q=4656) May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0: May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0 R running task 0 0 0 0x0008 May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace: May 10 22:41:56 dev2 kernel: [24366.190354] ? native_safe_halt+0x6/0x10 May 10 22:41:56 dev2 kernel: [24366.190355] ? default_idle+0x20/0xd0 May 10 22:41:56 dev2 kernel: [24366.190358] ? arch_cpu_idle+0xf/0x20 May 10 22:41:56 dev2 kernel: [24366.190360] ? default_idle_call+0x23/0x30 May 10 22:41:56 dev2 kernel: [24366.190362] ? do_idle+0x16f/0x200 May 10 22:41:56 dev2 kernel: [24366.190364] ? cpu_startup_entry+0x71/0x80 May 10 22:41:56 dev2 kernel: [24366.190366] ? rest_init+0x77/0x80 May 10 22:41:56 dev2 kernel: [24366.190368] ? start_kernel+0x464/0x485 May 10 22:41:56 dev2 kernel: [24366.190369] ? early_idt_handler_array+0x120/0x120 May 10 22:41:56 dev2 kernel: [24366.190371] ? x86_64_start_reservations+0x24/0x26 May 10 22:41:56 dev2 kernel: [24366.190372] ? x86_64_start_kernel+0x14d/0x170 May 10 22:41:56 dev2 kernel: [24366.190373] ? start_cpu+0x14/0x14 May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:44:56 dev2 kernel: [24546.189461] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027 May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=1, q=4740) May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0: May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0 R running task 0 0 0 0x0008 May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace: May 10 22:44:56 dev2 kernel: [24546.192199] ? native_safe_halt+0x6/0x10 May 10 22:44:56 dev2 kernel: [24546.192201] ? default_idle+0x20/0xd0 May 10 22:44:56 dev2 kernel: [24546.192203] ? arch_cpu_idle+0xf/0x20 May 10 22:44:56 dev2 kernel: [24546.192204] ? default_idle_call+0x23/0x30 May 10 22:44:56 dev2 kernel: [24546.192206] ? do_idle+0x16f/0x200 May 10 22:44:56 dev2 kernel: [24546.192208] ? cpu_startup_entry+0x71/0x80 May 10 22:44:56 dev2 kernel: [24546.192210] ? rest_init+0x77/0x80 May 10 22:44:56 dev2 kernel: [24546.192211] ? start_kernel+0x464/0x485 May 10 22:44:56 dev2 kernel: [24546.192213] ? early_idt_handler_array+0x120/0x120 May 10 22:44:56 dev2 kernel: [24546.192214] ? x86_64_start_reservations+0x24/0x26 May 10 22:44:56 dev2 kernel: [24546.192215]
[Kernel-packages] [Bug 1690085]
@Ryan, yes. Acoording to dmidecode I am using 0601 BIOS that is the last one available in Asus website. Interesting that you are having soft lockups even with Typical Current. It completely solved the problems in my old 1700X system and it seems to have worked well for others. Are you overclocking? If yes, try first with default settings. Good luck! -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks Status in Linux: Confirmed Status in linux package in Ubuntu: Confirmed Bug description: Hi, We aregetting various kernel crash on a pretty new config. We're using Ryzen 1800X CPU with X370 Gaming Pro Carbon MB (7A32V1) using latest BIOS available (1.52) We are running Ubuntu 17.04 (amd64), we've tried different kernel version, native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too. Tested kernel version: native 17.04 kernel 4.10.15 Issues are the same, we're getting random freeze on the machine. Here is kern.log entry when happening : May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:41:56 dev2 kernel: [24366.187618] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449 May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=1, q=4656) May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0: May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0 R running task 0 0 0 0x0008 May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace: May 10 22:41:56 dev2 kernel: [24366.190354] ? native_safe_halt+0x6/0x10 May 10 22:41:56 dev2 kernel: [24366.190355] ? default_idle+0x20/0xd0 May 10 22:41:56 dev2 kernel: [24366.190358] ? arch_cpu_idle+0xf/0x20 May 10 22:41:56 dev2 kernel: [24366.190360] ? default_idle_call+0x23/0x30 May 10 22:41:56 dev2 kernel: [24366.190362] ? do_idle+0x16f/0x200 May 10 22:41:56 dev2 kernel: [24366.190364] ? cpu_startup_entry+0x71/0x80 May 10 22:41:56 dev2 kernel: [24366.190366] ? rest_init+0x77/0x80 May 10 22:41:56 dev2 kernel: [24366.190368] ? start_kernel+0x464/0x485 May 10 22:41:56 dev2 kernel: [24366.190369] ? early_idt_handler_array+0x120/0x120 May 10 22:41:56 dev2 kernel: [24366.190371] ? x86_64_start_reservations+0x24/0x26 May 10 22:41:56 dev2 kernel: [24366.190372] ? x86_64_start_kernel+0x14d/0x170 May 10 22:41:56 dev2 kernel: [24366.190373] ? start_cpu+0x14/0x14 May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:44:56 dev2 kernel: [24546.189461] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027 May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=1, q=4740) May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0: May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0 R running task 0 0 0 0x0008 May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace: May 10 22:44:56 dev2 kernel: [24546.192199] ? native_safe_halt+0x6/0x10 May 10 22:44:56 dev2 kernel: [24546.192201] ? default_idle+0x20/0xd0 May 10 22:44:56 dev2 kernel: [24546.192203] ? arch_cpu_idle+0xf/0x20 May 10 22:44:56 dev2 kernel: [24546.192204] ? default_idle_call+0x23/0x30 May 10 22:44:56 dev2 kernel: [24546.192206] ? do_idle+0x16f/0x200 May 10 22:44:56 dev2 kernel: [24546.192208] ? cpu_startup_entry+0x71/0x80 May 10 22:44:56 dev2 kernel: [24546.192210] ? rest_init+0x77/0x80 May 10 22:44:56 dev2 kernel: [24546.192211] ? start_kernel+0x464/0x485 May 10 22:44:56 dev2 kernel: [24546.192213] ? early_idt_handler_array+0x120/0x120 May 10 22:44:56 dev2 kernel: [24546.192214] ? x86_64_start_reservations+0x24/0x26 May 10 22:44:56 dev2 kernel: [24546.192215] ? x86_64_start_kernel+0x14d/0x170 May 10 22:44:56 dev2 kernel: [24546.192217] ? start_cpu+0x14/0x14 Depending on the kernel version, we've got NMI watchdog errors related to CPU stuck (mentioning the CPU core id, which is random). Crash is happening randomly, but in general after some hours (3-4h). Now, we've installed kernel 4.11.0-041100-generic #201705041534 this morning and waiting for crash... For now, the machine is not "used", at least, it's not CPU stressed... Thanks --- ApportVersion: 2.20.4-0ubuntu4 Architecture: amd64 DistroRelease: Ubuntu 17.04 InstallationDate: Installed on 2017-05-09 (1 days ago) InstallationMedia: Ubuntu-Server 17.04 "Zesty Zapus" - Release amd64 (20170412) Package: linux (not installed) ProcEnviron: TERM=xterm-256color PATH=(custom, no user) XDG_RUNTIME_DIR= LANG=fr_FR.UTF-8 SHELL=/bin/bash Tags: zesty Uname: Linux 4.11.0-041100-generic x86_64 UnreportableReason: The r
[Kernel-packages] [Bug 1690085]
I would like just to pass some evidence that this affects Threadrippers. I have two identical system with the 1950X on a Asus X399-A Prime motherboard. This motherboard does not have the Power Supply option in the BIOS. Both freeze from time to time. But more interesting, both systems freeze when running a code developed by one of our students that uses Matlab and computes the SVD of a large matrix followed by some time almost idle writing to a NFS mount many times in a loop. The code takes around an hour to run. I made a small script loops between running the code followed by calling sleep for one extra hour to put the machine in a idle state. It always make the machines freeze in less than one day. In some cases the freeze happens even while the code is running, not only when idle. This seems odd. It is very reproducible. The solution is to disable C6 fully using zenstates.py. Only disabling c6-packaging is not enough. Another possibility, that I am inclined is to recover some of the performance I lost by loosing turbo boost and XFR, is a light overclock. I am testing one of the machines now @3.75GHz. Have anyone seen the "power supply" option in a X399 motherboard? -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks Status in Linux: Confirmed Status in linux package in Ubuntu: Confirmed Bug description: Hi, We aregetting various kernel crash on a pretty new config. We're using Ryzen 1800X CPU with X370 Gaming Pro Carbon MB (7A32V1) using latest BIOS available (1.52) We are running Ubuntu 17.04 (amd64), we've tried different kernel version, native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too. Tested kernel version: native 17.04 kernel 4.10.15 Issues are the same, we're getting random freeze on the machine. Here is kern.log entry when happening : May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:41:56 dev2 kernel: [24366.187618] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449 May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=1, q=4656) May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0: May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0 R running task 0 0 0 0x0008 May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace: May 10 22:41:56 dev2 kernel: [24366.190354] ? native_safe_halt+0x6/0x10 May 10 22:41:56 dev2 kernel: [24366.190355] ? default_idle+0x20/0xd0 May 10 22:41:56 dev2 kernel: [24366.190358] ? arch_cpu_idle+0xf/0x20 May 10 22:41:56 dev2 kernel: [24366.190360] ? default_idle_call+0x23/0x30 May 10 22:41:56 dev2 kernel: [24366.190362] ? do_idle+0x16f/0x200 May 10 22:41:56 dev2 kernel: [24366.190364] ? cpu_startup_entry+0x71/0x80 May 10 22:41:56 dev2 kernel: [24366.190366] ? rest_init+0x77/0x80 May 10 22:41:56 dev2 kernel: [24366.190368] ? start_kernel+0x464/0x485 May 10 22:41:56 dev2 kernel: [24366.190369] ? early_idt_handler_array+0x120/0x120 May 10 22:41:56 dev2 kernel: [24366.190371] ? x86_64_start_reservations+0x24/0x26 May 10 22:41:56 dev2 kernel: [24366.190372] ? x86_64_start_kernel+0x14d/0x170 May 10 22:41:56 dev2 kernel: [24366.190373] ? start_cpu+0x14/0x14 May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks: May 10 22:44:56 dev2 kernel: [24546.189461] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027 May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=1, q=4740) May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0: May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0 R running task 0 0 0 0x0008 May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace: May 10 22:44:56 dev2 kernel: [24546.192199] ? native_safe_halt+0x6/0x10 May 10 22:44:56 dev2 kernel: [24546.192201] ? default_idle+0x20/0xd0 May 10 22:44:56 dev2 kernel: [24546.192203] ? arch_cpu_idle+0xf/0x20 May 10 22:44:56 dev2 kernel: [24546.192204] ? default_idle_call+0x23/0x30 May 10 22:44:56 dev2 kernel: [24546.192206] ? do_idle+0x16f/0x200 May 10 22:44:56 dev2 kernel: [24546.192208] ? cpu_startup_entry+0x71/0x80 May 10 22:44:56 dev2 kernel: [24546.192210] ? rest_init+0x77/0x80 May 10 22:44:56 dev2 kernel: [24546.192211] ? start_kernel+0x464/0x485 May 10 22:44:56 dev2 kernel: [24546.192213] ? early_idt_handler_array+0x120/0x120 May 10 22:44:56 dev2 kernel: [24546.192214] ? x86_64_start_reservations+0x24/0x26 May 10 22:44:56 dev2 kernel: [24546.192215] ? x86_64_start_kernel+0x14d/0x170 May 10 22:44:56 dev2 kernel: [24546.192217] ? start_cpu+0x14/0x14 Depending on the kernel version,