(In reply to Ashesh Ambasta from comment #699)

> Created attachment 290243 [details]
> attachment-20700-0.html

I am getting:

> Sorry, you are not authorized to access attachment #290243.

[…]

> AMD did publish an erratum in which they claim an issue like this exists,
> but a solution is ruled out, which is further bad news. There was some
> discussion of a fix at the kernel level, but that isn't anywhere
> in sight either. I believe these CPUs are plagued by several issues,
> which probably makes a fix for this at the kernel level hard. However,
> Windows seems to have managed to fix it.

That’s interesting. Could you please share the URL to these statements?

[…]

raulvior.bcn, Ashesh: Please always remove the quoted parts of your
replies, as otherwise the Web issue page gets even more cluttered. Thanks.


> Anyway; rants aside, this is my current take on the CPU.
> 
> On 7/12/20 3:58 PM, bugzilla-dae...@bugzilla.kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=196683
> >
> > --- Comment #698 from raulvior....@gmail.com ---
> > Did you ask for an RMA? Did it work?
> > (In reply to Ashesh Ambasta from comment #692)
> >> As a last resort, I've tried `idle=halt` on this machine. And yet my
> >> system just crashed after 3 weeks of uptime.
> >>
> >> I'm done with AMD. I will RMA this processor to try things out, but
> >> overall, if that doesn't work, this thing is headed to the junkyard and
> >> I'm going to live with Intel.
> >>
> >> At least in the 13 or so odd Intel systems I've tried, I've not had
> >> exasperating issues like these where the company is positively trying
> >> to ignore this ongoing issue.
> >>
> >> This is disgusting from AMD.
> >>
> >> On 6/11/20 6:03 PM, bugzilla-dae...@bugzilla.kernel.org wrote:
> >>> https://bugzilla.kernel.org/show_bug.cgi?id=196683
> >>>
> >>> --- Comment #689 from raulvior....@gmail.com ---
> >>> (In reply to raulvior.bcn from comment #653)
> >>>> (In reply to txrx from comment #651)
> >>>>
> >>>> Typical Current Idle might not be working. Read the sensor output. If
> >>>> voltage is not higher than without enabling it, try to increase the core
> >>>> voltage.
> >>>>
> >>>> My Ryzen 7 1800X seems to not produce hangs since I upgraded to 1003ABB
> >>>> with an ASUS Crosshair VI Hero and enabled Typical current idle.
> >>>>
> >>>>
> >>>>> I was able to update my BIOS to version 18, but my system still locks up.
> >>>>> I tried the following with the new BIOS:
> >>>>>    - use factory defaults
> >>>>>    - disable SMT
> >>>>>    - disable SMT with Typical Current Idle
> >>>>>    - all of the above with SVM disabled/enabled
> >>>>> Right now I set the power supply idle control to "Low ..." and will
> >>>>> report back.
> >>>>>
> >>> The motherboard kept hanging. I had to remove the Vitals GNOME Extension. It
> >>> seems that polling voltage values hangs the motherboard... Still, there are
> >>> times that the computer does not come back from suspend. There's something
> >>> wrong with the BIOS/UEFI.
> >>>
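
For the advice quoted above about reading the sensor output to check whether
Typical Current Idle changes the voltage, here is a minimal sketch. It assumes
the board's sensor chip is exposed through the Linux hwmon sysfs interface
(`in*_input` files, in millivolts); which chips and labels appear depends on
the motherboard and the loaded driver. The function name `read_voltages` is
illustrative only. It takes a single snapshot rather than polling, since
polling voltage values was reported above to hang one board.

```python
from pathlib import Path


def read_voltages(hwmon_root="/sys/class/hwmon"):
    """Take one snapshot of every hwmon voltage input, in volts.

    Assumption: sensors are exposed as <root>/hwmonN/inM_input (millivolts),
    with optional inM_label and name files, as in the hwmon sysfs layout.
    """
    readings = {}
    for chip in sorted(Path(hwmon_root).glob("hwmon*")):
        name_file = chip / "name"
        chip_name = name_file.read_text().strip() if name_file.exists() else chip.name
        for value_file in sorted(chip.glob("in*_input")):
            channel = value_file.name[: -len("_input")]
            label_file = chip / f"{channel}_label"
            label = label_file.read_text().strip() if label_file.exists() else channel
            try:
                millivolts = int(value_file.read_text().strip())
            except (OSError, ValueError):
                # Some inputs are unreadable or report nothing; skip them.
                continue
            readings[f"{chip_name}/{label}"] = millivolts / 1000.0
    return readings


if __name__ == "__main__":
    for sensor, volts in read_voltages().items():
        print(f"{sensor}: {volts:.3f} V")
```

Running it once with the BIOS option disabled and once with it enabled, at
idle, gives two snapshots to compare.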

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1690085

Title:
  Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks

Status in Linux:
  Expired
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hi,

  
  We are getting various kernel crashes on a fairly new configuration.
  We're using a Ryzen 1800X CPU with an X370 Gaming Pro Carbon motherboard (7A32V1) and the latest BIOS available (1.52).

  We are running Ubuntu 17.04 (amd64). We've tried different kernel versions: the native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too.
  Tested kernel versions:

  native 17.04 kernel
  4.10.15

  The issue is the same in each case: the machine freezes at random.

  Here is the kern.log entry when it happens:

  May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks:
  May 10 22:41:56 dev2 kernel: [24366.187618]     0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449
  May 10 22:41:56 dev2 kernel: [24366.188977]     (detected by 12, t=1860207 jiffies, g=10001, c=10000, q=4656)
  May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0:
  May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0       R  running task        0     0      0 0x00000008
  May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace:
  May 10 22:41:56 dev2 kernel: [24366.190354]  ? native_safe_halt+0x6/0x10
  May 10 22:41:56 dev2 kernel: [24366.190355]  ? default_idle+0x20/0xd0
  May 10 22:41:56 dev2 kernel: [24366.190358]  ? arch_cpu_idle+0xf/0x20
  May 10 22:41:56 dev2 kernel: [24366.190360]  ? default_idle_call+0x23/0x30
  May 10 22:41:56 dev2 kernel: [24366.190362]  ? do_idle+0x16f/0x200
  May 10 22:41:56 dev2 kernel: [24366.190364]  ? cpu_startup_entry+0x71/0x80
  May 10 22:41:56 dev2 kernel: [24366.190366]  ? rest_init+0x77/0x80
  May 10 22:41:56 dev2 kernel: [24366.190368]  ? start_kernel+0x464/0x485
  May 10 22:41:56 dev2 kernel: [24366.190369]  ? early_idt_handler_array+0x120/0x120
  May 10 22:41:56 dev2 kernel: [24366.190371]  ? x86_64_start_reservations+0x24/0x26
  May 10 22:41:56 dev2 kernel: [24366.190372]  ? x86_64_start_kernel+0x14d/0x170
  May 10 22:41:56 dev2 kernel: [24366.190373]  ? start_cpu+0x14/0x14
  May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks:
  May 10 22:44:56 dev2 kernel: [24546.189461]     0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027
  May 10 22:44:56 dev2 kernel: [24546.190823]     (detected by 14, t=1905212 jiffies, g=10001, c=10000, q=4740)
  May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0:
  May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0       R  running task        0     0      0 0x00000008
  May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace:
  May 10 22:44:56 dev2 kernel: [24546.192199]  ? native_safe_halt+0x6/0x10
  May 10 22:44:56 dev2 kernel: [24546.192201]  ? default_idle+0x20/0xd0
  May 10 22:44:56 dev2 kernel: [24546.192203]  ? arch_cpu_idle+0xf/0x20
  May 10 22:44:56 dev2 kernel: [24546.192204]  ? default_idle_call+0x23/0x30
  May 10 22:44:56 dev2 kernel: [24546.192206]  ? do_idle+0x16f/0x200
  May 10 22:44:56 dev2 kernel: [24546.192208]  ? cpu_startup_entry+0x71/0x80
  May 10 22:44:56 dev2 kernel: [24546.192210]  ? rest_init+0x77/0x80
  May 10 22:44:56 dev2 kernel: [24546.192211]  ? start_kernel+0x464/0x485
  May 10 22:44:56 dev2 kernel: [24546.192213]  ? early_idt_handler_array+0x120/0x120
  May 10 22:44:56 dev2 kernel: [24546.192214]  ? x86_64_start_reservations+0x24/0x26
  May 10 22:44:56 dev2 kernel: [24546.192215]  ? x86_64_start_kernel+0x14d/0x170
  May 10 22:44:56 dev2 kernel: [24546.192217]  ? start_cpu+0x14/0x14
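
  To compare stall reports like the two above across many log files, a small
  parser can help. The following is a minimal sketch, not an official tool:
  the regular expressions are assumptions derived only from the excerpt above
  (the message format may differ on other kernel versions), and the function
  name `parse_stalls` and the result keys are illustrative.

```python
import re

# Patterns inferred from the log excerpt above; treat them as assumptions.
STALL_RE = re.compile(r"rcu_sched detected stalls")
CPU_RE = re.compile(r"\]\s+(\d+)-\.\.\.: \((\d+) GPs behind\)")
DETECT_RE = re.compile(r"\(detected by (\d+), t=(\d+)")


def parse_stalls(lines):
    """Return one dict per rcu_sched stall report found in the given lines."""
    stalls = []
    current = None
    for line in lines:
        if STALL_RE.search(line):
            # A new report starts at each "detected stalls" header line.
            current = {"stalled_cpus": [], "detected_by": None, "t_jiffies": None}
            stalls.append(current)
            continue
        if current is None:
            continue
        match = CPU_RE.search(line)
        if match:
            current["stalled_cpus"].append(int(match.group(1)))
            continue
        match = DETECT_RE.search(line)
        if match:
            current["detected_by"] = int(match.group(1))
            current["t_jiffies"] = int(match.group(2))
    return stalls


if __name__ == "__main__":
    # /var/log/kern.log is the usual location on Ubuntu; adjust as needed.
    with open("/var/log/kern.log", errors="replace") as log:
        for report in parse_stalls(log):
            print(report)
```

  Applied to the excerpt above, this would report CPU 0 as the stalled CPU in
  both entries, detected by CPU 12 and then by CPU 14.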

  Depending on the kernel version, we also get NMI watchdog errors reporting a stuck CPU (the CPU core ID they mention varies at random).
  The crash happens at random, but generally after a few hours (3-4 h).

  We installed kernel 4.11.0-041100-generic #201705041534 this morning and are now waiting for a crash.
  For now, the machine is not in use; at least, it is not under CPU load.

  
  Thanks
  --- 
  ApportVersion: 2.20.4-0ubuntu4
  Architecture: amd64
  DistroRelease: Ubuntu 17.04
  InstallationDate: Installed on 2017-05-09 (1 days ago)
  InstallationMedia: Ubuntu-Server 17.04 "Zesty Zapus" - Release amd64 (20170412)
  Package: linux (not installed)
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=fr_FR.UTF-8
   SHELL=/bin/bash
  Tags:  zesty
  Uname: Linux 4.11.0-041100-generic x86_64
  UnreportableReason: The running kernel is not an Ubuntu kernel
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1690085/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp