This may be related to the EFI BIOS and to the DPM / powerplay code in
the amdgpu.ko kernel driver.

The dmesg.txt attached to this ticket seems to confirm this.

The problem is not purely software; it is overheating, i.e. hardware
related. It involves the AMI EFI BIOS code plus the AMD GPU SBIOS
powerplay voltage tables.

See this thread.
https://forum-en.msi.com/index.php?topic=298468.0

Here is an email I just sent to MSI support about this saga. They are
trying to replicate this defect, because it's no good when fans either
don't spin up on some cards or stay locked at 1000 rpm out of 3000 rpm.
The result is the card crashing, corruption, and other symptoms that look
like software issues. This user needs to monitor his fan speed and
temperature when using the GPU. I have emails about how to do that as
well if he wants.
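
As a rough illustration of that kind of monitoring, here is a minimal
Python sketch that reads the standard hwmon files for a GPU. It assumes
the card is card0 and that the driver exposes temp1_input, pwm1 and
fan1_input under its hwmon directory; some kernels do not provide
fan1_input for amdgpu, in which case the value comes back as None. The
function names are mine, not from any tool mentioned here.

```python
from pathlib import Path
from typing import Optional


def read_int(path: Path) -> Optional[int]:
    """Return the integer stored in a sysfs-style file, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def read_gpu_sensors(hwmon_dir: Path) -> dict:
    """Collect fan and temperature readings from one hwmon directory."""
    # hwmon reports temperatures in millidegrees Celsius.
    temp_milli_c = read_int(hwmon_dir / "temp1_input")
    return {
        "fan_rpm": read_int(hwmon_dir / "fan1_input"),  # may be missing
        "pwm": read_int(hwmon_dir / "pwm1"),            # duty cycle, 0-255
        "temp_c": None if temp_milli_c is None else temp_milli_c / 1000.0,
    }


if __name__ == "__main__":
    # Assumption: card0 is the AMD GPU; change the path if it is not.
    base = Path("/sys/class/drm/card0/device/hwmon")
    for hwmon in sorted(base.glob("hwmon*")):
        print(hwmon.name, read_gpu_sensors(hwmon))
```

Running it in a loop while the GPU is under load should show whether the
fan reading follows the temperature or stays flat.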


I'm using sabayon.org Linux out of the box to reproduce the error. It is
the latest version, I think; 18.02 or 17.xx from memory.

╠      @@ Package: sys-kernel/linux-sabayon-4.14.12-r1 branch: 5, [sabayon-weekly]
╠          Available:     version: 4.14.12-r1 ~ tag: NoTag ~ revision: 0
╠          Installed:     version: 4.14.12-r1 ~ tag: NoTag ~ revision: 0
╠          Slot:          4.14
╠          Homepage:      https://github.com/Sabayon/kernel
╠          Description:   Official Sabayon Linux Standard
╠                         kernel image

But I'm moving to pure gentoo.org with everything custom compiled. Right
now, whilst I'm transitioning, I'm using an out-of-the-box kernel.

All that matters is that if upstream Linux is broken, then many Linux
distributions are broken by default. And this amdgpu driver not handling
modified voltage tables is going to kill cards running Linux.


Ubuntu's kernel may already carry a custom patch that fixes this, which
would explain why you can't replicate it now. Let me look at the sources
and patches for the Ubuntu kernel now.

https://bugs.freedesktop.org/show_bug.cgi?id=100443

See the original post I put up, that's other people getting it.

Someone else on the internet with the bug in Ubuntu:
http://www.cadalyst.com/%5Blevel-1-with-primary-path%5D/rx-850-ubuntu-driver-problems-34295
 

"
RX 580 Ubuntu Driver Problems
Wed, 05/31/2017 - 00:40 — Anonymous

I just got a pair of XFX RX 580 card and am having some trouble with the
amdgpu-pro drivers.

On a fresh install of Ubuntu 16.04.2, after updating packages, updating
the kernel to 4.8, installing the AMDGPU-PRO 17.10 drivers, and
rebooting, I see these messages in the log

 "
Maybe it's been patched in more recent Ubuntu versions.

The goal is to work around the bug in firmware so people who run older
or unpatched Linux kernels don't get hit. Anything that mitigates this
overheating issue, either in the Linux kernel or in firmware, is a good
move.

He used 16.04.2 without a kernel update from the CD.

But Ubuntu 16.04.2 has other problems....
znmeb commented on 2017-12-04 23:05

This software doesn't even work on Ubuntu 16.04.3 LTS, which AMD
supposedly supports!! See
http://support.amd.com/en-us/kb-articles/Pages/AMDGPU-PRO-Driver-Compatibility-Advisory-with-Ubuntu-16.04.2-and-16.04.3.aspx

I couldn't even make it work on 16.04.2. I've pretty much given up on AMD.

But it could be that. Maybe they are spinning the powerplay issue as
something else. Remember, AMDGPU-Pro isn't recommended for most of the
AMD GPU chipsets you sell. The "all open" amdgpu stack they also bundle
with Ubuntu is just their own unmodified, compiled build of the pure
open-source amdgpu.ko kernel driver, linux-firmware for Polaris 10/11,
Mesa OpenGL, VA-API, etc.


You want to do your testing on VANILLA Linux kernels, i.e. learn how to
compile a vanilla kernel in Ubuntu, or use vanilla kernel source. That's
what all the other non-Ubuntu Linux distributions are using.

They have ready pre-cooked ones for you here:

https://wiki.ubuntu.com/Kernel/MainlineBuilds

Does it break with any of the < 4.14 Linux kernels from there? Maybe
Ubuntu's patched it already. I'll have a quick look on their bug tracker
to see if I can find it.....

<10 mins later.... >

Hey... I just found someone else yet again whose video card was
crashing, who was running Ubuntu, and who had no fan control, using
the Ubuntu bug database. I told you it's a global problem. All these
errors with video cards crashing at high resolution, in games, in videos
etc. are OVERHEATING issues being reported to Linux bug trackers as
"software issues". It's a mess...

https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/1740484

Look at the dmesg.txt attached to that bug... it has the error message I
kept raving about, regarding the powerplay voltage tables...
https://launchpadlibrarian.net/351532328/CurrentDmesg.txt
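
For anyone who wants to check a saved dmesg for the same thing, a
minimal Python sketch follows. It only filters lines by keyword; the
keyword "powerplay" is my assumption about what to look for, and the
exact wording of the kernel message is not reproduced here.

```python
import sys


def matching_lines(dmesg_text: str, keywords=("powerplay",)) -> list:
    """Return the dmesg lines that mention any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [
        line
        for line in dmesg_text.splitlines()
        if any(k in line.lower() for k in wanted)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 this_script.py CurrentDmesg.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for line in matching_lines(fh.read()):
            print(line)
```

Passing keywords=("powerplay", "amdgpu") widens the filter to every
message from the driver.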

He's also got EFI firmware from AMI like you guys use, but not an MSI
card. I think it's AMI EFI firmware + messed-up voltage tables + Linux
kernel = dead powerplay and no fan control.

That's me searching the Ubuntu bug database for 10 minutes, guys, so you
know it's real. Try using an older motherboard, the same as mine, with
an OLDER AMI EFI BIOS version. The EFI version is printed right at the
top of the dmesg. Details on my motherboard etc. are in the original
post.

[    0.000000] efi: EFI v2.60 by American Megatrends
[    0.000000] efi:  ACPI 2.0=0xd48df000  ACPI=0xd48df000  SMBIOS=0xdb92f000  SMBIOS 3.0=0xdb92e000  ESRT=0xd815b998 
[    0.000000] random: fast init done
[    0.000000] SMBIOS 3.0.0 present.
[    0.000000] DMI: System manufacturer System Product Name/PRIME B350M-A, BIOS 3401 12/04/2017
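
To compare firmware versions across several reports, the "efi:" and
"DMI:" lines in the excerpt above can be pulled out of a dmesg
automatically. This is a minimal Python sketch; the regular expressions
are my own and assume each kernel message sits on a single, unwrapped
line in the format shown above.

```python
import re

# Patterns modelled on the two firmware lines quoted above.
EFI_RE = re.compile(r"efi: EFI v(?P<version>[\d.]+) by (?P<vendor>.+?)\s*$")
DMI_RE = re.compile(
    r"DMI: (?P<system>.+), BIOS (?P<bios>\S+) (?P<date>\d{2}/\d{2}/\d{4})\s*$"
)


def firmware_info(dmesg_text: str) -> dict:
    """Extract EFI and BIOS details from dmesg text, where present."""
    info = {}
    for line in dmesg_text.splitlines():
        efi = EFI_RE.search(line)
        if efi:
            info["efi_version"] = efi.group("version")
            info["efi_vendor"] = efi.group("vendor")
        dmi = DMI_RE.search(line)
        if dmi:
            info["system"] = dmi.group("system")
            info["bios_version"] = dmi.group("bios")
            info["bios_date"] = dmi.group("date")
    return info
```

Calling firmware_info() on the text of each attached dmesg gives a small
dictionary per report, which makes it easier to spot whether the
affected machines share an EFI vendor or BIOS version.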

Finding exactly what is required to replicate this at your end will be
useful.

Just search amdgpu in the ubuntu bug ticket to see other users who are
suffering.

https://bugs.launchpad.net/ubuntu?field.searchtext=amdgpu&search=Search&field.status%3Alist=NEW&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.assignee=&field.bug_reporter=&field.omit_dupes=on&field.has_patch=&field.has_no_package=

Cheers,

Luke


    Wednesday, 24 January 2018, 16:10 +07:00 from MSI OCSS <msih...@msi.com>:

    Ticket:
    MSI/AMI EFI-ACPI firmware & MSI GPU SBIOS/ATOMBIOS break AMD Power Play in
    Stable linux - request new firmware builds for me (and others)

    Content:
    Hi, Sir,
    As you know, there are two install files in the AMD Linux driver package,
    'amdgpu-pro-install' and 'amdgpu-install'. 'amdgpu-pro-install' is for
    Radeon Pro GPUs, which are for use in workstations, and 'amdgpu-install'
    is for all other products.
    We are endeavoring to reproduce the fan issue with Linux; we have tried
    it with Ubuntu, but the fan works fine without any settings.
    To gather more information: as you said before, was the issue fixed when
    amdgpu.dc=1 was enabled? Thanks.
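
To answer that question on a given machine, one can check whether
amdgpu.dc was actually passed at boot. The Python sketch below parses a
kernel command line string; the helper name is hypothetical. It does not
see options set through modprobe configuration, so treat a missing value
as "not set on the command line" only.

```python
from typing import Optional


def kernel_param(cmdline: str, name: str) -> Optional[str]:
    """Return the value of `name` on a kernel command line, or None if absent.

    A bare flag without '=' yields an empty string. If the parameter is
    repeated, the last occurrence is returned.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name:
            value = val if sep else ""
    return value
```

On a running system the input would be the contents of /proc/cmdline,
for example kernel_param(open("/proc/cmdline").read(), "amdgpu.dc").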


** Bug watch added: freedesktop.org Bugzilla #100443
   https://bugs.freedesktop.org/show_bug.cgi?id=100443

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to xorg in Ubuntu.
https://bugs.launchpad.net/bugs/1740484

Title:
  video corruption with amd RX560 and 2k display

Status in xorg package in Ubuntu:
  New

Bug description:
  I have a new PC. Freshly installed ubuntu 17.10

  I get a lot of video corruption on my 2k display. It usually shows up
  as bands of the background image or gnome-shell status bar flickering
  across the display at different vertical positions. Everything works
  fine if I select 'Gnome under X.Org' from the login screen instead of
  wayland.

  GNOME 3.26.2, Linux 4.13, Radeon RX560 using amdgpu, Lenovo l24q-10
  monitor running at 2k (2560x1440@ 59.95Hz).

  There is a Fedora Bug with same info:
  https://bugzilla.redhat.com/show_bug.cgi?id=1417778 with a video
  showing the problem in there.

  ProblemType: Bug
  DistroRelease: Ubuntu 17.10
  Package: xorg 1:7.7+19ubuntu3
  ProcVersionSignature: Ubuntu 4.13.0-21.24-generic 4.13.13
  Uname: Linux 4.13.0-21-generic x86_64
  NonfreeKernelModules: wl
  ApportVersion: 2.20.7-0ubuntu3.6
  Architecture: amd64
  CompositorRunning: None
  CurrentDesktop: ubuntu:GNOME
  Date: Fri Dec 29 11:06:18 2017
  DistUpgraded: Fresh install
  DistroCodename: artful
  DistroVariant: ubuntu
  DkmsStatus: bcmwl, 6.30.223.271+bdcom, 4.13.0-21-generic, x86_64: installed
  ExtraDebuggingInterest: No
  GraphicsCard:
   Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Polaris11] [1002:67ff] (rev cf) (prog-if 00 [VGA controller])
     Subsystem: Sapphire Technology Limited Baffin [Radeon RX 560] [1da2:e348]
  MachineType: System manufacturer System Product Name
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=es_ES.UTF-8
   SHELL=/bin/bash
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.13.0-21-generic.efi.signed root=UUID=2452d0da-14e9-4f88-9660-5a86aee79ff4 ro quiet splash vt.handoff=7
  SourcePackage: xorg
  Symptom: display
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 12/04/2017
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 3401
  dmi.board.asset.tag: Default string
  dmi.board.name: PRIME B350M-A
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: Rev X.0x
  dmi.chassis.asset.tag: Default string
  dmi.chassis.type: 3
  dmi.chassis.vendor: Default string
  dmi.chassis.version: Default string
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3401:bd12/04/2017:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnPRIMEB350M-A:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:
  dmi.product.family: To be filled by O.E.M.
  dmi.product.name: System Product Name
  dmi.product.version: System Version
  dmi.sys.vendor: System manufacturer
  version.compiz: compiz N/A
  version.libdrm2: libdrm2 2.4.83-1
  version.libgl1-mesa-dri: libgl1-mesa-dri 17.2.2-0ubuntu1
  version.libgl1-mesa-glx: libgl1-mesa-glx 17.2.2-0ubuntu1
  version.xserver-xorg-core: xserver-xorg-core 2:1.19.5-0ubuntu2
  version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
  version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.10.0-1
  version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20170309-0ubuntu1
  version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.15-2
