On Wed, 5 Feb 2020 22:31:52 -0700
Aaron Bieber <aa...@bolddaemon.com> wrote:

> On Wed, 05 Feb 2020 at 20:29:31 +0100, William Orr wrote:
> > 
> > Hey,
> > 
> > On a recent snap (04/02/2020), the unpriv'ed process of Xorg seems to hang,
> > becoming totally unresponsive. Running `ktrace` on the process fails to log
> > any output. `top` shows that the process is waiting on `fsleep`. I'm using
> > the amdgpu driver.
> 
> Similar issue here. It seems to happen randomly (possibly more often under
> high memory usage). It's always after X has started and I have been using it
> for some time (days sometimes).
> 
> MPD will continue to play music in the background, and pressing the power
> button for a few seconds seems to result in a shutdown; however, it doesn't
> quite shut down properly. The screen will go blank and the fans will start to
> spin at full speed, at which point holding the power button seems to be the
> only fix.

I studied my problem with startxfce4 (where Xorg gets stuck and I use
Ctrl+Alt+BackSpace to reset Xorg), but that is a different bug,
not an amdgpu glitch.

Today, I froze Xorg in a different way.  I was stressing supertuxkart
on my amdgpu machine by playing at full screen (1920x1080), graphics
setting 6, and 19 AI karts.  This sometimes causes a visual glitch
where objects in the game either disappear or cast large black shadows.
There is a LOADING screen before each race.  The LOADING screen seems
to decide the number of glitches in each race: none, few, or many.
If I reload the track, I may have more or fewer glitches.

Today, my last race got stuck at the LOADING screen.  Xorg stopped
responding to the keyboard: Ctrl+Alt+F4 (to switch virtual console)
didn't work.  The system was still alive: ping(8) and ssh(1) continued
to work (from a second computer to the amdgpu machine).  In the ssh(1)
session, top(1) showed one thread of supertuxkart being consistently
"onproc" even though the machine was mostly idle.  I became root and
attached egdb (from package gdb-7.12.1p9) to supertuxkart.

The thread seemed to be stuck in DRM_IOCTL_AMDGPU_WAIT_CS, called from
/usr/xenocara/lib/libdrm/amdgpu/amdgpu_cs.c; this appears to call
/sys/dev/pci/drm/amd/amdgpu_cs.c amdgpu_cs_wait_ioctl().

I detached egdb, then told top(1) to kill supertuxkart.  The system
stopped answering ping(8), and top(1) froze.  In top(1), supertuxkart
had WAIT "drmweti" and Xorg had WAIT "dmafenc".  I forced a reboot.

The rest of this mail is a backtrace of one thread of
supertuxkart-0.9.3p0 (copy from photo, so beware of typos).  --George

(gdb) bt
#0  ioctl () at -:3
#1  0x000006d86059e3c0 in drmIoctl () from /usr/X11R6/lib/libdrm.so.7.8
#2  0x000006d941e83739 in amdgpu_cs_query_fence_status () from
        /usr/X11R6/lib/libdrm_amdgpu.so.1.9
#3  0x000006d8f800e951 in amdgpu_fence_wait () from
        /usr/X11R6/lib/modules/dri/radeon_dri.so
#4  0x000006d8f7f448a6 in si_fence_finish () from
        /usr/X11R6/lib/modules/dri/radeon_dri.so
#5  0x000006d8f79f04d3 in st_client_wait_sync () from
        /usr/X11R6/lib/modules/dri/radeonsi_dri.so
#6  0x000006d8f793136e in _mesa_ClientWaitSync () from
        /usr/X11R6/lib/modules/dri/radeonsi_dri.so
#7  0x000006d65d7c48d7 in DrawCalls::prepareDrawCalls(ShadowMatrices&,
        irr::scene::ICameraSceneNode
#8  0x000006d65d887aee in ShaderBasedRenderer::renderScene(
        irr::scene::ICameraSceneNode*, float, bool, bool) ()
#9  0x000006d65d88a5c3 in ShaderBasedRenderer::render(float) ()
#10 0x000006d65d8068ed in IrrDriver::update(float) ()
#11 0x000006d65d9eaa0d in MainLoop::run() ()
#12 0x000006d65d9e74d0 in main ()
(gdb) info registers
rax     0x36            54
rbx     0x16e2a71b28d7  25162721994967
rcx     0x6d88cf49a3a   7527147543098
rdx     0x7f7ffffdc508  140187732395272
rsi     0xc0206449      3223348297      # DRM_IOCTL_AMDGPU_WAIT_CS
rdi     0x8             8
rbp     0x7f7ffffdc4e0  0x7f7ffffdc4e0
rsp     0x7f7ffffdc4b8  0x7f7ffffdc4b8
r8      0x6d88cf85cf8   7527147789560
r9      0x0             0
r10     0x0             0
r11     0x246           582
r12     0x8             8
r13     0x16e2a71b28d7  25162721994967
r14     0x7f7ffffdc508  140187732395272
r15     0xc0206449      3223348297
rip     0x6d88cf49a3a   0x6d88cf49a3a <ioctl+10>
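
As a cross-check on the rsi annotation in the register dump, the request
value 0xc0206449 can be split into its fields by hand.  The sketch below is
not part of the original debugging session.  It assumes the BSD ioctl
encoding from <sys/ioccom.h> (direction in the top three bits, a 13-bit
parameter length, a group byte, a command byte) and the DRM constants
DRM_COMMAND_BASE = 0x40 and DRM_AMDGPU_WAIT_CS = 0x09 from the DRM headers;
verify both against the headers on the affected system.  decode_ioctl() is a
hypothetical helper name.

```python
# Assumed BSD-style ioctl request layout (see <sys/ioccom.h>):
#   bits 29-31 direction, bits 16-28 parameter length,
#   bits 8-15 group character, bits 0-7 command number.
IOC_VOID = 0x20000000
IOC_OUT = 0x40000000
IOC_IN = 0x80000000
IOC_DIRMASK = 0xE0000000
IOCPARM_MASK = 0x1FFF

# Assumed values from the DRM headers (drm.h, amdgpu_drm.h).
DRM_COMMAND_BASE = 0x40
DRM_AMDGPU_WAIT_CS = 0x09

DIRECTION_NAMES = {
    IOC_VOID: "void",
    IOC_OUT: "out",
    IOC_IN: "in",
    IOC_IN | IOC_OUT: "inout",
}


def decode_ioctl(request):
    """Split an ioctl request number into its encoded fields."""
    direction = request & IOC_DIRMASK
    return {
        "direction": DIRECTION_NAMES.get(direction, hex(direction)),
        "length": (request >> 16) & IOCPARM_MASK,  # argument size in bytes
        "group": chr((request >> 8) & 0xFF),       # 'd' is the DRM group
        "number": request & 0xFF,                  # driver command number
    }


fields = decode_ioctl(0xC0206449)  # value of rsi (and r15) in the dump
print(fields)
# Driver-private commands start at DRM_COMMAND_BASE.
print(fields["number"] - DRM_COMMAND_BASE == DRM_AMDGPU_WAIT_CS)
```

Under those assumptions the value decodes to a read/write request in group
'd' with a 32-byte argument and command 0x49, which is DRM_COMMAND_BASE plus
DRM_AMDGPU_WAIT_CS, consistent with the thread being inside the WAIT_CS
ioctl.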
