Re: General protection fault: RIP: 0010:free_block+0xdc/0x1f0

2020-09-15 Thread Dave Airlie
cc'ing some more people.

On Tue, 15 Sep 2020 at 23:07, Paul Menzel wrote:
>
> Dear Andrew folks, dear Linux folks,
>
>
> With Linux 5.9-rc4 on a Dell OptiPlex 5080 with Intel Core i7-10700 CPU
> @ 2.90GHz, and external
>
>  01:00.0 VGA compatible controller [0300]: Advanced Micro Devices,
> Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] (rev 87)
>
> running graphically demanding applications glmark2 [1] and the Phoronix
> Test Suite [2] benchmark *pts/desktop-graphics* [3]
>
>  $ git describe --tags
>  v10.0.0m1-13-g0b5ddc3c0
>
> I got three general protection faults; each time the system either
> restarted or froze (input devices stopped working, the screen froze, and
> even the network card stopped answering pings).
>
> Here the system restarted itself:
>
> > kernel: general protection fault, probably for non-canonical address 
> > 0xdead0100:  [#1] SMP NOPTI
> > kernel: CPU: 2 PID: 9702 Comm: glmark2 Kdump: loaded Not tainted 
> > 5.9.0-rc4.mx64.343 #1
> > kernel: Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 08/17/2020
> > kernel: RIP: 0010:free_block+0xdc/0x1f0
>
> Here it froze:
>
> > [14639.665745] general protection fault, probably for non-canonical address 
> > 0xdead0100:  [#1] SMP NOPTI
> > [14639.675917] CPU: 15 PID: 23094 Comm: pvpython Kdump: loaded Not tainted 
> > 5.9.0-rc4.mx64.343 #1
> > [14639.684431] Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 
> > 08/17/2020
> > [14639.691823] RIP: 0010:free_block+0xdc/0x1f0
>
> Here it froze:
>
> > kernel: general protection fault, probably for non-canonical address 
> > 0xdead0100:  [#1] SMP NOPTI
> > kernel: CPU: 15 PID: 23094 Comm: pvpython Kdump: loaded Not tainted 
> > 5.9.0-rc4.mx64.343 #1
> > kernel: Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 08/17/2020
> > kernel: RIP: 0010:free_block+0xdc/0x1f0
>
> Running `scripts/decode_stacktrace.sh`:
>
> > linux-5.9_rc4-343.x86_64/source$ scripts/decode_stacktrace.sh vmlinux < 
> > optiplex-5080-linux-5.9-rc4-gp-pvpython.txt
> > [14528.718656] cgroup: fork rejected by pids controller in 
> > /user.slice/user-5272.slice/session-c6.scope
> > [14639.665745] general protection fault, probably for non-canonical address 
> > 0xdead0100:  [#1] SMP NOPTI
> > [14639.675917] CPU: 15 PID: 23094 Comm: pvpython Kdump: loaded Not tainted 
> > 5.9.0-rc4.mx64.343 #1
> > [14639.684431] Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 
> > 08/17/2020
> > [14639.691823] RIP: 0010:free_block (./include/linux/list.h:112 
> > ./include/linux/list.h:135 ./include/linux/list.h:146 mm/slab.c:3336)
> > [14639.696006] Code: 00 48 01 d0 48 c1 e8 0c 48 c1 e0 06 4c 01 e8 48 8b 50 
> > 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 48 08 48 8b 50 10 4c 8d 78 08 
> > <48> 89 51 08 48 89 0a 4c 89 da 48 2b 50 28 4c 89 60 08 48 89 68 10
> > All code
> >
> >    0:   00 48 01                add    %cl,0x1(%rax)
> >    3:   d0 48 c1                rorb   -0x3f(%rax)
> >    6:   e8 0c 48 c1 e0          callq  0xe0c14817
> >    b:   06                      (bad)
> >    c:   4c 01 e8                add    %r13,%rax
> >    f:   48 8b 50 08             mov    0x8(%rax),%rdx
> >   13:   48 8d 4a ff             lea    -0x1(%rdx),%rcx
> >   17:   83 e2 01                and    $0x1,%edx
> >   1a:   48 0f 45 c1             cmovne %rcx,%rax
> >   1e:   48 8b 48 08             mov    0x8(%rax),%rcx
> >   22:   48 8b 50 10             mov    0x10(%rax),%rdx
> >   26:   4c 8d 78 08             lea    0x8(%rax),%r15
> >   2a:*  48 89 51 08             mov    %rdx,0x8(%rcx)   <-- trapping instruction
> >   2e:   48 89 0a                mov    %rcx,(%rdx)
> >   31:   4c 89 da                mov    %r11,%rdx
> >   34:   48 2b 50 28             sub    0x28(%rax),%rdx
> >   38:   4c 89 60 08             mov    %r12,0x8(%rax)
> >   3c:   48 89 68 10             mov    %rbp,0x10(%rax)
> >
> > Code starting with the faulting instruction
> > ===
> >    0:   48 89 51 08             mov    %rdx,0x8(%rcx)
> >    4:   48 89 0a                mov    %rcx,(%rdx)
> >    7:   4c 89 da                mov    %r11,%rdx
> >    a:   48 2b 50 28             sub    0x28(%rax),%rdx
> >    e:   4c 89 60 08             mov    %r12,0x8(%rax)
> >   12:   48 89 68 10             mov    %rbp,0x10(%rax)
> > [14639.714747] RSP: 0018:c9001c26fab8 EFLAGS: 00010046
> > [14639.719970] RAX: ea000d193600 RBX: 8000 RCX: 
> > dead0100
> > [14639.727099] RDX: dead0122 RSI: 88842d5f3ef0 RDI: 
> > 88842b440300
> > [14639.734225] RBP: dead0122 R08: c9001c26fb30 R09: 
> > 88842b441280
> > [14639.741351] R10: 000f R11: 8883464d80c0 R12: 
> > dead0100
> > [14639.748477] R13: ea00 R14: 88842d5f3ff0 R15: 
> > ea000d193608
> > [14639.755604] FS:  7fd3b7e8f040() GS:88842d5c() 
> > knlGS:
> > [14639.763692] CS:  0010 DS:  ES:  CR0: 80050033
> > [14639.7

General protection fault: RIP: 0010:free_block+0xdc/0x1f0

2020-09-15 Thread Paul Menzel

Dear Andrew folks, dear Linux folks,


With Linux 5.9-rc4 on a Dell OptiPlex 5080 with Intel Core i7-10700 CPU 
@ 2.90GHz, and external


01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, 
Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] (rev 87)


running graphically demanding applications glmark2 [1] and the Phoronix
Test Suite [2] benchmark *pts/desktop-graphics* [3]


$ git describe --tags
v10.0.0m1-13-g0b5ddc3c0

I got three general protection faults; each time the system either
restarted or froze (input devices stopped working, the screen froze, and
even the network card stopped answering pings).


Here the system restarted itself:


kernel: general protection fault, probably for non-canonical address 
0xdead0100:  [#1] SMP NOPTI
kernel: CPU: 2 PID: 9702 Comm: glmark2 Kdump: loaded Not tainted 
5.9.0-rc4.mx64.343 #1
kernel: Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 08/17/2020
kernel: RIP: 0010:free_block+0xdc/0x1f0


Here it froze:


[14639.665745] general protection fault, probably for non-canonical address 
0xdead0100:  [#1] SMP NOPTI
[14639.675917] CPU: 15 PID: 23094 Comm: pvpython Kdump: loaded Not tainted 
5.9.0-rc4.mx64.343 #1
[14639.684431] Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 
08/17/2020
[14639.691823] RIP: 0010:free_block+0xdc/0x1f0


Here it froze:


kernel: general protection fault, probably for non-canonical address 
0xdead0100:  [#1] SMP NOPTI
kernel: CPU: 15 PID: 23094 Comm: pvpython Kdump: loaded Not tainted 
5.9.0-rc4.mx64.343 #1
kernel: Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 08/17/2020
kernel: RIP: 0010:free_block+0xdc/0x1f0
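
A note on the address in the fault message: the minimal sketch below checks whether a 64-bit address is canonical on x86-64 and compares it with the kernel's list poison values. It rests on several assumptions that are not stated in the report: that the logged address was originally 0xdead000000000100 (the archived copy appears to have lost runs of zeros), that the machine uses 48-bit virtual addresses, and that `LIST_POISON1`/`LIST_POISON2` have the values given by include/linux/poison.h with the x86-64 default for `CONFIG_ILLEGAL_POINTER_VALUE`. The helper names `is_canonical` and `classify` are hypothetical.

```python
# Assumed x86-64 defaults (not taken from the report itself):
# CONFIG_ILLEGAL_POINTER_VALUE and the offsets in include/linux/poison.h.
POISON_POINTER_DELTA = 0xDEAD000000000000
LIST_POISON1 = POISON_POINTER_DELTA + 0x100  # stored in ->next by list_del()
LIST_POISON2 = POISON_POINTER_DELTA + 0x122  # stored in ->prev by list_del()


def is_canonical(addr, va_bits=48):
    """Return True if bits 63..(va_bits - 1) of addr are all equal."""
    upper = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper in (0, all_ones)


def classify(addr):
    """Name a list poison value, or fall back to the canonical check."""
    if addr == LIST_POISON1:
        return "LIST_POISON1"
    if addr == LIST_POISON2:
        return "LIST_POISON2"
    return "canonical" if is_canonical(addr) else "non-canonical"


# Hypothetical reconstruction of the logged fault address.
fault_addr = 0xDEAD000000000100
print(hex(fault_addr), is_canonical(fault_addr), classify(fault_addr))
```

If the assumption about the dropped zeros holds, RCX and RDX in the register dump further down would equal `LIST_POISON1` and `LIST_POISON2`, which are the values `list_del()` leaves in an entry it has already unlinked. That would be consistent with an entry being removed from a list twice, but it is only one reading of the trace and not a confirmed diagnosis.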


Running `scripts/decode_stacktrace.sh`:


linux-5.9_rc4-343.x86_64/source$ scripts/decode_stacktrace.sh vmlinux < 
optiplex-5080-linux-5.9-rc4-gp-pvpython.txt
[14528.718656] cgroup: fork rejected by pids controller in 
/user.slice/user-5272.slice/session-c6.scope
[14639.665745] general protection fault, probably for non-canonical address 
0xdead0100:  [#1] SMP NOPTI
[14639.675917] CPU: 15 PID: 23094 Comm: pvpython Kdump: loaded Not tainted 
5.9.0-rc4.mx64.343 #1
[14639.684431] Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 
08/17/2020
[14639.691823] RIP: 0010:free_block (./include/linux/list.h:112 ./include/linux/list.h:135 ./include/linux/list.h:146 mm/slab.c:3336) 
[14639.696006] Code: 00 48 01 d0 48 c1 e8 0c 48 c1 e0 06 4c 01 e8 48 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 48 08 48 8b 50 10 4c 8d 78 08 <48> 89 51 08 48 89 0a 4c 89 da 48 2b 50 28 4c 89 60 08 48 89 68 10

All code

   0:   00 48 01                add    %cl,0x1(%rax)
   3:   d0 48 c1                rorb   -0x3f(%rax)
   6:   e8 0c 48 c1 e0          callq  0xe0c14817
   b:   06                      (bad)
   c:   4c 01 e8                add    %r13,%rax
   f:   48 8b 50 08             mov    0x8(%rax),%rdx
  13:   48 8d 4a ff             lea    -0x1(%rdx),%rcx
  17:   83 e2 01                and    $0x1,%edx
  1a:   48 0f 45 c1             cmovne %rcx,%rax
  1e:   48 8b 48 08             mov    0x8(%rax),%rcx
  22:   48 8b 50 10             mov    0x10(%rax),%rdx
  26:   4c 8d 78 08             lea    0x8(%rax),%r15
  2a:*  48 89 51 08             mov    %rdx,0x8(%rcx)   <-- trapping instruction
  2e:   48 89 0a                mov    %rcx,(%rdx)
  31:   4c 89 da                mov    %r11,%rdx
  34:   48 2b 50 28             sub    0x28(%rax),%rdx
  38:   4c 89 60 08             mov    %r12,0x8(%rax)
  3c:   48 89 68 10             mov    %rbp,0x10(%rax)

Code starting with the faulting instruction
===
   0:   48 89 51 08             mov    %rdx,0x8(%rcx)
   4:   48 89 0a                mov    %rcx,(%rdx)
   7:   4c 89 da                mov    %r11,%rdx
   a:   48 2b 50 28             sub    0x28(%rax),%rdx
   e:   4c 89 60 08             mov    %r12,0x8(%rax)
  12:   48 89 68 10             mov    %rbp,0x10(%rax)
[14639.714747] RSP: 0018:c9001c26fab8 EFLAGS: 00010046
[14639.719970] RAX: ea000d193600 RBX: 8000 RCX: dead0100
[14639.727099] RDX: dead0122 RSI: 88842d5f3ef0 RDI: 88842b440300
[14639.734225] RBP: dead0122 R08: c9001c26fb30 R09: 88842b441280
[14639.741351] R10: 000f R11: 8883464d80c0 R12: dead0100
[14639.748477] R13: ea00 R14: 88842d5f3ff0 R15: ea000d193608
[14639.755604] FS:  7fd3b7e8f040() GS:88842d5c() 
knlGS:
[14639.763692] CS:  0010 DS:  ES:  CR0: 80050033
[14639.769430] CR2: 7fd344233548 CR3: 0002f46aa003 CR4: 007706e0
[14639.776556] PKRU: 5554
[14639.779265] Call Trace:
[14639.781717] ___cache_free (mm/slab.c:3389 mm/slab.c:3455) 
[14639.785463] kfree (./arch/x86/include/asm/irqflags.h:41 ./arch/x86/include/asm/irqflags.h:84 mm/slab.c:3757) 
[14639.788432] kmem_freepages (mm/slab.h:266 mm/slab.h:437 mm/slab.c:1406) 
[14639.792093]
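
The `RIP:` value and the `Code:` line in the trace above can be cross-checked by hand. The sketch below is only an illustration and is not part of `scripts/decode_stacktrace.sh`. It splits `free_block+0xdc/0x1f0` into symbol, offset and function size, and counts the bytes in front of the `<..>` marker in the `Code:` line to find where the trapping instruction starts within the dump. The function names `parse_rip` and `trapping_offset` are hypothetical.

```python
import re

# The Code: line copied from the trace, without the timestamp prefix.
CODE = (
    "00 48 01 d0 48 c1 e8 0c 48 c1 e0 06 4c 01 e8 48 8b 50 08 48 8d 4a ff "
    "83 e2 01 48 0f 45 c1 48 8b 48 08 48 8b 50 10 4c 8d 78 08 <48> 89 51 08 "
    "48 89 0a 4c 89 da 48 2b 50 28 4c 89 60 08 48 89 68 10"
)


def parse_rip(text):
    """Split 'symbol+0xOFF/0xSIZE' into (symbol, offset, size)."""
    m = re.fullmatch(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text)
    if m is None:
        raise ValueError(f"unexpected RIP format: {text!r}")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


def trapping_offset(code_line):
    """Return (bytes before the <..> marker, value of the marked byte)."""
    for index, token in enumerate(code_line.split()):
        if token.startswith("<") and token.endswith(">"):
            return index, int(token[1:-1], 16)
    raise ValueError("no <..> marker found in Code: line")


symbol, offset, size = parse_rip("free_block+0xdc/0x1f0")
print(f"{symbol}: fault at byte {offset} of {size}")

before, first_byte = trapping_offset(CODE)
print(f"trapping instruction starts at dump offset {before:#x}, "
      f"first byte {first_byte:#04x}")
```

The marker sits 42 (0x2a) bytes into the dump, which agrees with the `2a:*` line in the disassembly. The instructions decoded before that point may be misaligned, because the dump starts at a fixed distance before RIP rather than on an instruction boundary.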