Linux 4.8-rc1 Cirrus QEMU crashes on boot.
Linus, Eric, On Mon, 8 Aug 2016 14:19:23 -0700 Linus Torvalds wrote: > On Mon, Aug 8, 2016 at 1:24 PM, Eric W. Biederman > wrote: > > > > Booting up v4.8-rc1 in qemu today I ran I ran into this beautiful oops. > > > > I am just about to head out the door on vacation until the end of the > > month so hopefully I am tossing out enough information to the right > > people. > > > > I have confirmed that disabling CONFIG_DRM_CIRRUS_QEMU avoids this > > crash. > > > > [0.720167] BUG: unable to handle kernel NULL pointer dereference at > > 0018 > > [0.721348] IP: [] drm_pick_crtcs+0xfc/0x270 > > [0.722249] PGD 0 > > [0.722659] Oops: [#1] SMP > > [0.723141] Modules linked in: > > [0.723643] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1+ #116 > > [0.724602] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 > > [0.725450] task: 8800bbf28000 task.stack: 8800bbf3 > > [0.726327] RIP: 0010:[] [] > > drm_pick_crtcs+0xfc/0x270 > > [0.727648] RSP: :8800bbf33978 EFLAGS: 00010217 > > [0.728444] RAX: 8d623a20 RBX: RCX: > > 8800baf3f860 > > [0.729498] RDX: RSI: 1000 RDI: > > 8800baf3f800 > > [0.730626] RBP: 8800bbf339f8 R08: R09: > > 0001 > > [0.731704] R10: 0001 R11: R12: > > 8800bbaf3af0 > > [0.732827] R13: 8800bbaf3af8 R14: R15: > > 8800baf3fc00 > > [0.733863] FS: () GS:8800bf80() > > knlGS: > > [0.735106] CS: 0010 DS: ES: CR0: 80050033 > > [0.735964] CR2: 0018 CR3: 3b219000 CR4: > > 06f0 > > [0.737021] Stack: > > [0.737342] 8800bbf33998 0246 8800bbaf3b18 > > 8800bbaf3b00 > > [0.738602] 1001 8800bbaf3b00 > > 8800baf3f800 > > [0.739807] 00031000 8800bbaf3b18 8800bbf339e8 > > 8800baf3fc00 > > [0.740993] Call Trace: > > [0.741393] [] drm_setup_crtcs+0x3bd/0xb50 > > [0.742252] [] ? mark_held_locks+0x71/0x90 > > [0.743176] [] ? __mutex_unlock_slowpath+0xeb/0x1c0 > > [0.744138] [] drm_fb_helper_initial_config+0x81/0x3c0 > > [0.745166] [] ? drm_modeset_unlock_all+0x54/0x80 > > [0.746136] [] cirrus_fbdev_init+0xa0/0xb0 > > [0.747033] [] cirrus_modeset_init+0x18b/0x1e0 > > .. snip snip .. > > [0.771023] Code: e8 ba e1 ff ff 48 8b 55 b8 48 83 f8 01 83 5d c4 ff 48 > > 8b 7d b8 48 8b 82 98 02 00 00 49 8b 57 08 48 8b 40 10 48 8b 92 00 0a 00 00 > > <48> 83 7a 18 00 74 09 48 85 c0 0f 84 53 01 00 00 ff d0 49 89 c3 > > This decodes to > >0: e8 ba e1 ff ff callq ... >5: 48 8b 55 b8 mov-0x48(%rbp),%rdx >9: 48 83 f8 01 cmp$0x1,%rax >d: 83 5d c4 ff sbbl $0x,-0x3c(%rbp) > 11: 48 8b 7d b8 mov-0x48(%rbp),%rdi > 15: 48 8b 82 98 02 00 00 mov0x298(%rdx),%rax > 1c: 49 8b 57 08 mov0x8(%r15),%rdx > 20: 48 8b 40 10 mov0x10(%rax),%rax > 24: 48 8b 92 00 0a 00 00 mov0xa00(%rdx),%rdx > 2b:* 48 83 7a 18 00 cmpq $0x0,0x18(%rdx) <-- trapping instruction > 30: 74 09 je 0x3b > 32: 48 85 c0 test %rax,%rax > 35: 0f 84 53 01 00 00 je 0x18e > 3b: ff d0 callq *%rax > 3d: 49 89 c3 mov%rax,%r11 > > > and trying to find matching code I think the oops is from this: > > if (fb_helper->dev->mode_config.funcs->atomic_commit && > !connector_funcs->best_encoder) > encoder = drm_atomic_helper_best_encoder(connector); > else > encoder = connector_funcs->best_encoder(connector); > > in particular, the trapping instruction is the load of "atomic_commit". > > (The "callq *%rax" seems to be the "connector_funcs->best_encoder()" call) > > So I *think* we have "fb_helper->dev->mode_config.funcs" being NULL. > > But I might have gotten the code matching wrong. I don't have the same > compiler as Eric, so even when using his config my code generation > doesn't really match. But there is only one indirect call in that > function, which is that ->best_encoder() call, so it does seem to > match up. > > Adding Boris Brezillon and Daniel Vetter to the cc, since assuming I'm > right, the main suspect is commit c61b93fe51b1 ("drm/atomic: Fix > remaining places where !funcs->best_encoder is valid"). > > Boris? Daniel? I think you're right, c61b93fe51b1 is likely to be the commit who introduced this bug. cirrus_driver_load() [1] assigns mode_config->funcs pointer to &cirrus_mode_funcs only after calling cirrus_modeset_init() (this function is taking care of initializing all DRM elements including the fbdev emulation layer), and commit c61b93fe51b1 introduced a test on mode_config-
Linux 4.8-rc1 Cirrus QEMU crashes on boot.
- Original Message - > From: "Linus Torvalds" > To: "Eric W. Biederman" , "Boris Brezillon" > , "Daniel > Vetter" > Cc: "Linux Kernel Mailing List" , "Dave > Airlie" , "David Airlie" > , "DRI" > Sent: Tuesday, 9 August, 2016 7:19:23 AM > Subject: Re: Linux 4.8-rc1 Cirrus QEMU crashes on boot. > > On Mon, Aug 8, 2016 at 1:24 PM, Eric W. Biederman > wrote: > > > > Booting up v4.8-rc1 in qemu today I ran I ran into this beautiful oops. > > > > I am just about to head out the door on vacation until the end of the > > month so hopefully I am tossing out enough information to the right > > people. > > > > I have confirmed that disabling CONFIG_DRM_CIRRUS_QEMU avoids this > > crash. > > > > [0.720167] BUG: unable to handle kernel NULL pointer dereference at > > 0018 > > [0.721348] IP: [] drm_pick_crtcs+0xfc/0x270 > > [0.722249] PGD 0 > > [0.722659] Oops: [#1] SMP > > [0.723141] Modules linked in: > > [0.723643] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1+ #116 > > [0.724602] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 > > [0.725450] task: 8800bbf28000 task.stack: 8800bbf3 > > [0.726327] RIP: 0010:[] [] > > drm_pick_crtcs+0xfc/0x270 > > [0.727648] RSP: :8800bbf33978 EFLAGS: 00010217 > > [0.728444] RAX: 8d623a20 RBX: RCX: > > 8800baf3f860 > > [0.729498] RDX: RSI: 1000 RDI: > > 8800baf3f800 > > [0.730626] RBP: 8800bbf339f8 R08: R09: > > 0001 > > [0.731704] R10: 0001 R11: R12: > > 8800bbaf3af0 > > [0.732827] R13: 8800bbaf3af8 R14: R15: > > 8800baf3fc00 > > [0.733863] FS: () GS:8800bf80() > > knlGS: > > [0.735106] CS: 0010 DS: ES: CR0: 80050033 > > [0.735964] CR2: 0018 CR3: 3b219000 CR4: > > 06f0 > > [0.737021] Stack: > > [0.737342] 8800bbf33998 0246 8800bbaf3b18 > > 8800bbaf3b00 > > [0.738602] 1001 8800bbaf3b00 > > 8800baf3f800 > > [0.739807] 00031000 8800bbaf3b18 8800bbf339e8 > > 8800baf3fc00 > > [0.740993] Call Trace: > > [0.741393] [] drm_setup_crtcs+0x3bd/0xb50 > > [0.742252] [] ? mark_held_locks+0x71/0x90 > > [0.743176] [] ? __mutex_unlock_slowpath+0xeb/0x1c0 > > [0.744138] [] > > drm_fb_helper_initial_config+0x81/0x3c0 > > [0.745166] [] ? drm_modeset_unlock_all+0x54/0x80 > > [0.746136] [] cirrus_fbdev_init+0xa0/0xb0 > > [0.747033] [] cirrus_modeset_init+0x18b/0x1e0 > > .. snip snip .. > > [0.771023] Code: e8 ba e1 ff ff 48 8b 55 b8 48 83 f8 01 83 5d c4 ff 48 > > 8b 7d b8 48 8b 82 98 02 00 00 49 8b 57 08 48 8b 40 10 48 8b 92 00 0a 00 00 > > <48> 83 7a 18 00 74 09 48 85 c0 0f 84 53 01 00 00 ff d0 49 89 c3 > > This decodes to > >0: e8 ba e1 ff ff callq ... >5: 48 8b 55 b8 mov-0x48(%rbp),%rdx >9: 48 83 f8 01 cmp$0x1,%rax >d: 83 5d c4 ff sbbl $0x,-0x3c(%rbp) > 11: 48 8b 7d b8 mov-0x48(%rbp),%rdi > 15: 48 8b 82 98 02 00 00 mov0x298(%rdx),%rax > 1c: 49 8b 57 08 mov0x8(%r15),%rdx > 20: 48 8b 40 10 mov0x10(%rax),%rax > 24: 48 8b 92 00 0a 00 00 mov0xa00(%rdx),%rdx > 2b:* 48 83 7a 18 00 cmpq $0x0,0x18(%rdx) <-- trapping instruction > 30: 74 09 je 0x3b > 32: 48 85 c0 test %rax,%rax > 35: 0f 84 53 01 00 00 je 0x18e > 3b: ff d0 callq *%rax > 3d: 49 89 c3 mov%rax,%r11 > > > and trying to find matching code I think the oops is from this: > > if (fb_helper->dev->mode_config.funcs->atomic_commit && > !connector_funcs->best_encoder) > encoder = drm_atomic_helper_best_encoder(connector); > else > encoder = connector_funcs->best_encoder(connector); > > in particular, the trapping instruction is the load of "atomic_commit". > > (The "callq *%rax" seems to be the "connector_funcs->best_encoder()" call) > > So I *think* we have "fb_helper->dev->mode_config.funcs" being NULL. > > But I
Linux 4.8-rc1 Cirrus QEMU crashes on boot.
On Mon, Aug 8, 2016 at 1:24 PM, Eric W. Biederman wrote: > > Booting up v4.8-rc1 in qemu today I ran I ran into this beautiful oops. > > I am just about to head out the door on vacation until the end of the > month so hopefully I am tossing out enough information to the right > people. > > I have confirmed that disabling CONFIG_DRM_CIRRUS_QEMU avoids this > crash. > > [0.720167] BUG: unable to handle kernel NULL pointer dereference at > 0018 > [0.721348] IP: [] drm_pick_crtcs+0xfc/0x270 > [0.722249] PGD 0 > [0.722659] Oops: [#1] SMP > [0.723141] Modules linked in: > [0.723643] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1+ #116 > [0.724602] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 > [0.725450] task: 8800bbf28000 task.stack: 8800bbf3 > [0.726327] RIP: 0010:[] [] > drm_pick_crtcs+0xfc/0x270 > [0.727648] RSP: :8800bbf33978 EFLAGS: 00010217 > [0.728444] RAX: 8d623a20 RBX: RCX: > 8800baf3f860 > [0.729498] RDX: RSI: 1000 RDI: > 8800baf3f800 > [0.730626] RBP: 8800bbf339f8 R08: R09: > 0001 > [0.731704] R10: 0001 R11: R12: > 8800bbaf3af0 > [0.732827] R13: 8800bbaf3af8 R14: R15: > 8800baf3fc00 > [0.733863] FS: () GS:8800bf80() > knlGS: > [0.735106] CS: 0010 DS: ES: CR0: 80050033 > [0.735964] CR2: 0018 CR3: 3b219000 CR4: > 06f0 > [0.737021] Stack: > [0.737342] 8800bbf33998 0246 8800bbaf3b18 > 8800bbaf3b00 > [0.738602] 1001 8800bbaf3b00 > 8800baf3f800 > [0.739807] 00031000 8800bbaf3b18 8800bbf339e8 > 8800baf3fc00 > [0.740993] Call Trace: > [0.741393] [] drm_setup_crtcs+0x3bd/0xb50 > [0.742252] [] ? mark_held_locks+0x71/0x90 > [0.743176] [] ? __mutex_unlock_slowpath+0xeb/0x1c0 > [0.744138] [] drm_fb_helper_initial_config+0x81/0x3c0 > [0.745166] [] ? drm_modeset_unlock_all+0x54/0x80 > [0.746136] [] cirrus_fbdev_init+0xa0/0xb0 > [0.747033] [] cirrus_modeset_init+0x18b/0x1e0 > .. snip snip .. > [0.771023] Code: e8 ba e1 ff ff 48 8b 55 b8 48 83 f8 01 83 5d c4 ff 48 8b > 7d b8 48 8b 82 98 02 00 00 49 8b 57 08 48 8b 40 10 48 8b 92 00 0a 00 00 <48> > 83 7a 18 00 74 09 48 85 c0 0f 84 53 01 00 00 ff d0 49 89 c3 This decodes to 0: e8 ba e1 ff ff callq ... 5: 48 8b 55 b8 mov-0x48(%rbp),%rdx 9: 48 83 f8 01 cmp$0x1,%rax d: 83 5d c4 ff sbbl $0x,-0x3c(%rbp) 11: 48 8b 7d b8 mov-0x48(%rbp),%rdi 15: 48 8b 82 98 02 00 00 mov0x298(%rdx),%rax 1c: 49 8b 57 08 mov0x8(%r15),%rdx 20: 48 8b 40 10 mov0x10(%rax),%rax 24: 48 8b 92 00 0a 00 00 mov0xa00(%rdx),%rdx 2b:* 48 83 7a 18 00 cmpq $0x0,0x18(%rdx) <-- trapping instruction 30: 74 09 je 0x3b 32: 48 85 c0 test %rax,%rax 35: 0f 84 53 01 00 00 je 0x18e 3b: ff d0 callq *%rax 3d: 49 89 c3 mov%rax,%r11 and trying to find matching code I think the oops is from this: if (fb_helper->dev->mode_config.funcs->atomic_commit && !connector_funcs->best_encoder) encoder = drm_atomic_helper_best_encoder(connector); else encoder = connector_funcs->best_encoder(connector); in particular, the trapping instruction is the load of "atomic_commit". (The "callq *%rax" seems to be the "connector_funcs->best_encoder()" call) So I *think* we have "fb_helper->dev->mode_config.funcs" being NULL. But I might have gotten the code matching wrong. I don't have the same compiler as Eric, so even when using his config my code generation doesn't really match. But there is only one indirect call in that function, which is that ->best_encoder() call, so it does seem to match up. Adding Boris Brezillon and Daniel Vetter to the cc, since assuming I'm right, the main suspect is commit c61b93fe51b1 ("drm/atomic: Fix remaining places where !funcs->best_encoder is valid"). Boris? Daniel? Linus