Re: amdgpu UBSAN warnings in 6.10.0-rc5

2024-07-01 Thread Alex Deucher
On Sun, Jun 30, 2024 at 8:40 AM Jeff Layton wrote:
>
> I've been testing some vfs patches (multigrain timestamps) on my
> personal desktop with a 6.10.0-rc5-ish kernel, and have hit a number of
> warnings in the amdgpu driver, including a UBSAN warning that looks
> like a potential array overrun:
>
> [8.772608] [ cut here ]
> [8.772609] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser2.c:680:23
> [8.772612] index 8 is out of range for type 'atom_gpio_pin_assignment [8]'
> [8.772614] CPU: 13 PID: 508 Comm: (udev-worker) Not tainted 6.10.0-rc5-00292-gb3efd5c27332 #35
> [8.772616] Hardware name: Micro-Star International Co., Ltd. MS-7E27/PRO B650M-P (MS-7E27), BIOS 1.A0 06/07/2024
> [8.772618] Call Trace:
> [8.772620]  <TASK>
> [8.772621]  dump_stack_lvl+0x5d/0x80
> [8.772629]  ubsan_epilogue+0x5/0x30
> [8.772633]  __ubsan_handle_out_of_bounds.cold+0x46/0x4b
> [8.772636]  bios_parser_get_gpio_pin_info+0x11c/0x150 [amdgpu]
> [8.773016]  link_get_hpd_gpio+0x7e/0xd0 [amdgpu]
> [8.773205]  construct_phy+0x26d/0xd40 [amdgpu]
> [8.773355]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.773370]  ? link_create+0x210/0x250 [amdgpu]
> [8.773493]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.773495]  link_create+0x210/0x250 [amdgpu]
> [8.773610]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.773612]  create_links+0x151/0x530 [amdgpu]
> [8.773759]  dc_create+0x401/0x7b0 [amdgpu]
> [8.773883]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.773886]  amdgpu_dm_init.isra.0+0x32f/0x22d0 [amdgpu]
> [8.774045]  ? irq_work_queue+0x2d/0x50
> [8.774048]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774050]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774052]  ? vprintk_emit+0x176/0x2a0
> [8.774056]  ? dev_vprintk_emit+0x181/0x1b0
> [8.774063]  dm_hw_init+0x12/0x30 [amdgpu]
> [8.774187]  amdgpu_device_init.cold+0x1c43/0x1f90 [amdgpu]
> [8.774373]  amdgpu_driver_load_kms+0x19/0x70 [amdgpu]
> [8.774507]  amdgpu_pci_probe+0x1a7/0x4b0 [amdgpu]
> [8.774631]  local_pci_probe+0x42/0x90
> [8.774635]  pci_device_probe+0xc1/0x2a0
> [8.774638]  really_probe+0xdb/0x340
> [8.774642]  ? pm_runtime_barrier+0x54/0x90
> [8.774644]  ? __pfx___driver_attach+0x10/0x10
> [8.774646]  __driver_probe_device+0x78/0x110
> [8.774648]  driver_probe_device+0x1f/0xa0
> [8.774650]  __driver_attach+0xba/0x1c0
> [8.774652]  bus_for_each_dev+0x8c/0xe0
> [8.774655]  bus_add_driver+0x142/0x220
> [8.774657]  driver_register+0x72/0xd0
> [8.774660]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
> [8.774779]  do_one_initcall+0x58/0x310
> [8.774784]  do_init_module+0x90/0x250
> [8.774787]  init_module_from_file+0x86/0xc0
> [8.774791]  idempotent_init_module+0x121/0x2b0
> [8.774794]  __x64_sys_finit_module+0x5e/0xb0
> [8.774796]  do_syscall_64+0x82/0x160
> [8.774799]  ? __pfx_page_put_link+0x10/0x10
> [8.774804]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774806]  ? do_sys_openat2+0x9c/0xe0
> [8.774809]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774810]  ? syscall_exit_to_user_mode+0x72/0x220
> [8.774813]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774815]  ? do_syscall_64+0x8e/0x160
> [8.774816]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774818]  ? __seccomp_filter+0x303/0x520
> [8.774820]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774824]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774825]  ? syscall_exit_to_user_mode+0x72/0x220
> [8.774827]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774829]  ? do_syscall_64+0x8e/0x160
> [8.774830]  ? do_syscall_64+0x8e/0x160
> [8.774831]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774833]  ? srso_alias_return_thunk+0x5/0xfbef5
> [8.774835]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [8.774837] RIP: 0033:0x7fa5f44391bd
> [8.774848] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b cc 0c 00 f7 d8 64 89 01 48
> [8.774850] RSP: 002b:7fff5d55a5a8 EFLAGS: 0246 ORIG_RAX: 0139
> [8.774852] RAX: ffda RBX: 555b3bfe6a50 RCX: 7fa5f44391bd
> [8.774854] RDX:  RSI: 7fa5f455507d RDI: 002c
> [8.774855] RBP: 7fff5d55a660 R08: 0001 R09: 7fff5d55a5f0
> [8.774855] R10: 0050 R11: 0246 R12: 7fa5f455507d
> [8.774856] R13: 0002 R14: 555b3bfebb30 R15: 555b3bff63d0
> [8.774859]  </TASK>
> [8.774864] ---[ end trace ]---
>
>
> It looks like "count" probably needs to be clamped to
> ARRAY_SIZE(header->gpio_pin) in bios_parser_get_gpio_pin_info ?
>
> dmesg is attached. There are a couple of other warnings in there too
> after the UBSAN 

amdgpu UBSAN warnings in 6.10.0-rc5

2024-06-30 Thread Jeff Layton
I've been testing some vfs patches (multigrain timestamps) on my
personal desktop with a 6.10.0-rc5-ish kernel, and have hit a number of
warnings in the amdgpu driver, including a UBSAN warning that looks
like a potential array overrun:

[8.772608] [ cut here ]
[8.772609] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser2.c:680:23
[8.772612] index 8 is out of range for type 'atom_gpio_pin_assignment [8]'
[8.772614] CPU: 13 PID: 508 Comm: (udev-worker) Not tainted 6.10.0-rc5-00292-gb3efd5c27332 #35
[8.772616] Hardware name: Micro-Star International Co., Ltd. MS-7E27/PRO B650M-P (MS-7E27), BIOS 1.A0 06/07/2024
[8.772618] Call Trace:
[8.772620]  <TASK>
[8.772621]  dump_stack_lvl+0x5d/0x80
[8.772629]  ubsan_epilogue+0x5/0x30
[8.772633]  __ubsan_handle_out_of_bounds.cold+0x46/0x4b
[8.772636]  bios_parser_get_gpio_pin_info+0x11c/0x150 [amdgpu]
[8.773016]  link_get_hpd_gpio+0x7e/0xd0 [amdgpu]
[8.773205]  construct_phy+0x26d/0xd40 [amdgpu]
[8.773355]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.773370]  ? link_create+0x210/0x250 [amdgpu]
[8.773493]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.773495]  link_create+0x210/0x250 [amdgpu]
[8.773610]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.773612]  create_links+0x151/0x530 [amdgpu]
[8.773759]  dc_create+0x401/0x7b0 [amdgpu]
[8.773883]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.773886]  amdgpu_dm_init.isra.0+0x32f/0x22d0 [amdgpu]
[8.774045]  ? irq_work_queue+0x2d/0x50
[8.774048]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774050]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774052]  ? vprintk_emit+0x176/0x2a0
[8.774056]  ? dev_vprintk_emit+0x181/0x1b0
[8.774063]  dm_hw_init+0x12/0x30 [amdgpu]
[8.774187]  amdgpu_device_init.cold+0x1c43/0x1f90 [amdgpu]
[8.774373]  amdgpu_driver_load_kms+0x19/0x70 [amdgpu]
[8.774507]  amdgpu_pci_probe+0x1a7/0x4b0 [amdgpu]
[8.774631]  local_pci_probe+0x42/0x90
[8.774635]  pci_device_probe+0xc1/0x2a0
[8.774638]  really_probe+0xdb/0x340
[8.774642]  ? pm_runtime_barrier+0x54/0x90
[8.774644]  ? __pfx___driver_attach+0x10/0x10
[8.774646]  __driver_probe_device+0x78/0x110
[8.774648]  driver_probe_device+0x1f/0xa0
[8.774650]  __driver_attach+0xba/0x1c0
[8.774652]  bus_for_each_dev+0x8c/0xe0
[8.774655]  bus_add_driver+0x142/0x220
[8.774657]  driver_register+0x72/0xd0
[8.774660]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[8.774779]  do_one_initcall+0x58/0x310
[8.774784]  do_init_module+0x90/0x250
[8.774787]  init_module_from_file+0x86/0xc0
[8.774791]  idempotent_init_module+0x121/0x2b0
[8.774794]  __x64_sys_finit_module+0x5e/0xb0
[8.774796]  do_syscall_64+0x82/0x160
[8.774799]  ? __pfx_page_put_link+0x10/0x10
[8.774804]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774806]  ? do_sys_openat2+0x9c/0xe0
[8.774809]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774810]  ? syscall_exit_to_user_mode+0x72/0x220
[8.774813]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774815]  ? do_syscall_64+0x8e/0x160
[8.774816]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774818]  ? __seccomp_filter+0x303/0x520
[8.774820]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774824]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774825]  ? syscall_exit_to_user_mode+0x72/0x220
[8.774827]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774829]  ? do_syscall_64+0x8e/0x160
[8.774830]  ? do_syscall_64+0x8e/0x160
[8.774831]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774833]  ? srso_alias_return_thunk+0x5/0xfbef5
[8.774835]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[8.774837] RIP: 0033:0x7fa5f44391bd
[8.774848] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b cc 0c 00 f7 d8 64 89 01 48
[8.774850] RSP: 002b:7fff5d55a5a8 EFLAGS: 0246 ORIG_RAX: 0139
[8.774852] RAX: ffda RBX: 555b3bfe6a50 RCX: 7fa5f44391bd
[8.774854] RDX:  RSI: 7fa5f455507d RDI: 002c
[8.774855] RBP: 7fff5d55a660 R08: 0001 R09: 7fff5d55a5f0
[8.774855] R10: 0050 R11: 0246 R12: 7fa5f455507d
[8.774856] R13: 0002 R14: 555b3bfebb30 R15: 555b3bff63d0
[8.774859]  </TASK>
[8.774864] ---[ end trace ]---


It looks like "count" probably needs to be clamped to
ARRAY_SIZE(header->gpio_pin) in bios_parser_get_gpio_pin_info ?
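
As a rough illustration of that clamp, here is a userspace sketch. It
assumes "count" is derived from the size the firmware table reports for
itself, which can describe more entries than the fixed-size array
declares. The demo_* structs and functions are hypothetical stand-ins,
not the real atomfirmware definitions or the driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-ins for the firmware table layout (assumption). */
struct demo_table_header {
	uint16_t structuresize;	/* total table size in bytes, as reported by firmware */
};

struct demo_gpio_pin {
	uint32_t data_a_reg_index;
	uint8_t gpio_bitshift;
	uint8_t gpio_mask_bitshift;
	uint8_t gpio_id;
	uint8_t reserved;
};

struct demo_gpio_lut {
	struct demo_table_header header;
	struct demo_gpio_pin gpio_pin[8];	/* declared with a fixed size of 8 */
};

/* Number of entries that are safe to index: never more than the array holds. */
static size_t demo_pin_count(const struct demo_gpio_lut *lut)
{
	size_t size = lut->header.structuresize;
	size_t base = offsetof(struct demo_gpio_lut, gpio_pin);
	size_t count;

	if (size < base)
		return 0;	/* reported size is too small to hold any entry */

	count = (size - base) / sizeof(lut->gpio_pin[0]);

	/* The clamp: a table claiming 9+ entries must not walk past gpio_pin[7]. */
	if (count > ARRAY_SIZE(lut->gpio_pin))
		count = ARRAY_SIZE(lut->gpio_pin);

	return count;
}

/* Linear search over the clamped range; returns NULL if gpio_id is not found. */
static const struct demo_gpio_pin *
demo_find_pin(const struct demo_gpio_lut *lut, uint8_t gpio_id)
{
	size_t i, count;

	assert(lut != NULL);
	count = demo_pin_count(lut);

	for (i = 0; i < count; i++) {
		if (lut->gpio_pin[i].gpio_id == gpio_id)
			return &lut->gpio_pin[i];
	}
	return NULL;
}
```

With the clamp, a lookup that misses simply returns "not found" after
the last declared entry instead of reading index 8. Whether clamping is
the right fix, versus changing how the array is declared, is a separate
question.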

dmesg is attached. There are a couple of other warnings in there too
after the UBSAN one, but this one looks the most worrisome.
-- 
Jeff Layton 


amd-warnings-dmesg.out.gz
Description: application/gzip