Re: [PATCH 4.19.y] drm/vkms: call drm_atomic_helper_shutdown before drm_dev_put()
在 2024/4/5 17:30, Greg KH 写道: On Wed, Apr 03, 2024 at 05:47:16PM +0800, Guo Mengqi wrote: commit 73a82b22963d ("drm/atomic: Fix potential use-after-free in nonblocking commits") introduced drm_dev_get/put() to drm_atomic_helper_shutdown(). And this cause problem in vkms driver exit process. vkms_exit() drm_dev_put() vkms_release() drm_atomic_helper_shutdown() drm_dev_get() drm_dev_put() vkms_release()-- null pointer access Using 4.19 stable x86 image on qemu, below stacktrace can be triggered by load and unload vkms.ko. root:~ # insmod vkms.ko [ 142.135449] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 142.138713] [drm] Driver supports precise vblank timestamp query. [ 142.142390] [drm] Initialized vkms 1.0.0 20180514 for virtual device on minor 0 root:~ # rmmod vkms.ko [ 144.093710] BUG: unable to handle kernel NULL pointer dereference at 00a0 [ 144.097491] PGD 80023624e067 P4D 80023624e067 PUD 22ab59067 PMD 0 [ 144.100802] Oops: [#1] SMP PTI [ 144.102502] CPU: 0 PID: 3615 Comm: rmmod Not tainted 4.19.310 #1 [ 144.104452] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 144.107238] RIP: 0010:device_del+0x34/0x3a0 ... [ 144.131323] Call Trace: [ 144.131962] ? __die+0x7d/0xc0 [ 144.132711] ? no_context+0x152/0x3b0 [ 144.133605] ? wake_up_q+0x70/0x70 [ 144.134436] ? __do_page_fault+0x342/0x4b0 [ 144.135445] ? __switch_to_asm+0x41/0x70 [ 144.136416] ? __switch_to_asm+0x35/0x70 [ 144.137366] ? page_fault+0x1e/0x30 [ 144.138214] ? __drm_atomic_state_free+0x51/0x60 [ 144.139331] ? device_del+0x34/0x3a0 [ 144.140197] platform_device_del.part.14+0x19/0x70 [ 144.141348] platform_device_unregister+0xe/0x20 [ 144.142458] vkms_release+0x10/0x30 [vkms] [ 144.143449] __drm_atomic_helper_disable_all.constprop.31+0x13b/0x150 [ 144.144980] drm_atomic_helper_shutdown+0x4b/0x90 [ 144.146102] vkms_release+0x18/0x30 [vkms] [ 144.147107] vkms_exit+0x29/0x8ec [vkms] [ 144.148053] __x64_sys_delete_module+0x155/0x220 [ 144.149168] do_syscall_64+0x43/0x100 [ 144.150056] entry_SYSCALL_64_after_hwframe+0x5c/0xc1 It seems that the proper unload sequence is: drm_atomic_helper_shutdown(); drm_dev_put(); Just put drm_atomic_helper_shutdown() before drm_dev_put() should solve the problem. Note that vkms exit code is refactored by 53d77aaa3f76 ("drm/vkms: Use devm_drm_dev_alloc") in tags/v5.10-rc1. So this bug only exists on 4.19 and 5.4. Do we also need this for 5.4? If so, can you send a version for that tree with the correct Fixes: information, and I will be glad to queue both of these up. I sent a patch to 5.4.y too. Please check it at https://lore.kernel.org/all/20240409022647.1821-1-guomeng...@huawei.com/T/#u Thanks, Mengqi thanks, greg k-h .
Re: [PATCH] drm/vkms: call drm_atomic_helper_shutdown before drm_dev_put()
在 2024/3/29 17:57, Greg KH 写道: On Thu, Mar 21, 2024 at 03:55:37PM +0800, guomengqi (A) wrote: 在 2024/3/21 15:39, Greg KH 写道: On Thu, Mar 21, 2024 at 03:07:52PM +0800, Guo Mengqi wrote: commit 73a82b22963d ("drm/atomic: Fix potential use-after-free in nonblocking commits") introduced drm_dev_get/put() to drm_atomic_helper_shutdown(). And this cause problem in vkms driver exit process. vkms_exit() drm_dev_put() vkms_release() drm_atomic_helper_shutdown() drm_dev_get() drm_dev_put() vkms_release()-- null pointer access Using 4.19 stable x86 image on qemu, below stacktrace can be triggered by load and unload vkms.ko. root:~ # insmod vkms.ko [ 142.135449] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 142.138713] [drm] Driver supports precise vblank timestamp query. [ 142.142390] [drm] Initialized vkms 1.0.0 20180514 for virtual device on minor 0 root:~ # rmmod vkms.ko [ 144.093710] BUG: unable to handle kernel NULL pointer dereference at 00a0 [ 144.097491] PGD 80023624e067 P4D 80023624e067 PUD 22ab59067 PMD 0 [ 144.100802] Oops: [#1] SMP PTI [ 144.102502] CPU: 0 PID: 3615 Comm: rmmod Not tainted 4.19.310 #1 [ 144.104452] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 144.107238] RIP: 0010:device_del+0x34/0x3a0 ... [ 144.131323] Call Trace: [ 144.131962] ? __die+0x7d/0xc0 [ 144.132711] ? no_context+0x152/0x3b0 [ 144.133605] ? wake_up_q+0x70/0x70 [ 144.134436] ? __do_page_fault+0x342/0x4b0 [ 144.135445] ? __switch_to_asm+0x41/0x70 [ 144.136416] ? __switch_to_asm+0x35/0x70 [ 144.137366] ? page_fault+0x1e/0x30 [ 144.138214] ? __drm_atomic_state_free+0x51/0x60 [ 144.139331] ? device_del+0x34/0x3a0 [ 144.140197] platform_device_del.part.14+0x19/0x70 [ 144.141348] platform_device_unregister+0xe/0x20 [ 144.142458] vkms_release+0x10/0x30 [vkms] [ 144.143449] __drm_atomic_helper_disable_all.constprop.31+0x13b/0x150 [ 144.144980] drm_atomic_helper_shutdown+0x4b/0x90 [ 144.146102] vkms_release+0x18/0x30 [vkms] [ 144.147107] vkms_exit+0x29/0x8ec [vkms] [ 144.148053] __x64_sys_delete_module+0x155/0x220 [ 144.149168] do_syscall_64+0x43/0x100 [ 144.150056] entry_SYSCALL_64_after_hwframe+0x5c/0xc1 It seems that the proper unload sequence is: drm_atomic_helper_shutdown(); drm_dev_put(); Just put drm_atomic_helper_shutdown() before drm_dev_put() should solve the problem. Fixes: 73a82b22963d ("drm/atomic: Fix potential use-after-free in nonblocking commits") Signed-off-by: Guo Mengqi --- drivers/gpu/drm/vkms/vkms_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c index b1201c18d3eb..d32e08f17427 100644 --- a/drivers/gpu/drm/vkms/vkms_drv.c +++ b/drivers/gpu/drm/vkms/vkms_drv.c @@ -39,7 +39,6 @@ static void vkms_release(struct drm_device *dev) struct vkms_device *vkms = container_of(dev, struct vkms_device, drm); platform_device_unregister(vkms->platform); - drm_atomic_helper_shutdown(>drm); drm_mode_config_cleanup(>drm); drm_dev_fini(>drm); } @@ -137,6 +136,7 @@ static void __exit vkms_exit(void) } drm_dev_unregister(_device->drm); + drm_atomic_helper_shutdown(_device->drm); drm_dev_put(_device->drm); kfree(vkms_device); -- 2.17.1 What is the commit id of this change in Linus's tree? Hi, Do you mean this patch? I only send it to stable tree, mainline does not have this bug. vkms exit code is refactored by 53d77aaa3f76 ("drm/vkms: Use devm_drm_dev_alloc") in tags/v5.10-rc1. So this bug only exists on 4.19 and 5.4. Then you need to really really really document that in the changelog, and say what kernel tree(s) you want it applied to, AND get the review of the maintainers of the subsystem/driver you are wanting this applied to. Please fix up and resend. Hi Greg, I resent a new one and put that information both in title and commit log. Thanks for your reminder! thanks, greg k-h .
Re: [PATCH] drm/vkms: call drm_atomic_helper_shutdown before drm_dev_put()
在 2024/3/21 15:39, Greg KH 写道: On Thu, Mar 21, 2024 at 03:07:52PM +0800, Guo Mengqi wrote: commit 73a82b22963d ("drm/atomic: Fix potential use-after-free in nonblocking commits") introduced drm_dev_get/put() to drm_atomic_helper_shutdown(). And this cause problem in vkms driver exit process. vkms_exit() drm_dev_put() vkms_release() drm_atomic_helper_shutdown() drm_dev_get() drm_dev_put() vkms_release()-- null pointer access Using 4.19 stable x86 image on qemu, below stacktrace can be triggered by load and unload vkms.ko. root:~ # insmod vkms.ko [ 142.135449] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 142.138713] [drm] Driver supports precise vblank timestamp query. [ 142.142390] [drm] Initialized vkms 1.0.0 20180514 for virtual device on minor 0 root:~ # rmmod vkms.ko [ 144.093710] BUG: unable to handle kernel NULL pointer dereference at 00a0 [ 144.097491] PGD 80023624e067 P4D 80023624e067 PUD 22ab59067 PMD 0 [ 144.100802] Oops: [#1] SMP PTI [ 144.102502] CPU: 0 PID: 3615 Comm: rmmod Not tainted 4.19.310 #1 [ 144.104452] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 144.107238] RIP: 0010:device_del+0x34/0x3a0 ... [ 144.131323] Call Trace: [ 144.131962] ? __die+0x7d/0xc0 [ 144.132711] ? no_context+0x152/0x3b0 [ 144.133605] ? wake_up_q+0x70/0x70 [ 144.134436] ? __do_page_fault+0x342/0x4b0 [ 144.135445] ? __switch_to_asm+0x41/0x70 [ 144.136416] ? __switch_to_asm+0x35/0x70 [ 144.137366] ? page_fault+0x1e/0x30 [ 144.138214] ? __drm_atomic_state_free+0x51/0x60 [ 144.139331] ? device_del+0x34/0x3a0 [ 144.140197] platform_device_del.part.14+0x19/0x70 [ 144.141348] platform_device_unregister+0xe/0x20 [ 144.142458] vkms_release+0x10/0x30 [vkms] [ 144.143449] __drm_atomic_helper_disable_all.constprop.31+0x13b/0x150 [ 144.144980] drm_atomic_helper_shutdown+0x4b/0x90 [ 144.146102] vkms_release+0x18/0x30 [vkms] [ 144.147107] vkms_exit+0x29/0x8ec [vkms] [ 144.148053] __x64_sys_delete_module+0x155/0x220 [ 144.149168] do_syscall_64+0x43/0x100 [ 144.150056] entry_SYSCALL_64_after_hwframe+0x5c/0xc1 It seems that the proper unload sequence is: drm_atomic_helper_shutdown(); drm_dev_put(); Just put drm_atomic_helper_shutdown() before drm_dev_put() should solve the problem. Fixes: 73a82b22963d ("drm/atomic: Fix potential use-after-free in nonblocking commits") Signed-off-by: Guo Mengqi --- drivers/gpu/drm/vkms/vkms_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c index b1201c18d3eb..d32e08f17427 100644 --- a/drivers/gpu/drm/vkms/vkms_drv.c +++ b/drivers/gpu/drm/vkms/vkms_drv.c @@ -39,7 +39,6 @@ static void vkms_release(struct drm_device *dev) struct vkms_device *vkms = container_of(dev, struct vkms_device, drm); platform_device_unregister(vkms->platform); - drm_atomic_helper_shutdown(>drm); drm_mode_config_cleanup(>drm); drm_dev_fini(>drm); } @@ -137,6 +136,7 @@ static void __exit vkms_exit(void) } drm_dev_unregister(_device->drm); + drm_atomic_helper_shutdown(_device->drm); drm_dev_put(_device->drm); kfree(vkms_device); -- 2.17.1 What is the commit id of this change in Linus's tree? Hi, Do you mean this patch? I only send it to stable tree, mainline does not have this bug. vkms exit code is refactored by 53d77aaa3f76 ("drm/vkms: Use devm_drm_dev_alloc") in tags/v5.10-rc1. So this bug only exists on 4.19 and 5.4. thanks, greg k-h .