This reverts commit 6bee098b91417654703e17eb5c1822c6dfd0c01d.
Den 2026-03-25 kl. 22:11, skrev Simona Vetter:
> On Wed, Mar 25, 2026 at 10:26:40AM -0700, Guenter Roeck wrote:
>> Hi,
>>
>> On Fri, Mar 13, 2026 at 04:17:27PM +0100, Maarten Lankhorst wrote:
>>> When trying to do a rather aggressive test of igt's "xe_module_load
>>> --r reload" with a full desktop environment and game running I noticed
>>> a few OOPSes when dereferencing freed pointers, related to
>>> framebuffers and property blobs after the compositor exits.
>>>
>>> Solve this by guarding the freeing in drm_file with drm_dev_enter/exit,
>>> and immediately put the references from struct drm_file objects during
>>> drm_dev_unplug().
>>>
>>
>> With this patch in v6.18.20, I get the warning backtraces below.
>> The backtraces are gone with the patch reverted.
>
> Yeah, this needs to be reverted, reasoning below. Maarten, can you please
> take care of that and feed the revert through the usual channels? I don't
> think it's critical enough that we need to fast-track this into drm.git
> directly.
>
> Quoting the patch here again:
>
>> drivers/gpu/drm/drm_file.c | 5 ++++-
>> drivers/gpu/drm/drm_mode_config.c | 9 ++++++---
>> 2 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>> index ec820686b3021..f52141f842a1f 100644
>> --- a/drivers/gpu/drm/drm_file.c
>> +++ b/drivers/gpu/drm/drm_file.c
>> @@ -233,6 +233,7 @@ static void drm_events_release(struct drm_file
>> *file_priv)
>> void drm_file_free(struct drm_file *file)
>> {
>> struct drm_device *dev;
>> + int idx;
>>
>> if (!file)
>> return;
>> @@ -249,9 +250,11 @@ void drm_file_free(struct drm_file *file)
>>
>> drm_events_release(file);
>>
>> - if (drm_core_check_feature(dev, DRIVER_MODESET)) {
>> + if (drm_core_check_feature(dev, DRIVER_MODESET) &&
>> + drm_dev_enter(dev, &idx)) {
>
> This is misplaced for two reasons:
>
> - Even if we'd want to guarantee that we hold a drm_dev_enter/exit
> reference during framebuffer teardown, we'd need to do this
> _consistently over all callsites. Not ad-hoc in just one place that a
> testcase hits. This also means kerneldoc updates of the relevant hooks
> and at least a bunch of acks from other driver people to document the
> consensus.
>
> - More importantly, this is driver responsibilities in general unless we
> have extremely good reasons to the contrary. Which means this must be
> placed in xe.
>
>> drm_fb_release(file);
>> drm_property_destroy_user_blobs(dev, file);
>> + drm_dev_exit(idx);
>> }
>>
>> if (drm_core_check_feature(dev, DRIVER_SYNCOBJ))
>> diff --git a/drivers/gpu/drm/drm_mode_config.c
>> b/drivers/gpu/drm/drm_mode_config.c
>> index 84ae8a23a3678..e349418978f79 100644
>> --- a/drivers/gpu/drm/drm_mode_config.c
>> +++ b/drivers/gpu/drm/drm_mode_config.c
>> @@ -583,10 +583,13 @@ void drm_mode_config_cleanup(struct drm_device *dev)
>> */
>> WARN_ON(!list_empty(&dev->mode_config.fb_list));
>> list_for_each_entry_safe(fb, fbt, &dev->mode_config.fb_list, head) {
>> - struct drm_printer p = drm_dbg_printer(dev, DRM_UT_KMS,
>> "[leaked fb]");
>> + if (list_empty(&fb->filp_head) ||
>> drm_framebuffer_read_refcount(fb) > 1) {
>> + struct drm_printer p = drm_dbg_printer(dev, DRM_UT_KMS,
>> "[leaked fb]");
>
> This is also wrong:
>
> - Firstly, it's a completely independent bug, we do not smash two bugfixes
> into one patch.
>
> - Secondly, it's again a driver bug: drm_mode_cleanup must be called when
> the last drm_device reference disappears (hence the existence of
> drmm_mode_config_init), not when the driver gets unbound. The fact that
> this shows up in a callchain from a devres cleanup means the intel
> driver gets this wrong (like almost everyone else because historically
> we didn't know better).
>
> If we don't follow this rule, then we get races with this code here
> running concurrently with drm_file fb cleanups, which just does not
> work. Review pointed that out, but then shrugged it off with a confused
> explanation:
>
>
> https://lore.kernel.org/all/[email protected]/
>
> Yes this also means a lot of the other drm_device teardown that drivers
> do happens way too early. There is a massive can of worms here of a
> magnitude that most likely is much, much bigger than what you can
> backport to stable kernels. Hotunplug is _hard_.
Back to the drawing board, and fixing it in the intel display driver
instead.
Cc: Thomas Hellström <[email protected]>
Reported-by: Guenter Roeck <[email protected]>
Acked-by: Simona Vetter <[email protected]>
Signed-off-by: Maarten Lankhorst <[email protected]>
Fixes: 6bee098b9141 ("drm: Fix use-after-free on framebuffers and property
blobs when calling drm_dev_unplug")
---
drivers/gpu/drm/drm_file.c | 5 +----
drivers/gpu/drm/drm_mode_config.c | 9 +++------
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index f52141f842a1f..ec820686b3021 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -233,7 +233,6 @@ static void drm_events_release(struct drm_file *file_priv)
void drm_file_free(struct drm_file *file)
{
struct drm_device *dev;
- int idx;
if (!file)
return;
@@ -250,11 +249,9 @@ void drm_file_free(struct drm_file *file)
drm_events_release(file);
- if (drm_core_check_feature(dev, DRIVER_MODESET) &&
- drm_dev_enter(dev, &idx)) {
+ if (drm_core_check_feature(dev, DRIVER_MODESET)) {
drm_fb_release(file);
drm_property_destroy_user_blobs(dev, file);
- drm_dev_exit(idx);
}
if (drm_core_check_feature(dev, DRIVER_SYNCOBJ))
diff --git a/drivers/gpu/drm/drm_mode_config.c
b/drivers/gpu/drm/drm_mode_config.c
index 802bc4608abf5..d12db9b0bab81 100644
--- a/drivers/gpu/drm/drm_mode_config.c
+++ b/drivers/gpu/drm/drm_mode_config.c
@@ -577,13 +577,10 @@ void drm_mode_config_cleanup(struct drm_device *dev)
*/
WARN_ON(!list_empty(&dev->mode_config.fb_list));
list_for_each_entry_safe(fb, fbt, &dev->mode_config.fb_list, head) {
- if (list_empty(&fb->filp_head) ||
drm_framebuffer_read_refcount(fb) > 1) {
- struct drm_printer p = drm_dbg_printer(dev, DRM_UT_KMS,
"[leaked fb]");
+ struct drm_printer p = drm_dbg_printer(dev, DRM_UT_KMS,
"[leaked fb]");
- drm_printf(&p, "framebuffer[%u]:\n", fb->base.id);
- drm_framebuffer_print_info(&p, 1, fb);
- }
- list_del_init(&fb->filp_head);
+ drm_printf(&p, "framebuffer[%u]:\n", fb->base.id);
+ drm_framebuffer_print_info(&p, 1, fb);
drm_framebuffer_free(&fb->base.refcount);
}
--
2.51.0