Re: [PATCH 1/2] drm: introduce page_flip_timeout()

Mario Limonciello Fri, 23 Jan 2026 14:30:29 -0800

On 1/23/2026 8:44 AM, Timur Kristóf wrote:

On Friday, January 23, 2026 2:52:44 PM Central European Standard Time
Christian König wrote:

On 1/23/26 01:05, Hamza Mahfooz wrote:

There should be a mechanism for drivers to respond to flip_done
time outs.

When there is a display hang, I think that resetting the GPU IP is really heavy handed. I second what Alex said - why not instead just reset DCN? I would think moving DCN into D3 and back out should be enough if trying to use something to recover.

I am adding Harry and Mario to this email as they are more familiar with this.

I can only see two reasons why you could run into a timeout:

1. A dma_fence never signals.
   How that should be handled is already well documented and doesn't
   require any of this.


Page flip timeouts have nothing to do with fence timeouts.
A page flip timeout can occur even when all fences of all job submissions
complete correctly and on time.


2. A coding error in the vblank or page flip handler leading to waiting
forever. In that case calling back into the driver doesn't help either.


At the moment, a page flip timeout will leave the whole system in a hung state
and the driver does not even attempt to recover it in any way, it just stops
doing anything, which is unacceptable and I'm pretty surprised that it was
left like that for so long.

Note that we have approximately a hundred bug reports open on the drm/amd bug
tracker about "random" page flip timeouts. It affects a lot of users.

Yeah, I would much rather leave some messages in the log that this happened and see a recovery occur than a hang.


So as far as I can see the whole approach doesn't make any sense at all.


Actually this approach was proposed as a solution at XDC 2025 in Harry's
presentation, "DRM calls driver callback to attempt recovery", see page 9 in
this slide deck:

https://indico.freedesktop.org/event/10/contributions/431/attachments/267/355/2025%20XDC%20Hackfest%20Update%20v1.2.pdf

If you disagree with Harry, please make a counter-proposal.

Hamza - since you seem to have a "workload" that can run overnight and this series recovers, can you try what Alex said and do a dc_suspend() and dc_resume() for failure?


Make sure you log a message so you can know it worked.
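
A rough userspace sketch of that experiment, under stated assumptions: this is not amdgpu code, and `mock_display`, `mock_display_suspend()`, `mock_display_resume()` and `mock_recover_display()` are hypothetical stand-ins for whatever display-only suspend/resume path the driver ends up using. It only shows the shape of the idea: power-cycle the display block, then log whether the recovery worked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for display core state; not an amdgpu type. */
struct mock_display {
	bool powered;
	bool hung;
};

/* Stand-in for a display-only suspend. Returns 0 on success. */
static int mock_display_suspend(struct mock_display *d)
{
	assert(d);
	d->powered = false;
	return 0;
}

/* Stand-in for resume. In this mock, a power cycle clears the hang. */
static int mock_display_resume(struct mock_display *d)
{
	assert(d);
	d->powered = true;
	d->hung = false;
	return 0;
}

/* Suspend, resume, and log the outcome. Returns 0 if the display recovered. */
static int mock_recover_display(struct mock_display *d)
{
	int ret;

	fprintf(stderr, "flip timeout: attempting display-only recovery\n");

	ret = mock_display_suspend(d);
	if (!ret)
		ret = mock_display_resume(d);

	if (ret || d->hung) {
		fprintf(stderr, "flip timeout: display recovery failed (%d)\n", ret);
		return ret ? ret : -1;
	}

	fprintf(stderr, "flip timeout: display recovered\n");
	return 0;
}
```

Logging both the attempt and the result is what makes an overnight run useful: the log shows how often the timeout fired and whether each recovery succeeded.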


Thanks,
Timur

As it stands, it is possible for the display to stall indefinitely,
necessitating a hard reset. So, introduce a new crtc callback that is
called by drm_atomic_helper_wait_for_flip_done() to give drivers a shot
at recovering from page flip timeouts.

Signed-off-by: Hamza Mahfooz <[email protected]>
---

  drivers/gpu/drm/drm_atomic_helper.c | 6 +++++-
  include/drm/drm_crtc.h              | 9 +++++++++
  2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 5840e9cc6f66..3a144c324b19 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1881,9 +1881,13 @@ void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
                        continue;
 
                ret = wait_for_completion_timeout(&commit->flip_done, 10 * HZ);
-               if (ret == 0)
+               if (!ret) {
                        drm_err(dev, "[CRTC:%d:%s] flip_done timed out\n",
                                crtc->base.id, crtc->name);
+
+                       if (crtc->funcs->page_flip_timeout)
+                               crtc->funcs->page_flip_timeout(crtc);
+               }
        }
 
        if (state->fake_commit)

diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
index 66278ffeebd6..45dc5a76e915 100644
--- a/include/drm/drm_crtc.h
+++ b/include/drm/drm_crtc.h
@@ -609,6 +609,15 @@ struct drm_crtc_funcs {
                                uint32_t flags, uint32_t target,
                                struct drm_modeset_acquire_ctx *ctx);
 
+       /**
+        * @page_flip_timeout:
+        *
+        * This optional hook is called if &drm_crtc_commit.flip_done times out,
+        * and can be used by drivers to attempt to recover from a page flip
+        * timeout.
+        */
+       void (*page_flip_timeout)(struct drm_crtc *crtc);
+
        /**
         * @set_property:
         *
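
To illustrate the control flow that the drm_atomic_helper.c change in this patch adds, here is a minimal userspace mock. It is a sketch only: `mock_crtc`, `mock_crtc_funcs`, `mock_driver_page_flip_timeout()` and `mock_wait_for_flip_done()` are hypothetical names, not DRM types or functions. The `remaining` argument stands in for the return value of wait_for_completion_timeout(), where 0 means the wait timed out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct mock_crtc;

/* Hypothetical stand-in for the function table; the hook is optional. */
struct mock_crtc_funcs {
	/* May be NULL when a driver has no recovery path. */
	void (*page_flip_timeout)(struct mock_crtc *crtc);
};

struct mock_crtc {
	int id;
	const char *name;
	const struct mock_crtc_funcs *funcs;
	int recovery_attempts; /* bookkeeping for this example only */
};

/* Hypothetical driver hook: just record that recovery was attempted. */
static void mock_driver_page_flip_timeout(struct mock_crtc *crtc)
{
	crtc->recovery_attempts++;
}

/*
 * Mirrors the patched branch: on timeout, log an error, then call the
 * driver hook only if the driver provided one.
 * Returns true if the flip completed in time.
 */
static bool mock_wait_for_flip_done(struct mock_crtc *crtc,
				    unsigned long remaining)
{
	assert(crtc && crtc->funcs);

	if (remaining)
		return true;

	fprintf(stderr, "[CRTC:%d:%s] flip_done timed out\n",
		crtc->id, crtc->name);

	if (crtc->funcs->page_flip_timeout)
		crtc->funcs->page_flip_timeout(crtc);

	return false;
}
```

Because the hook is checked for NULL before the call, drivers that do not implement it keep today's behaviour: the timeout is logged and nothing else happens.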
