When a CRTC's display mode carries a too small pixel clock,
drm_calc_timestamping_constants() computes a frame duration that
exceeds INT_MAX. drm_vblank_crtc.framedur_ns becomes negative.
drm_crtc_vblank_start_timer() then arms the vblank hrtimer with this
interval, after which vblank events are no longer delivered. Pending
page flips never complete and the display appears frozen.

This could be triggered on virtio-gpu guests that have dynamic resolution
enabled: when the SPICE agent or the X server resizes the output, it
submits a mode whose pixel clock is off by a factor of 1000, e.g.:

    clock = 406 kHz, htotal = 3152, vtotal = 2148

    framedur_ns = 3152 * 2148 * 1000000 / 406 = 16675852216 ns (~16.7 s)

16675852216 does not fit into an int and wraps to roughly -504000000.
ns_to_ktime() then yields a negative interval and the timer stops working.

Found by bisection, which pointed at commit a036f5fceedb ("drm/virtgpu:
Use vblank timer"). That commit merely made virtio-gpu use the vblank
timer and thereby exposed the pre-existing problem in the timer setup
added by commit 74afeb812850 ("drm/vblank: Add vblank timer").

Reject a non-positive frame duration in drm_crtc_vblank_start_timer() and
return an error. enable_vblank then fails and the driver falls back to
sending the vblank event immediately, as it did before the vblank timer
was introduced. Valid modes are unaffected, and the timer self-heals on
the next mode that has a sane clock.

Fixes: 74afeb812850 ("drm/vblank: Add vblank timer")
Cc: [email protected]
Signed-off-by: Roman Ilin <[email protected]>
---
Notes:

Based on v7.1-rc7. Tested on 6.19 and 7.1-rc7.

Open questions:

This relies on the int overflow producing a negative value. The deeper
issue is that drm_calc_timestamping_constants() truncates framedur_ns to
int. Would you prefer to widen framedur_ns to s64, or to bound the
interval here (e.g. reject framedur_ns above one second) so that any
bogus interval is rejected regardless of sign?

Should virtio-gpu additionally sanitize the user-supplied clock in its
atomic_check (similar to vmwgfx for the clock==0 case) so the
vblank-timer throttling is preserved for these resizes, instead of
falling back to immediate events?

 drivers/gpu/drm/drm_vblank.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index f78bf37f1..557cd0bc8 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -2235,7 +2235,13 @@ int drm_crtc_vblank_start_timer(struct drm_crtc *crtc)
 
        drm_calc_timestamping_constants(crtc, &crtc->mode);
 
+       /*
+        * Return an error so the driver falls back to sending vblank events
+        * when a small mode clock yields a frame duration exceeding INT_MAX.
+        */
+       if (vblank->framedur_ns <= 0)
+               return -EINVAL;
+
        spin_lock_irqsave(&vtimer->interval_lock, flags);
        vtimer->interval = ns_to_ktime(vblank->framedur_ns);
        spin_unlock_irqrestore(&vtimer->interval_lock, flags);
-- 
2.54.0


Reply via email to