On 6/14/26 00:44, Roman Ilin wrote:
> When a CRTC's display mode carries a too small pixel clock,
> drm_calc_timestamping_constants() computes a frame duration that
> exceeds INT_MAX. drm_vblank_crtc.framedur_ns becomes negative.
> drm_crtc_vblank_start_timer() then arms the vblank hrtimer with this
> interval, after which vblank events are no longer delivered. Pending
> page flips never complete and the display appears frozen.
Roman, what's the status here? Does the problem still happen with
7.2-rc1? Sounds like it likely will, just want to make sure before
prodding Thomas about it, who authored and committed the change that
causes this regression.
Side note: CCing Peter Arnesen, who seems to be affected as well.
Quoting a message that was CCed to the regression list, but apparently
was rejected there:
"""
Hi everyone,
I am experiencing a similar Wayland display freeze on Kernel 7.0
described in Roman Ilin's recent patch thread ("drm/vblank: Don't arm
vblank timer with invalid frame duration").
Since my laptop consistently reproduces this freeze on boot, I am
reaching out to offer my hardware for testing if you need verification
that this patch resolves the issue on the newer Ryzen AI architectures.
My System:
* CPU / iGPU: AMD Ryzen AI 9 HX
* dGPU: NVIDIA GeForce RTX 5070
* OS: Fedora 44, Kubuntu 26.04 and Nobara Linux (Fedora-based) / Wayland
* Regression Status: Works perfectly on Kernel 6.19. Fails on Kernel 7.0.
* Tested Workarounds: Kernel parameters amdgpu.dcdebugmask=0x10, 0x410,
and amdgpu.sg_display=0 do not bypass the freeze.
Please let me know if you need me to pull specific dmesg logs or similar
from the frozen state, or if there is a specific patched branch you
would like me to compile and boot to verify the fix.
Regards
Peter
"""
Peter, would be great if you could test 7.2-rc1, too. And if you really
think the problem is the same, it might be worth trying Roman's patch.
Ciao, Thorsten
> This could be triggered on virtio-gpu guests that have dynamic resolution
> enabled: when the SPICE agent or the X server resizes the output, it
> submits a mode whose pixel clock is off by a factor of 1000, e.g.:
>
> clock = 406 kHz, htotal = 3152, vtotal = 2148
>
> framedur_ns = 3152 * 2148 * 1000000 / 406 = 16675852216 ns (~16.7 s)
>
> 16675852216 does not fit into an int and wraps to roughly -504000000.
> ns_to_ktime() then yields a negative interval and the timer stops working.
>
> Found by bisection, which pointed at commit a036f5fceedb ("drm/virtgpu:
> Use vblank timer"). That commit merely made virtio-gpu use the vblank
> timer and thereby exposed the pre-existing problem in the timer setup
> added by commit 74afeb812850 ("drm/vblank: Add vblank timer").
>
> Reject a non-positive frame duration in drm_crtc_vblank_start_timer() and
> return an error. enable_vblank then fails and the driver falls back to
> sending the vblank event immediately, as it did before the vblank timer
> was introduced. Valid modes are unaffected, and the timer self-heals on
> the next mode that has a sane clock.
>
> Fixes: 74afeb812850 ("drm/vblank: Add vblank timer")
> Cc: [email protected]
> Signed-off-by: Roman Ilin <[email protected]>
> ---
> Notes:
>
> Based on v7.1-rc7. Tested on 6.19 and 7.1-rc7.
>
> Open questions:
>
> This relies on the int overflow producing a negative value. The deeper
> issue is that drm_calc_timestamping_constants() truncates framedur_ns to
> int. Would you prefer to widen framedur_ns to s64, or to bound the
> interval here (e.g. reject framedur_ns above one second) so that any
> bogus interval is rejected regardless of sign?
>
> Should virtio-gpu additionally sanitize the user-supplied clock in its
> atomic_check (similar to vmwgfx for the clock==0 case) so the
> vblank-timer throttling is preserved for these resizes, instead of
> falling back to immediate events?
>
> drivers/gpu/drm/drm_vblank.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index f78bf37f1..557cd0bc8 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -2235,7 +2235,13 @@ int drm_crtc_vblank_start_timer(struct drm_crtc *crtc)
>
> drm_calc_timestamping_constants(crtc, &crtc->mode);
>
> + /*
> + * Return an error so the driver falls back to sending vblank events
> + * when a small mode clock yields a frame duration exceeding INT_MAX.
> + */
> + if (vblank->framedur_ns <= 0)
> + return -EINVAL;
> +
> spin_lock_irqsave(&vtimer->interval_lock, flags);
> vtimer->interval = ns_to_ktime(vblank->framedur_ns);
> spin_unlock_irqrestore(&vtimer->interval_lock, flags);