On Thu, Oct 02, 2025 at 02:40:05PM +0300, Ville Syrjälä wrote:
> On Wed, Oct 01, 2025 at 07:57:23PM -0700, Chintan Patel wrote:
> > When wait_event_timeout() in drm_wait_one_vblank() times out, the
> > current WARN can cause unnecessary kernel panics in environments
> > with panic_on_warn set (e.g. CI, fuzzing). These timeouts can happen
> > under scheduler pressure or from invalid userspace calls, so they are
> > not always a kernel bug.
>
> "invalid userspace calls" should never reach this far.
> That would be a kernel bug.

I was also wondering how you could get this due to some scheduler
screwup, but I suppose that could theoretically happen with threaded
irqs, or whatever work/etc is used to update the vblank count on
drivers that don't have hardware interrupts for it. 100+ msec hw
interrupt latency sounds excessive to me though.

But since you reference some syzbot reports below, are you actually
trying to hide real kernel bugs that syzbot found?

>
> >
> > Replace the WARN with drm_dbg_kms() messages that provide useful
> > context (last and current vblank counters) without crashing the
> > system. Developers can still enable drm.debug to diagnose genuine
> > problems.
> >
> > Reported-by: [email protected]
> > Closes: https://syzkaller.appspot.com/bug?extid=147ba789658184f0ce04
> > Tested-by: [email protected]
> >
> > Signed-off-by: Chintan Patel <[email protected]>
> >
> > v2:
> >  - Drop unnecessary in-code comment (suggested by Thomas Zimmermann)
> >  - Remove else branch, only log timeout case
> > ---
> >  drivers/gpu/drm/drm_vblank.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> > index 46f59883183d..a94570668cba 100644
> > --- a/drivers/gpu/drm/drm_vblank.c
> > +++ b/drivers/gpu/drm/drm_vblank.c
> > @@ -1289,7 +1289,7 @@ void drm_wait_one_vblank(struct drm_device *dev,
> > unsigned int pipe)
> >  {
> >  	struct drm_vblank_crtc *vblank = drm_vblank_crtc(dev, pipe);
> >  	int ret;
> > -	u64 last;
> > +	u64 last, curr;
> >
> >  	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
> >  		return;
> > @@ -1305,7 +1305,12 @@ void drm_wait_one_vblank(struct drm_device *dev,
> > unsigned int pipe)
> >  				 last != drm_vblank_count(dev, pipe),
> >  				 msecs_to_jiffies(100));
> >
> > -	drm_WARN(dev, ret == 0, "vblank wait timed out on crtc %i\n", pipe);
> > +	curr = drm_vblank_count(dev, pipe);
> > +
> > +	if (ret == 0) {
> > +		drm_dbg_kms(dev, "WAIT_VBLANK: timeout crtc=%d, last=%llu,
> > curr=%llu\n",
> > +			    pipe, last, curr);
>
> It should at the very least be a drm_err(). Though the backtrace can
> be useful in figuring out where the problem is coming from, so not
> too happy about this change.
>
> > +	}
> >
> >  	drm_vblank_put(dev, pipe);
> >  }
> > --
> > 2.43.0
>
> --
> Ville Syrjälä
> Intel

--
Ville Syrjälä
Intel
