On Thu, Oct 02, 2025 at 02:40:05PM +0300, Ville Syrjälä wrote:
> On Wed, Oct 01, 2025 at 07:57:23PM -0700, Chintan Patel wrote:
> > When wait_event_timeout() in drm_wait_one_vblank() times out, the
> > current WARN can cause unnecessary kernel panics in environments
> > with panic_on_warn set (e.g. CI, fuzzing). These timeouts can happen
> > under scheduler pressure or from invalid userspace calls, so they are
> > not always a kernel bug.
> 
> "invalid userspace calls" should never reach this far.
> That would be a kernel bug.

I was also wondering how you could get this due to some scheduler
screwup, but I suppose that could theoretically happen with threaded 
irqs, or whatever work/etc is used to update the vblank count on
drivers that don't have hardware interrupts for it. 100+ msec
hw interrupt latency sounds excessive to me though.
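
To make the timing argument concrete, here is a rough userspace model
(not kernel code). It assumes the vblank counter is bumped by deferred
work after some latency, and that the waiter gives up after a fixed
timeout. All sim_* names are hypothetical; only the return convention
is borrowed from wait_event_timeout() (0 on timeout, otherwise the
remaining time, at least 1).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a software-maintained vblank counter. */
struct sim_vblank {
	uint64_t count;              /* counter value before the update */
	unsigned update_latency_ms;  /* when the deferred update lands (> 0) */
};

/* Counter value as seen at simulated time now_ms. */
static uint64_t sim_vblank_count(const struct sim_vblank *v, unsigned now_ms)
{
	return now_ms >= v->update_latency_ms ? v->count + 1 : v->count;
}

/*
 * Wait until the counter changes or timeout_ms elapses.
 * Returns 0 on timeout, otherwise the remaining time (at least 1),
 * mirroring the wait_event_timeout() return convention.
 */
static unsigned sim_wait_one_vblank(const struct sim_vblank *v,
				    unsigned timeout_ms)
{
	uint64_t last;
	unsigned now;

	assert(v->update_latency_ms > 0);
	last = sim_vblank_count(v, 0);

	for (now = 0; now <= timeout_ms; now++) {
		if (sim_vblank_count(v, now) != last) {
			unsigned left = timeout_ms - now;

			return left ? left : 1;
		}
	}

	return 0;
}
```

In this model a 100 msec wait only returns 0 when the counter update
is delayed by more than 100 msec, i.e. by several frame periods at
60 Hz.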

But since you reference some syzbot reports below, are you
actually trying to hide real kernel bugs that syzbot found?

> 
> > 
> > Replace the WARN with drm_dbg_kms() messages that provide useful
> > context (last and current vblank counters) without crashing the
> > system. Developers can still enable drm.debug to diagnose genuine
> > problems.
> > 
> > Reported-by: [email protected]
> > Closes: https://syzkaller.appspot.com/bug?extid=147ba789658184f0ce04
> > Tested-by: [email protected]
> > 
> > Signed-off-by: Chintan Patel <[email protected]>
> > 
> > v2:
> >  - Drop unnecessary in-code comment (suggested by Thomas Zimmermann)
> >  - Remove else branch, only log timeout case
> > ---
> >  drivers/gpu/drm/drm_vblank.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> > index 46f59883183d..a94570668cba 100644
> > --- a/drivers/gpu/drm/drm_vblank.c
> > +++ b/drivers/gpu/drm/drm_vblank.c
> > @@ -1289,7 +1289,7 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
> >  {
> >     struct drm_vblank_crtc *vblank = drm_vblank_crtc(dev, pipe);
> >     int ret;
> > -   u64 last;
> > +   u64 last, curr;
> >  
> >     if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
> >             return;
> > @@ -1305,7 +1305,12 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
> >                              last != drm_vblank_count(dev, pipe),
> >                              msecs_to_jiffies(100));
> >  
> > -   drm_WARN(dev, ret == 0, "vblank wait timed out on crtc %i\n", pipe);
> > +   curr = drm_vblank_count(dev, pipe);
> > +
> > +   if (ret == 0) {
> > +           drm_dbg_kms(dev, "WAIT_VBLANK: timeout crtc=%d, last=%llu, curr=%llu\n",
> > +                   pipe, last, curr);
> 
> It should at the very least be a drm_err(). Though the backtrace can
> be useful in figuring out where the problem is coming from, so not
> too happy about this change.
> 
> > +   }
> >  
> >     drm_vblank_put(dev, pipe);
> >  }
> > -- 
> > 2.43.0
> 
> -- 
> Ville Syrjälä
> Intel

-- 
Ville Syrjälä
Intel
