On 01/25/2016 07:51 PM, Daniel Vetter wrote: > On Mon, Jan 25, 2016 at 05:38:30PM +0100, Mario Kleiner wrote: >> Readding Daniel, which somehow got dropped from the cc. >> >> On 01/25/2016 03:53 PM, Ville Syrjälä wrote: >>> On Mon, Jan 25, 2016 at 02:44:53PM +0100, Mario Kleiner wrote: >>>> >>>> >>>> On 01/25/2016 02:23 PM, Ville Syrjälä wrote: >>>>> On Mon, Jan 25, 2016 at 02:16:45PM +0100, Mario Kleiner wrote: >>>>>> >>>>>> >>>>>> On 01/25/2016 05:15 AM, Michel Dänzer wrote: >>>>>>> On 23.01.2016 00:18, Ville Syrjälä wrote: >>>>>>>> On Fri, Jan 22, 2016 at 12:06:00PM +0900, Michel Dänzer wrote: >>>>>>>>> >>>>>>>>> [ Trimming KDE folks from Cc ] >>>>>>>>> >>>>>>>>> On 21.01.2016 19:09, Daniel Vetter wrote: >>>>>>>>>> On Thu, Jan 21, 2016 at 05:36:46PM +0900, Michel Dänzer wrote: >>>>>>>>>>> On 21.01.2016 16:58, Daniel Vetter wrote: >>>>>>>>>>>> >>>>>>>>>>>> Can you please point me at the vblank on/off jump bug please? >>>>>>>>>>> >>>>>>>>>>> AFAIR I originally reported it in response to >>>>>>>>>>> http://lists.freedesktop.org/archives/dri-devel/2015-August/087841.html >>>>>>>>>>> , but I can't find that in the archives, so maybe that was just on >>>>>>>>>>> IRC. >>>>>>>>>>> See >>>>>>>>>>> http://lists.freedesktop.org/archives/dri-devel/2016-January/099122.html >>>>>>>>>>> . Basically, I ran into the bug fixed by your patch because the >>>>>>>>>>> counter >>>>>>>>>>> jumped forward on every DPMS off, so it hit the 32-bit boundary >>>>>>>>>>> after >>>>>>>>>>> just a few days. >>>>>>>>>> >>>>>>>>>> Ok, so just uncovered the overflow bug. >>>>>>>>> >>>>>>>>> Not sure what you mean by "just", but to be clear: The >>>>>>>>> drm_vblank_on/off >>>>>>>>> counter jumping bug (similar to the bug this thread is about), which >>>>>>>>> exposed the overflow bug, is still alive and kicking in 4.5. It seems >>>>>>>>> to happen when turning off the CRTC: >>>>>>>>> >>>>>>>>> [drm:drm_update_vblank_count] updating vblank count on crtc 0: >>>>>>>>> current=218104694, diff=0, hw=916 hw_last=916 >>>>>>>>> [drm:radeon_get_vblank_counter_kms] crtc 0: dist from vblank start 3 >>>>>>>>> [drm:drm_calc_vbltimestamp_from_scanoutpos] crtc 0 : v 0x7 >>>>>>>>> p(2199,-45)@ 7304.307354 -> 7304.308006 [e 0 us, 0 rep] >>>>>>>>> [drm:radeon_get_vblank_counter_kms] crtc 0: dist from vblank start 3 >>>>>>>>> [drm:drm_update_vblank_count] updating vblank count on crtc 0: >>>>>>>>> current=218104694, diff=16776301, hw=1 hw_last=916 >>>>>>>> >>>>>>>> Not sure what bug we're talking about here, but here the hw counter >>>>>>>> clearly jumps backwards. >>>>>>>> >>>>>>>>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3 >>>>>>>>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3 >>>>>>>>> [drm:drm_update_vblank_count] updating vblank count on crtc 1: >>>>>>>>> current=0, diff=0, hw=0 hw_last=0 >>>>>>>>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3 >>>>>>>>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3 >>>>>>>>> [drm:drm_update_vblank_count] updating vblank count on crtc 2: >>>>>>>>> current=0, diff=0, hw=0 hw_last=0 >>>>>>>>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3 >>>>>>>>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3 >>>>>>>>> [drm:drm_update_vblank_count] updating vblank count on crtc 3: >>>>>>>>> current=0, diff=0, hw=0 hw_last=0 >>>>>>>>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 1 >>>>>>>>> [drm:drm_calc_vbltimestamp_from_scanoutpos] crtc 0 : v 0x1 p(0,0)@ >>>>>>>>> 7304.317140 -> 7304.317140 [e 0 us, 0 rep] >>>>>>>>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 1 >>>>>>>>> [drm:drm_update_vblank_count] updating vblank count on crtc 0: >>>>>>>>> current=234880995, diff=16777215, hw=0 hw_last=1 >>>>>>>> >>>>>>>> Same here. >>>>>>> >>>>>>> At least one of the jumps is expected, because this is around turning >>>>>>> off the CRTC for DPMS off. Don't know yet why there are two jumps back >>>>>>> though. >>>>>>> >>>>>>> >>>>>>>> These things just don't happen on i915 because drm_vblank_off() and >>>>>>>> drm_vblank_on() are always called around the times when the hw counter >>>>>>>> might get reset. Or at least that's how it should be. >>>>>>> >>>>>>> Which is of course the idea of Daniel's patch (which is what I'm getting >>>>>>> the above with) or Mario's patch as well, but clearly something's still >>>>>>> wrong. It's certainly possible that it's something in the driver, but >>>>>>> since calling drm_vblank_pre/post_modeset from the same places seems to >>>>>>> work fine (ignoring the regression discussed in this thread)... Do >>>>>>> drm_vblank_on/off require something else to handle this correctly? >>>>>>> >>>>>>> >>>>>> >>>>>> I suspect it is because vblank_disable_and_save calls >>>>>> drm_update_vblank_count() unconditionally, even if vblank irqs are >>>>>> already off. >>>>>> >>>>>> So on a manual display disable -> reenable you get something like >>>>>> >>>>>> At disable: >>>>>> >>>>>> Call to dpms-off --> atombios_crtc_dpms(DPMS_OFF) --> drm_vblank_off -> >>>>>> vblank_disable_and_save -> irqs off, drm_update_vblank_count() computes >>>>>> final count. >>>>>> >>>>>> Then the crtc is shut down and its hw counter resets to zero. >>>>>> >>>>>> At reenable: >>>>>> >>>>>> Modesetting -> drm_crtc_helper_set_mode -> crtc_funcs->prepare(crtc) -> >>>>>> atombios_crtc_prepare() -> atombios_crtc_dpms(DPMS_OFF) -> >>>>>> drm_vblank_off -> vblank_disable_and_save -> A pointless >>>>>> drm_update_vblank_count() while the hw counter is already reset to zero >>>>>> --> Unwanted counter jump. >>>>>> >>>>>> >>>>>> The problem doesn't happen on a pure modeset to a different video >>>>>> resolution/refresh rate, as then we only have one call into >>>>>> atombios_crtc_dpms(DPMS_OFF). >>>>>> >>>>>> I think the fix is to fix vblank_disable_and_save() to only call >>>>>> drm_update_vblank_count() if vblank irqs get actually disabled, not on >>>>>> no-op calls. I will try that now. >>>>> >>>>> It does that on purpose. Otherwise the vblank counter would appear to >>>>> have stalled while the interrupt was off. >>>>> >>>> >>>> Ok, that's what the comments there say, although i don't see atm. why >>>> that perceived stall would be a big problem. I checked all callers of >>>> vblank_disable_and_save(). They are all careful to not call that >>>> function if vblanks are already disabled. The only exception is >>>> drm_vblank_off(). If drm_vblank_off/on is supposed to protect kms >>>> drivers which have resetting hw counters or other problematic behaviour >>>> during modesets etc. then this will break. E.g., calling the vblank >>>> timestamping stuff is also not safe/well-defined during modesets when >>>> the timestamping constants are not (yet) updated to reflect the new mode >>>> timing of the modeset in progress. >>> >>> The idea is to maintain the appearance that the counter ticks all the >>> time as long as the crtc is active. While that may not be really >>> required in case if no one is currently interested in the vblank >>> counter, I think it's a nice thing to have just to make the behaviour >>> of the counter consistent. >>> >>> As far as calling drm_vblank_off() after the hw counter got reset, well, >>> that not correct. It should be called before the reset. >> >> What radeon does is calling drm_vblank_off at beginning of DPMS_OFF. The >> first call to DMPS_OFF will call drm_vblank_off() and really disable >> vblank-irqs if they were running, updating the counts/ts a last time. But >> then the dpms off will reset the hw counter to zero. When one reenables the >> display, a second call to DPMS_OFF will now call drm_vblank_off again when >> it apparently shouldn't. >> >> I just tested this patch, which fixes the counter jumps on radeon-kms with >> my or Daniel's drm_vblank_off patches to radeon: > > This might be due to the legacy helpers, which just love to redundantly > disable stuff that's off already. The problem I see with no-oping these > out is that for atomic drivers (which really should get this right) this > might paper over bugs: E.g. when you forget to call _off() when disabling > the crtc, then calling _on() twice in a row is indeed a serious bug. > Similar when you forget to call _on() and have multiple _off() calls in a > row. > > So not sure what to do here. > -Daniel >
Yes, the legacy helpers cause two calls to dpms off if one disables a display. First during display disable as intended. Then when one reenables the display during modesetting as part of crtc_funcs->prepare() - at least on radeon. Maybe the minimum thing that would help is to just check for vblank->inmodeset in drm_vblank_off(). If that would be the case we'd know it is a redundant call and could no-op it and do a WARN_ON(vblank->inmodeset)? drm_vblank_on() i don't know how to treat, but that one calls drm_reset_vblank_timestamp() which should be less problematic if called redundantly. Now the patch i want to try next to fix the drm_vblank_pre/post_modeset regression in Linux 4.4/4.5 is to add a ... if ((diff > 1) && vblank->inmodeset) diff = 1; ... to the bottom of drm_update_vblank_count(). That should hopefully restore the pre/post_modeset behavior as close to the original behavior as possible. As a side effect it would also prevent the counter jump caused by redundant calls to drm_vblank_off(). -mario >> >> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c >> index 607f493..d739d93 100644 >> --- a/drivers/gpu/drm/drm_irq.c >> +++ b/drivers/gpu/drm/drm_irq.c >> @@ -1313,7 +1313,10 @@ void drm_vblank_off(struct drm_device *dev, unsigned >> int pipe) >> spin_lock_irqsave(&dev->event_lock, irqflags); >> >> spin_lock(&dev->vbl_lock); >> - vblank_disable_and_save(dev, pipe); >> + DRM_DEBUG_VBL("crtc %d, vblank enabled %d\n", pipe, >> vblank->enabled); >> + >> + if (vblank->enabled) >> + vblank_disable_and_save(dev, pipe); >> wake_up(&vblank->queue); >> >> /* >> @@ -1415,6 +1418,8 @@ void drm_vblank_on(struct drm_device *dev, unsigned >> int pipe) >> return; >> >> spin_lock_irqsave(&dev->vbl_lock, irqflags); >> + DRM_DEBUG_VBL("crtc %d, vblank enabled %d\n", pipe, >> vblank->enabled); >> + >> /* Drop our private "prevent drm_vblank_get" refcount */ >> if (vblank->inmodeset) { >> atomic_dec(&vblank->refcount); >> >> >> >> Another, maybe better, approach might be to no-op redundant calls to >> drm_vblank_off() iff vblank->inmodeset and no-op redundant calls to >> drm_vblank_on() iff !vblank->inmodeset. >> >> -mario >> >> >>> >>>> >>>> -mario >>>> >>>> >>>>>> >>>>>> Otherwise kms drivers would have to be careful to never call >>>>>> drm_vblank_off multiple times before calling drm_vblank_on, but the help >>>>>> text to drm_vblank_on() claims that unbalanced calls to these functions >>>>>> are perfectly fine. >>>>>> >>>>>> -mario >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>> >