Is there a PATCH 2/2 which I can't find, or is the subject wrong?

On 01/21/2016 09:16 AM, Mario Kleiner wrote:
> The hardware vblank counter of AMD GPUs resets to zero during a
> modeset. The new implementation of drm_update_vblank_count() from
> commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how
> many vblanks were missed", introduced in Linux 4.4, treats that
> as a counter wraparound and causes the software vblank counter
> to jump forward by a large distance of up to 2^24 counts. This
> interacts badly with 32-bit wraparound handling in
> drm_handle_vblank_events(), causing that function to no longer
> deliver pending vblank events to clients.
>
> This leads to client hangs, especially if clients perform OpenGL
> or DRI3/Present animations while a modeset happens and triggers
> the hw vblank counter reset. One prominent example is a hang of
> KDE Plasma 5's startup progress splash screen during login, making
> the KDE session unusable.
>
> Another small potential race exists when executing a modeset while
> vblank interrupts are enabled or just get enabled: The modeset updates
> radeon_crtc->lb_vblank_lead_lines during radeon_display_bandwidth_update,
> so if vblank interrupt handling or enable would try to access that variable
> multiple times at the wrong moment as part of drm_update_vblank_counter,
> while the scanout happens to be within lb_vblank_lead_lines before the
> start of vblank, it could cause inconsistent vblank counting and again
> trigger a jump of the software vblank counter, causing similar client
> hangs. The easiest way to avoid this small race is to not allow
> vblank enable or vblank IRQs during modeset.
>
> This patch replaces calls to drm_vblank_pre/post_modeset in the
> driver's dpms code with calls to drm_vblank_off/on, as recommended
> for drivers with hw counters that reset to zero during modeset.
> Those calls disable vblank interrupts during the modeset sequence
> and reinitialize vblank counts and timestamps after the modeset
> properly, taking hw counter reset into account, thereby fixing
> the problem of forward jumping counters.
>
> During a modeset, calls to drm_vblank_get() will no-op/intentionally
> fail, so no vblank events or pageflips can be queued during modesetting.
>
> Radeon's static and dynpm power management uses drm_vblank_get to enable
> vblank IRQs to synchronize reclocking to start of vblank. If a modeset
> would happen in parallel with such a power management action, drm_vblank_get
> would be suppressed, sync to vblank wouldn't work and a visual glitch could
> happen. However, that glitch would hopefully be hidden by the blanking of
> the crtc during modeset. A small fix to power management makes sure to
> check for this and prevent unbalanced vblank reference counts due to
> mismatched drm_vblank_get/put.
>
> Reported-by: Vlastimil Babka <vbabka at suse.cz>
> Signed-off-by: Mario Kleiner <mario.kleiner.de at gmail.com>

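
For anyone following along, the counter jump described in the quoted text can be sketched with plain arithmetic. This is only a simplified model, not the kernel code: the function names and the 24-bit mask are assumptions taken from the "2^24" figure above, and the "is the event due" check (a signed-distance style comparison against 2^23) is my assumption about how the 32-bit wraparound handling in drm_handle_vblank_events() behaves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed width of the hw vblank counter: 24 bits ("2^24" in the text). */
#define HW_MAX_VBLANK_COUNT 0xffffffu

/*
 * Hypothetical helper: number of vblanks elapsed between two hw counter
 * reads, treating cur < last as a counter wraparound.
 */
static uint32_t vblank_diff_assuming_wrap(uint32_t cur, uint32_t last)
{
    return (cur - last) & HW_MAX_VBLANK_COUNT;
}

/*
 * Hypothetical helper: an event queued for sequence number `target` is
 * considered due once the software counter `seq` has reached it, using
 * unsigned 32-bit arithmetic so that a wrapped counter still compares
 * correctly. Distances above 2^23 are read as "target is in the future".
 */
static bool event_is_due(uint32_t seq, uint32_t target)
{
    return (seq - target) <= (1u << 23);
}
```

With last = 1000 and a hw counter that reset to 0, vblank_diff_assuming_wrap(0, 1000) gives 16776216, i.e. 2^24 - 1000. Adding that to a software count of 1000 puts it at 16777216, and event_is_due(16777216, 1001) is false under this model, so an event waiting for sequence 1001 would not be delivered.
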
FWIW, this seems to work for the kde5 login issue, thanks. Let me know if you also need any specific testing/debug output, or want me to test another approach if the "drm_vblank_on/off propaganda" is not acceptable :)