Re: [Intel-gfx] Incorrect plane programming sequence results into a corrupted display/hard hung system

2018-04-17 Thread Ville Syrjälä
On Thu, Apr 12, 2018 at 07:28:48PM +0000, Runyan, Arthur J wrote:
> This seems like a typical atomic modeset requirement.
> 
> IPC should not impact register programming.
> Vblank evasion only works if you have a guarantee on worst case 
> interrupts/delays.  I think locks are part of the guarantee.
> Double buffer control should guarantee safe alignment of programming across 
> planes on the same pipe.

This is the part that troubles me. This was supposedly tested, but
apparently it didn't help. Are there some relevant registers that
don't respect the double buffer control?

> Multiple pipes will still require a wait for vblank.

I don't think the problems should be related to multiple pipes. We
shouldn't be overlapping any allocations between planes on different
pipes while they're running, and the pipe enable/disable code should
be sequencing things correctly (with appropriate vblank waits) to
avoid overlaps.
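
As a rough illustration of the "no overlapping allocations" invariant
described above, here is a minimal sketch in C. The struct and function
names (ddb_range, ddb_ranges_overlap, ddb_any_overlap) are hypothetical and
are not taken from the driver; allocations are assumed to be half-open block
ranges, and a disabled plane is assumed to be represented by an empty range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one plane's buffer allocation as a half-open
 * block range [start, end). Not the driver's actual data structure. */
struct ddb_range {
	unsigned int start;
	unsigned int end;
};

/* Two non-empty half-open ranges overlap when each starts before the
 * other ends. An empty range (start == end) overlaps nothing. */
static bool ddb_ranges_overlap(struct ddb_range a, struct ddb_range b)
{
	assert(a.start <= a.end && b.start <= b.end);

	if (a.start == a.end || b.start == b.end)
		return false;

	return a.start < b.end && b.start < a.end;
}

/* Returns true if any pair among the n allocations overlaps. */
static bool ddb_any_overlap(const struct ddb_range *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (ddb_ranges_overlap(r[i], r[j]))
				return true;

	return false;
}
```

A check along these lines would be run against the set of allocations that
can be live at the same time, not only against the final target state.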

> 
> From: Vyas, Tarun
> Sent: Thursday, 12 April, 2018 9:57 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Runyan, Arthur J; Shaikh, Azhar; Herbert, Marc; Ciobanu, Nathan D; Lankhorst, Maarten
> Subject: Incorrect plane programming sequence results into a corrupted 
> display/hard hung system
> 
> On KBL platforms, with HW overlay and/or PSR2, a hard hang with corrupted 
> display is observed while running tests that frequently disable/re-enable 
> primary/overlay planes. Details recorded in this FDO bug: 
> https://bugs.freedesktop.org/show_bug.cgi?id=104975.
> 
> The issue has been root caused as a race where only partial register updates 
> get latched on the next vblank, specifically, the updates that give the 
> buffer allocation of the current plane before disabling it (again, details 
> captured in the FDO bug above). There have been several optimizations to work 
> around this bug:
> 
> 1.   Enable Isochronous priority control (IPC)
> 
> 2.   Increase the vblank evasion time to 250 usec (we have tried 500 usec, 
> but that doesn't help).
> 
> 3.   Disable DOUBLE_BUFFER_CTL while the updates are done, inside 
> intel_pipe_update_start and intel_pipe_update_end (doesn't help).
> 
> 4.   Grab all the required locks before starting the pipe_update.
> 
> Per Ville, none of the above optimizations guarantee a *full* update before 
> the vblank. As a result, to fix this issue the right way, the plane 
> programming sequence needs to be altered in the driver as mentioned below:
> 
> "Buffer allocation overlap among enabled planes will cause a full frame 
> underrun, and that becomes a hard hang if pkgC or SAGV are enabled.
> You need to make sure the plane is disabled before reallocating the buffer it 
> uses.  For a single pipe it is sufficient to initiate the disabling of the 
> plane before the reallocation.  For multiple pipes it can be more complex.
> In this case you should be doing something like this to ensure plane 2A turns 
> off before plane 1A steals the buffer:
> 
> 1.   PLANE_CTL_2A -> disabled
> 
> 2.   PLANE_SURF_2A: touch to arm double buffer update
> 
> 3.   PLANE_BUF_CFG_1A -> (0-860)
> 
> 4.   PLANE_SURF_1A: touch to arm double buffer update
> 
> If the planes are on different pipes there needs to be a wait for vblank 
> between steps 2 and 3 to ensure the plane 2A disable has completed."
> 
> 
> Please comment as required.



-- 
Ville Syrjälä
Intel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Incorrect plane programming sequence results into a corrupted display/hard hung system

2018-04-12 Thread Runyan, Arthur J
This seems like a typical atomic modeset requirement.

IPC should not impact register programming.
Vblank evasion only works if you have a guarantee on worst case 
interrupts/delays.  I think locks are part of the guarantee.
Double buffer control should guarantee safe alignment of programming across 
planes on the same pipe.  Multiple pipes will still require a wait for vblank.

From: Vyas, Tarun
Sent: Thursday, 12 April, 2018 9:57 AM
To: intel-gfx@lists.freedesktop.org
Cc: Runyan, Arthur J; Shaikh, Azhar; Herbert, Marc; Ciobanu, Nathan D; Lankhorst, Maarten
Subject: Incorrect plane programming sequence results into a corrupted 
display/hard hung system

On KBL platforms, with HW overlay and/or PSR2, a hard hang with corrupted 
display is observed while running tests that frequently disable/re-enable 
primary/overlay planes. Details recorded in this FDO bug: 
https://bugs.freedesktop.org/show_bug.cgi?id=104975.

The issue has been root caused as a race where only partial register updates 
get latched on the next vblank, specifically, the updates that give the buffer 
allocation of the current plane before disabling it (again, details captured in 
the FDO bug above). There have been several optimizations to work around this 
bug:

1.   Enable Isochronous priority control (IPC)

2.   Increase the vblank evasion time to 250 usec (we have tried 500 usec, 
but that doesn't help).

3.   Disable DOUBLE_BUFFER_CTL while the updates are done, inside 
intel_pipe_update_start and intel_pipe_update_end (doesn't help).

4.   Grab all the required locks before starting the pipe_update.

Per Ville, none of the above optimizations guarantee a *full* update before the 
vblank. As a result, to fix this issue the right way, the plane programming 
sequence needs to be altered in the driver as mentioned below:

"Buffer allocation overlap among enabled planes will cause a full frame 
underrun, and that becomes a hard hang if pkgC or SAGV are enabled.
You need to make sure the plane is disabled before reallocating the buffer it 
uses.  For a single pipe it is sufficient to initiate the disabling of the 
plane before the reallocation.  For multiple pipes it can be more complex.
In this case you should be doing something like this to ensure plane 2A turns 
off before plane 1A steals the buffer:

1.   PLANE_CTL_2A -> disabled

2.   PLANE_SURF_2A: touch to arm double buffer update

3.   PLANE_BUF_CFG_1A -> (0-860)

4.   PLANE_SURF_1A: touch to arm double buffer update

If the planes are on different pipes there needs to be a wait for vblank 
between steps 2 and 3 to ensure the plane 2A disable has completed."


Please comment as required.


[Intel-gfx] Incorrect plane programming sequence results into a corrupted display/hard hung system

2018-04-12 Thread Vyas, Tarun
On KBL platforms, with HW overlay and/or PSR2, a hard hang with corrupted 
display is observed while running tests that frequently disable/re-enable 
primary/overlay planes. Details recorded in this FDO bug: 
https://bugs.freedesktop.org/show_bug.cgi?id=104975.

The issue has been root caused as a race where only partial register updates 
get latched on the next vblank, specifically, the updates that give the buffer 
allocation of the current plane before disabling it (again, details captured in 
the FDO bug above). There have been several optimizations to work around this 
bug:

1.   Enable Isochronous priority control (IPC)

2.   Increase the vblank evasion time to 250 usec (we have tried 500 usec, 
but that doesn't help).

3.   Disable DOUBLE_BUFFER_CTL while the updates are done, inside 
intel_pipe_update_start and intel_pipe_update_end (doesn't help).

4.   Grab all the required locks before starting the pipe_update.

Per Ville, none of the above optimizations guarantee a *full* update before the 
vblank. As a result, to fix this issue the right way, the plane programming 
sequence needs to be altered in the driver as mentioned below:

"Buffer allocation overlap among enabled planes will cause a full frame 
underrun, and that becomes a hard hang if pkgC or SAGV are enabled.
You need to make sure the plane is disabled before reallocating the buffer it 
uses.  For a single pipe it is sufficient to initiate the disabling of the 
plane before the reallocation.  For multiple pipes it can be more complex.
In this case you should be doing something like this to ensure plane 2A turns 
off before plane 1A steals the buffer:

1.   PLANE_CTL_2A -> disabled

2.   PLANE_SURF_2A: touch to arm double buffer update

3.   PLANE_BUF_CFG_1A -> (0-860)

4.   PLANE_SURF_1A: touch to arm double buffer update

If the planes are on different pipes there needs to be a wait for vblank 
between steps 2 and 3 to ensure the plane 2A disable has completed."


Please comment as required.
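
For discussion, the quoted four-step sequence can be sketched as a small
software model in C. This is only an illustration under stated assumptions,
not driver code: the types and helpers (plane_state, plane, latch_pipe,
live_overlap, steal_buffer) are hypothetical, each plane is assumed to have a
pending and a live copy of its state, touching PLANE_SURF is modelled as
setting an "armed" flag, and a vblank is assumed to latch every armed plane
on that pipe atomically. The partial-latch race described above is therefore
not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-plane state: enable bit plus a half-open buffer range. */
struct plane_state {
	bool enabled;
	unsigned int buf_start, buf_end;
};

struct plane {
	int pipe;                   /* pipe this plane belongs to */
	struct plane_state pending; /* values written by software */
	struct plane_state live;    /* values the hardware is using */
	bool armed;                 /* set by the PLANE_SURF touch */
};

/* Model a vblank on one pipe: armed planes on that pipe latch pending -> live. */
static void latch_pipe(struct plane *planes, size_t n, int pipe)
{
	for (size_t i = 0; i < n; i++) {
		if (planes[i].pipe == pipe && planes[i].armed) {
			planes[i].live = planes[i].pending;
			planes[i].armed = false;
		}
	}
}

/* True if both planes are live-enabled and their live ranges intersect. */
static bool live_overlap(const struct plane *a, const struct plane *b)
{
	return a->live.enabled && b->live.enabled &&
	       a->live.buf_start < b->live.buf_end &&
	       b->live.buf_start < a->live.buf_end;
}

/* Steps 1-4: disable the victim plane, then give its buffer to the taker.
 * wait_for_vblank models the wait between steps 2 and 3. */
static void steal_buffer(struct plane *planes, size_t n,
			 size_t victim, size_t taker,
			 unsigned int start, unsigned int end,
			 bool wait_for_vblank)
{
	assert(victim < n && taker < n && start <= end);

	planes[victim].pending.enabled = false;  /* step 1: PLANE_CTL -> disabled */
	planes[victim].armed = true;             /* step 2: touch PLANE_SURF */

	if (wait_for_vblank)
		latch_pipe(planes, n, planes[victim].pipe);

	planes[taker].pending.buf_start = start; /* step 3: PLANE_BUF_CFG */
	planes[taker].pending.buf_end = end;
	planes[taker].armed = true;              /* step 4: touch PLANE_SURF */
}
```

In this model, with both planes on the same pipe a single latch_pipe() call
applies the disable and the new allocation together, so no live overlap
appears. With the planes on different pipes and no wait, a vblank on the
taker's pipe can latch the new allocation while the victim is still live,
which is the overlap the wait between steps 2 and 3 is meant to prevent.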