Re: [Intel-gfx] [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.

2012-12-12 Thread Mario Kleiner

On 11.12.12 00:00, Jesse Barnes wrote:

On Mon, 10 Dec 2012 10:48:29 -0800
Jesse Barnes jbar...@virtuousgeek.org wrote:


On 15.10.12 16:52, Chris Wilson wrote:
   On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner
mario.klei...@tuebingen.mpg.de wrote:
   Hi Chris,
  
   can you please check  merge at least the first two patches of the
   series into the intel ddx?

Thanks for the quick reply.

  
   The first is along the right path, but I think you want to base the
   split on divisor==0.

I don't think so, unless i misunderstand what you mean? The optimization
of I830DRI2ScheduleFlip()'ing immediately should only happen in the case
where current_msc is = target_msc, ie., if the client really requests
swap at next possible vblank, regardless what divisor is, once we've
entered the whole ...

if (divisor == 0 || current_msc  *target_msc) {

... block. Checking for divisor == 0 would be a nice little cleanup if
only either (divisor == 0) or (current_msc  *target_msc) could be true.
But it can happen that both (divisor == 0) and (current_msc 
*target_msc) and then a check for (divisor == 0) wouldn't tell you if
target_msc is in the future and the kernel vblank mechanism should
schedule swap in the future, or if it is time for an immediate flip.

Also i tested with various distances between successive swap and with
divisor 0 vs. non-zero, so at least it works as advertised with the
current patch.

So that patch should be ok.


Yeah I don't understand the flip schedule at the top there either; if
target_msc is out in the future, why would we schedule a page flip
immediately just because divisor == 0?

Maybe it should look like this instead?

if (divisor == 0 || current_msc  *target_msc) {
if (divisor == 0  current_msc = *target_msc)
if (flip  I830DRI2ScheduleFlip(intel, draw, swap_info))
return TRUE;


commit 2ab29a1688cd313768d928e87e145570f35b4a70
Author: Jesse Barnes jbar...@virtuousgeek.org
Date:   Mon Dec 10 14:55:32 2012 -0800

 dri2: don't schedule a flip prematurely at ScheduleSwap time

Mario, can you make sure this works for you?



Looks good for my purposes, ie., should fix glXSwapBuffersMscOML(), 
which was my main point of grief, thanks a lot! I'll retest soonish for 
extra paranoia. The divisor == 0 check is not really needed for the 
logic to work, but doesn't hurt and maybe still nice to leave there for 
readability to clarify the condition when the optimization should work.


However, it doesn't fix some cases for normal glXSwapBuffers() with a 
swapinterval  1. That requires always updating *target_msc properly, 
and the early exit if the optimization is taken prevents that. Also the 
swap_info-frame doesn't get updated in this case, which effectively 
disables/skips some correctness test in I830DRI2FlipEventHandler().


If you want to fix those as well you'll basically end up with my 
original patch, which moves the optimization a few lines down in the 
function and adds updating of those two variables, minus lots of 
comments in the code and commit message.


But i'm happy, i mostly need glXSwapBuffersMscOML() and the pageflip 
timestamping to work properly.

-mario
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.

2012-12-12 Thread Mario Kleiner

On 10.12.12 19:48, Jesse Barnes wrote:

On 15.10.12 16:52, Chris Wilson wrote:
   On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner
mario.klei...@tuebingen.mpg.de wrote:
   Hi Chris,
  
   can you please check  merge at least the first two patches of the
   series into the intel ddx?

Thanks for the quick reply.

  
   The first is along the right path, but I think you want to base the
   split on divisor==0.

I don't think so, unless i misunderstand what you mean? The optimization
of I830DRI2ScheduleFlip()'ing immediately should only happen in the case
where current_msc is = target_msc, ie., if the client really requests
swap at next possible vblank, regardless what divisor is, once we've
entered the whole ...

if (divisor == 0 || current_msc  *target_msc) {

... block. Checking for divisor == 0 would be a nice little cleanup if
only either (divisor == 0) or (current_msc  *target_msc) could be true.
But it can happen that both (divisor == 0) and (current_msc 
*target_msc) and then a check for (divisor == 0) wouldn't tell you if
target_msc is in the future and the kernel vblank mechanism should
schedule swap in the future, or if it is time for an immediate flip.

Also i tested with various distances between successive swap and with
divisor 0 vs. non-zero, so at least it works as advertised with the
current patch.

So that patch should be ok.


Yeah I don't understand the flip schedule at the top there either; if
target_msc is out in the future, why would we schedule a page flip
immediately just because divisor == 0?

Maybe it should look like this instead?

if (divisor == 0 || current_msc  *target_msc) {
if (divisor == 0  current_msc = *target_msc)
if (flip  I830DRI2ScheduleFlip(intel, draw, swap_info))
return TRUE;

(could clean that up a little but it captures the optimization I think
and avoids an extra vblank and potential frame delay.)


   The second is broken in the DRI2 layer, as the
   SwapLimit mechanism exposes the multi-client race to a single client
   (i.e. there is no serialisation between the InvalidateEvent and the
   GetBuffers, and the InvalidateEvent is always broadcast too early.)

Can you explain in more detail? I know there were many problems with the
invalidate events, but thought that's ok now? Multi-client race case
happens when a OpenGL app and a compositor is at work? So basically, the
whole DRI2 setup is fundamentally broken for any swaplimit other than 1
and we have to wait for a future DRI3 to fix this properly?

I didn't see any weird behaviour during testing under both compiz+unity
or without compositor with fullscreen OpenGL + pageflipping. Is this
race common or was i just somehow lucky?


I'm not sure I understand Chris's comment either, however in the
redirected case the compositor is ultimately responsible for when
things show up on the screen, so timestamping is meaningless in that
case.  However for fullscreen, unredirected windows (sounds like that's
what you tested), we should be able to properly queue and stamp things
as they complete AIUI.



Yes. I need it to work for fullscreen, unredirected windows. This is the 
only case where timestamping works reliably atm. My patch only applies 
to page-flipped swaps. In this cases (no compositor, fullscreen or 
unredirected with compositor) i didn't see problems on my system. But if 
it is a race it could be that my app is just not triggering it in 
typical use.


It would be good to know when and when not exactly these races with 
invalidate events can happen?



   The
   third is just plain wrong, as pageflip may be a blocking ioctl (and will
   impose client blocking) so anybody who wants to override SwapBuffersWait
   will also want to prevent the inherent wait due to the flip. In any
   event that option is to only paper over the loose DRI2 specification and
   the lack of client control...

I don't think so: If you want to run non-vsynced/tearing with max.
swaprate for benchmarks etc., then you must set SwapBuffersWait off
and set the drawables swapinterval to zero via some .drmrc setting or
the app calling glXSwapInterval etc. But once you've set the
swapinterval to zero, the x-server *always* does an immediate copy blit
and bypasses the whole vblank events / kms-pageflip mechanism. See


I think this comes down to what people expect of the SwapbuffersWait
option.  Our man page indicates it's simply an anti-tearing feature, so
if you disable it I think users would expect swaps to occur as soon as
possible or when scheduled, regardless of potential tearing.

If we leave flipping enabled, I think we'll often hit cases where a
swap will be delayed until the next vblank (when the flip is latched
into the display engine) rather than occurring immediately or when the
vblank event for it fires, with likely tearing.  Or do you think the
current code handles this case well enough that we can guarantee we
won't be delaying things by a frame sometimes?



My assumption was 

Re: [Intel-gfx] [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.

2012-12-10 Thread Jesse Barnes
 On 15.10.12 16:52, Chris Wilson wrote:
   On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner 
 mario.klei...@tuebingen.mpg.de wrote:
   Hi Chris,
  
   can you please check  merge at least the first two patches of the
   series into the intel ddx?
 
 Thanks for the quick reply.
 
  
   The first is along the right path, but I think you want to base the
   split on divisor==0.
 
 I don't think so, unless i misunderstand what you mean? The optimization 
 of I830DRI2ScheduleFlip()'ing immediately should only happen in the case 
 where current_msc is = target_msc, ie., if the client really requests 
 swap at next possible vblank, regardless what divisor is, once we've 
 entered the whole ...
 
 if (divisor == 0 || current_msc  *target_msc) {
 
 ... block. Checking for divisor == 0 would be a nice little cleanup if 
 only either (divisor == 0) or (current_msc  *target_msc) could be true. 
 But it can happen that both (divisor == 0) and (current_msc  
 *target_msc) and then a check for (divisor == 0) wouldn't tell you if 
 target_msc is in the future and the kernel vblank mechanism should 
 schedule swap in the future, or if it is time for an immediate flip.
 
 Also i tested with various distances between successive swap and with 
 divisor 0 vs. non-zero, so at least it works as advertised with the 
 current patch.
 
 So that patch should be ok.

Yeah I don't understand the flip schedule at the top there either; if
target_msc is out in the future, why would we schedule a page flip
immediately just because divisor == 0?

Maybe it should look like this instead?

if (divisor == 0 || current_msc  *target_msc) {
if (divisor == 0  current_msc = *target_msc)
if (flip  I830DRI2ScheduleFlip(intel, draw, swap_info))
return TRUE;

(could clean that up a little but it captures the optimization I think
and avoids an extra vblank and potential frame delay.)

   The second is broken in the DRI2 layer, as the
   SwapLimit mechanism exposes the multi-client race to a single client
   (i.e. there is no serialisation between the InvalidateEvent and the
   GetBuffers, and the InvalidateEvent is always broadcast too early.)
 
 Can you explain in more detail? I know there were many problems with the 
 invalidate events, but thought that's ok now? Multi-client race case 
 happens when a OpenGL app and a compositor is at work? So basically, the 
 whole DRI2 setup is fundamentally broken for any swaplimit other than 1 
 and we have to wait for a future DRI3 to fix this properly?
 
 I didn't see any weird behaviour during testing under both compiz+unity 
 or without compositor with fullscreen OpenGL + pageflipping. Is this 
 race common or was i just somehow lucky?

I'm not sure I understand Chris's comment either, however in the
redirected case the compositor is ultimately responsible for when
things show up on the screen, so timestamping is meaningless in that
case.  However for fullscreen, unredirected windows (sounds like that's
what you tested), we should be able to properly queue and stamp things
as they complete AIUI.

   The
   third is just plain wrong, as pageflip may be a blocking ioctl (and will
   impose client blocking) so anybody who wants to override SwapBuffersWait
   will also want to prevent the inherent wait due to the flip. In any
   event that option is to only paper over the loose DRI2 specification and
   the lack of client control...
 
 I don't think so: If you want to run non-vsynced/tearing with max. 
 swaprate for benchmarks etc., then you must set SwapBuffersWait off 
 and set the drawables swapinterval to zero via some .drmrc setting or 
 the app calling glXSwapInterval etc. But once you've set the 
 swapinterval to zero, the x-server *always* does an immediate copy blit 
 and bypasses the whole vblank events / kms-pageflip mechanism. See

I think this comes down to what people expect of the SwapbuffersWait
option.  Our man page indicates it's simply an anti-tearing feature, so
if you disable it I think users would expect swaps to occur as soon as
possible or when scheduled, regardless of potential tearing.

If we leave flipping enabled, I think we'll often hit cases where a
swap will be delayed until the next vblank (when the flip is latched
into the display engine) rather than occurring immediately or when the
vblank event for it fires, with likely tearing.  Or do you think the
current code handles this case well enough that we can guarantee we
won't be delaying things by a frame sometimes?

If so, you can argue that this isn't a performance advantage, but then
it's a rather lame option anyway...  So if/until we have async page
flips, I think the current behavior is probably fine.

Mario, do we have some small tests we can add to piglit for the
timestamping cases?  Looking at simple client code always helps me
understand the complex DDX and server interaction better...

Also, I'm assuming all this works ok with triple buffering disabled?
What about with SNA?

Re: [Intel-gfx] [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.

2012-12-10 Thread Jesse Barnes
On Mon, 10 Dec 2012 10:48:29 -0800
Jesse Barnes jbar...@virtuousgeek.org wrote:

  On 15.10.12 16:52, Chris Wilson wrote:
On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner 
  mario.klei...@tuebingen.mpg.de wrote:
Hi Chris,
   
can you please check  merge at least the first two patches of the
series into the intel ddx?
  
  Thanks for the quick reply.
  
   
The first is along the right path, but I think you want to base the
split on divisor==0.
  
  I don't think so, unless i misunderstand what you mean? The optimization 
  of I830DRI2ScheduleFlip()'ing immediately should only happen in the case 
  where current_msc is = target_msc, ie., if the client really requests 
  swap at next possible vblank, regardless what divisor is, once we've 
  entered the whole ...
  
  if (divisor == 0 || current_msc  *target_msc) {
  
  ... block. Checking for divisor == 0 would be a nice little cleanup if 
  only either (divisor == 0) or (current_msc  *target_msc) could be true. 
  But it can happen that both (divisor == 0) and (current_msc  
  *target_msc) and then a check for (divisor == 0) wouldn't tell you if 
  target_msc is in the future and the kernel vblank mechanism should 
  schedule swap in the future, or if it is time for an immediate flip.
  
  Also i tested with various distances between successive swap and with 
  divisor 0 vs. non-zero, so at least it works as advertised with the 
  current patch.
  
  So that patch should be ok.
 
 Yeah I don't understand the flip schedule at the top there either; if
 target_msc is out in the future, why would we schedule a page flip
 immediately just because divisor == 0?
 
 Maybe it should look like this instead?
 
 if (divisor == 0 || current_msc  *target_msc) {
   if (divisor == 0  current_msc = *target_msc)
   if (flip  I830DRI2ScheduleFlip(intel, draw, swap_info))
   return TRUE;

commit 2ab29a1688cd313768d928e87e145570f35b4a70
Author: Jesse Barnes jbar...@virtuousgeek.org
Date:   Mon Dec 10 14:55:32 2012 -0800

dri2: don't schedule a flip prematurely at ScheduleSwap time

Mario, can you make sure this works for you?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.

2012-10-07 Thread Mario Kleiner
Commit 7538be3315b8683b05e8f6b22023baadcc0bc4da together
with DRI2/OpenGL triple-buffering support added an
optimization which breaks kms pageflip swap scheduling for most
meaningful use cases and thereby violates the OML_sync_control,
GLX_SGI_swap_control, GLX_swap_control and MESA_swap_control
specs, except for the trivial case of a glXSwapBuffers call
with swap_interval == 1.

This small modification allows to keep the optimization in
the intended way, while removing the hilarious side effects
for timing sensitive applications.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/intel_dri.c |   22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/src/intel_dri.c b/src/intel_dri.c
index 126acb2..8786bf4 100644
--- a/src/intel_dri.c
+++ b/src/intel_dri.c
@@ -1275,9 +1275,6 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, 
DRI2BufferPtr front,
 * the swap.
 */
if (divisor == 0 || current_msc  *target_msc) {
-   if (flip  I830DRI2ScheduleFlip(intel, draw, swap_info))
-   return TRUE;
-
vbl.request.type =
DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT | 
pipe_select(pipe);
 
@@ -1295,6 +1292,25 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, 
DRI2BufferPtr front,
if (current_msc = *target_msc)
*target_msc = current_msc;
 
+   /* If pageflip is requested for the next possible vblank,
+* then avoid the roundtrip to the kernels vblank event
+* delivery and schedule the pageflip for next vblank
+* directly. This can't be done for any other case, as
+* it would violate the OML_sync_control spec and
+* SGI/MESA/GLX_swap_control spec!
+*/
+   if (flip  (current_msc == *target_msc)  
+   I830DRI2ScheduleFlip(intel, draw, swap_info)) {
+   /* Create approximate target vblank of flip-completion,
+* so basic consistency checks and swap_interval still
+* work correctly.
+*/
+   *target_msc += flip;
+   swap_info-frame = *target_msc;
+
+   return TRUE;
+   }
+
vbl.request.sequence = *target_msc;
vbl.request.signal = (unsigned long)swap_info;
ret = drmWaitVBlank(intel-drmSubFD, vbl);
-- 
1.7.10.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx