Re: [PATCH v2] wayland: Add support for eglSwapInterval

2013-09-16 Thread Pekka Paalanen
On Fri, 13 Sep 2013 16:10:25 +0100
Neil Roberts n...@linux.intel.com wrote:

 Thanks for the explanation. I wasn't considering the fullscreen case
 where the compositor is directly scanning out from the client's buffers.
 I think for the non-fullscreen case the compositor would only hold on to
 a single buffer, right? In that case the sync request is enough.

It depends. If a client commits a buffer while the compositor is already
waiting for the previous repaint and pageflip to complete, the
compositor cannot release the old buffer, because the GPU might still
be compositing from it, even in the GL case (texturing). (I don't know
if DRM internally has any ordering guarantees here. If it has, it
should be explicitly written out in comments in the relevant Mesa code.
Developers of other EGL stacks look into Mesa for examples.)

I am also not sure what the current Weston implementation does in this
situation.

Also, there are platforms (rpi) where all non-shm buffers are scanout
buffers. Would be nice to get some guidelines here, where and what
implementation are usually expected to wait on. And on rpi we really do
not have too much memory available.

 I think in an ideal world you would still only need three buffers in the
 fullscreen case. Ie, you have one buffer that is currently used for
 scanout, one buffer that is pending and one buffer that you are
 rendering to. When you are finished rendering to that last buffer
 ideally the compositor would be able to cancel that pending buffer and
 instead queue the new buffer. As far as I understand, there is currently
 no way to cancel that pending buffer in KMS so that is why we need a
 fourth buffer.

Right. (Ignoring the additional case of synchronized sub-surface.)

 I'm not sure whether using four or more buffers is a good idea because
 that seems like quite a lot considering these are fullscreen buffers.
 Maybe in that case we should just give up and block. I guess it depends
 on whether we expect to be able to change KMS to allow cancelling the
 pending buffer, but it sounds like that could be quite hard.

It depends on how far you want to go with the swapinterval=0 case. If
an application is indeed going to max out the fps at any cost, then
I think we should do that. And if an application is using
swapinterval=0 only as means to avoid blocking in EGL while throttling
itself to frame callbacks or something, it probably won't ever allocate
so many buffers.

On a PC platform, I'd easily tend to say let's have 10 buffers slots,
allocate and free them as needed, and only block when we run out.
Anyone requesting swapiterval=0 would get exactly what they want, since
rendering a lot more frames than a compositor can show in one output
frame time is silly to begin with. It is quite convoluted to figure out
the minimum required number of buffer slots to avoid blocking on
compositor refresh, and it can differ between compositors.

You might also be able to free extra buffers, that is shrink and not
only grow the buffer pool. Does that happen in Mesa btw?

But, that's just hand-waving. T think I'd prefer testing with different
numbers in different scanout situations.

 As for flushing the delayed buffer release event, maybe we could add a
 way to disable the event queuing if Weston knows that the client doesn't
 have a frame callback installed? I am about to attach two patches that
 would implement that.

I can't say if that is a good heuristic or not, to assume that surfaces
where a client is not interested in frame callbacks need buffer
releases ASAP. Apart from an explicit request, I have no better
suggestions.


Thanks,
pq

 Pekka Paalanen ppaala...@gmail.com writes:
 
  No, you generally don't get the release event as an immediate reply to a
  commit request.
 
  1. When you commit a new buffer, compositor schedules a repaint, and
  continues processing your requests. If you simply asked for one
  wl_display.sync right after the commit, it is acked now.
 
  2. Some time later, the compositor actually enters repaint. It submits
  a command stream to the GPU, and schedules a page flip onto the screen.
  At this point the compositor sends the frame callbacks, if you
  requested one. Then the compositor continues processing requests in the
  mean time.
 
  3. Some more time later, the GPU is done, and the page flip is done.
  The compositor gets an async notification that the page flip is
  complete. Only at *this point* is the compositor able to send a release
  for the old buffers.
 
  So you see, step 3 comes much later than your sync for the commit. The
  compositor cannot signal the release of any buffers before the page
  flip is done, because the GPU or scanout may be using the buffers
  still, and you risk re-using a buffer while it being read.
 
  The step 3 is pratically guaranteed to come, the wait for it is not
  indefinite, but it is not immediate either. It is not synchronized to
  the Wayland protocol, like many other replies are.
 
  The only time 

Re: [PATCH v2] wayland: Add support for eglSwapInterval

2013-09-13 Thread Pekka Paalanen
On Thu, 12 Sep 2013 16:58:56 +0100
Neil Roberts n...@linux.intel.com wrote:

 I like Kristian's proposal to throttle the swap buffers to sync
 callbacks. It has the added benefit that we can stop the client from
 unnecessarily using 3 buffers by waiting for the sync event in
 get_back_bo. The previous patch would cause the client to use three
 buffers because it would only block when all three buffer slots are
 full. It also means if the compositor is doing something weird and
 keeping all three buffers then the render will fail instead of
 blocking forever, which I think makes more sense.
 
 Daniel, is there any situation where the compositor would keep hold of
 more than one before? If not, surely whenever you attach and commit a
 new buffer you will get the release event for the old buffer
 immediately? If that's the case then I don't think there'd be any need
 to periodically send sync requests to the compositor.

No, you generally don't get the release event as an immediate reply to a
commit request.

1. When you commit a new buffer, compositor schedules a repaint, and
continues processing your requests. If you simply asked for one
wl_display.sync right after the commit, it is acked now.

2. Some time later, the compositor actually enters repaint. It submits
a command stream to the GPU, and schedules a page flip onto the screen.
At this point the compositor sends the frame callbacks, if you
requested one. Then the compositor continues processing requests in the
mean time.

3. Some more time later, the GPU is done, and the page flip is done.
The compositor gets an async notification that the page flip is
complete. Only at *this point* is the compositor able to send a release
for the old buffers.

So you see, step 3 comes much later than your sync for the commit. The
compositor cannot signal the release of any buffers before the page
flip is done, because the GPU or scanout may be using the buffers
still, and you risk re-using a buffer while it being read.

The step 3 is pratically guaranteed to come, the wait for it is not
indefinite, but it is not immediate either. It is not synchronized to
the Wayland protocol, like many other replies are.

The only time when you could get the release for the old buffer as an
immediate reply is, when the old buffer has never been on screen yet.
Provided you also did a wl_display.sync after the commit.

Also don't forget about sub-surfaces, which may hold one additional
buffer indefinitely (until the parent surface is committed).

Unfortunately, I do not have any good suggestions how to solve this,
other than sending the release with 'post', not 'queue'. Could we
perhaps use the hypothetical presentation time stamp event, that would
be fired at step 3, to flush out the release events for a client?

(To complement the frame callback which tells when a client should
start preparing a new frame, there was a plan to add a presentation
timestamp callback to be used for AV-sync etc. telling the time the
frame actually turned into light. I don't know what the status of that
proposal is nowadays.)


Hmm, or could we simply rely on the never been on screen behaviour?
Does that make the wl_display.sync proposition work after all?
I'm a little sceptical, but can't think far enough...

Re-thinking again, I would need to re-check the code whether the never
been on screen actually applies for non-sub-surfaces. In theory it
should.


Thanks,
pq

 It doesn't look like the old patch has landed in master yet, so here
 is a version 2 of the patch which implements Kristian's suggestion.
 
 Regards,
 - Neil
 
 -- 8 --
 
 The Wayland EGL platform now respects the eglSwapInterval value. The
 value is clamped to either 0 or 1 because it is difficult (and
 probably not useful) to sync to more than 1 redraw.
 
 The main change is that if the swap interval is 0 then instead of
 installing a frame callback it will just call the display sync method
 and throttle itself to that. It needs to at least throttle itself to
 that so that it will still be sure to receive buffer release events.
 
 This also moves the vblank configuration defines from platform_x11.c
 to the common egl_dri2.h header so they can be shared by both
 platforms. ---
  src/egl/drivers/dri2/egl_dri2.h |   7 ++
  src/egl/drivers/dri2/platform_wayland.c | 127
 ++--
 src/egl/drivers/dri2/platform_x11.c |   6 -- 3 files changed, 126
 insertions(+), 14 deletions(-)
 
 diff --git a/src/egl/drivers/dri2/egl_dri2.h
 b/src/egl/drivers/dri2/egl_dri2.h index fba5f81..cc657ba 100644
 --- a/src/egl/drivers/dri2/egl_dri2.h
 +++ b/src/egl/drivers/dri2/egl_dri2.h
 @@ -175,6 +175,7 @@ struct dri2_egl_surface
 intdx;
 intdy;
 struct wl_callback*frame_callback;
 +   struct wl_callback*throttle_callback;
 int format;
  #endif
  
 @@ -221,6 +222,12 @@ struct dri2_egl_image
 __DRIimage *dri_image;
  };
  
 +/* From 

Re: [PATCH v2] wayland: Add support for eglSwapInterval

2013-09-13 Thread Neil Roberts
Thanks for the explanation. I wasn't considering the fullscreen case
where the compositor is directly scanning out from the client's buffers.
I think for the non-fullscreen case the compositor would only hold on to
a single buffer, right? In that case the sync request is enough.

I think in an ideal world you would still only need three buffers in the
fullscreen case. Ie, you have one buffer that is currently used for
scanout, one buffer that is pending and one buffer that you are
rendering to. When you are finished rendering to that last buffer
ideally the compositor would be able to cancel that pending buffer and
instead queue the new buffer. As far as I understand, there is currently
no way to cancel that pending buffer in KMS so that is why we need a
fourth buffer.

I'm not sure whether using four or more buffers is a good idea because
that seems like quite a lot considering these are fullscreen buffers.
Maybe in that case we should just give up and block. I guess it depends
on whether we expect to be able to change KMS to allow cancelling the
pending buffer, but it sounds like that could be quite hard.

As for flushing the delayed buffer release event, maybe we could add a
way to disable the event queuing if Weston knows that the client doesn't
have a frame callback installed? I am about to attach two patches that
would implement that.

Regards,
- Neil

Pekka Paalanen ppaala...@gmail.com writes:

 No, you generally don't get the release event as an immediate reply to a
 commit request.

 1. When you commit a new buffer, compositor schedules a repaint, and
 continues processing your requests. If you simply asked for one
 wl_display.sync right after the commit, it is acked now.

 2. Some time later, the compositor actually enters repaint. It submits
 a command stream to the GPU, and schedules a page flip onto the screen.
 At this point the compositor sends the frame callbacks, if you
 requested one. Then the compositor continues processing requests in the
 mean time.

 3. Some more time later, the GPU is done, and the page flip is done.
 The compositor gets an async notification that the page flip is
 complete. Only at *this point* is the compositor able to send a release
 for the old buffers.

 So you see, step 3 comes much later than your sync for the commit. The
 compositor cannot signal the release of any buffers before the page
 flip is done, because the GPU or scanout may be using the buffers
 still, and you risk re-using a buffer while it being read.

 The step 3 is pratically guaranteed to come, the wait for it is not
 indefinite, but it is not immediate either. It is not synchronized to
 the Wayland protocol, like many other replies are.

 The only time when you could get the release for the old buffer as an
 immediate reply is, when the old buffer has never been on screen yet.
 Provided you also did a wl_display.sync after the commit.

 Also don't forget about sub-surfaces, which may hold one additional
 buffer indefinitely (until the parent surface is committed).

 Unfortunately, I do not have any good suggestions how to solve this,
 other than sending the release with 'post', not 'queue'. Could we
 perhaps use the hypothetical presentation time stamp event, that would
 be fired at step 3, to flush out the release events for a client?

 (To complement the frame callback which tells when a client should
 start preparing a new frame, there was a plan to add a presentation
 timestamp callback to be used for AV-sync etc. telling the time the
 frame actually turned into light. I don't know what the status of that
 proposal is nowadays.)


 Hmm, or could we simply rely on the never been on screen behaviour?
 Does that make the wl_display.sync proposition work after all?
 I'm a little sceptical, but can't think far enough...

 Re-thinking again, I would need to re-check the code whether the never
 been on screen actually applies for non-sub-surfaces. In theory it
 should.


 Thanks,
 pq
-
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

___
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/wayland-devel


[PATCH v2] wayland: Add support for eglSwapInterval

2013-09-12 Thread Neil Roberts
I like Kristian's proposal to throttle the swap buffers to sync
callbacks. It has the added benefit that we can stop the client from
unnecessarily using 3 buffers by waiting for the sync event in
get_back_bo. The previous patch would cause the client to use three
buffers because it would only block when all three buffer slots are
full. It also means if the compositor is doing something weird and
keeping all three buffers then the render will fail instead of
blocking forever, which I think makes more sense.

Daniel, is there any situation where the compositor would keep hold of
more than one before? If not, surely whenever you attach and commit a
new buffer you will get the release event for the old buffer
immediately? If that's the case then I don't think there'd be any need
to periodically send sync requests to the compositor.

It doesn't look like the old patch has landed in master yet, so here
is a version 2 of the patch which implements Kristian's suggestion.

Regards,
- Neil

-- 8 --

The Wayland EGL platform now respects the eglSwapInterval value. The value is
clamped to either 0 or 1 because it is difficult (and probably not useful) to
sync to more than 1 redraw.

The main change is that if the swap interval is 0 then instead of installing a
frame callback it will just call the display sync method and throttle itself
to that. It needs to at least throttle itself to that so that it will still be
sure to receive buffer release events.

This also moves the vblank configuration defines from platform_x11.c to the
common egl_dri2.h header so they can be shared by both platforms.
---
 src/egl/drivers/dri2/egl_dri2.h |   7 ++
 src/egl/drivers/dri2/platform_wayland.c | 127 ++--
 src/egl/drivers/dri2/platform_x11.c |   6 --
 3 files changed, 126 insertions(+), 14 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index fba5f81..cc657ba 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -175,6 +175,7 @@ struct dri2_egl_surface
intdx;
intdy;
struct wl_callback*frame_callback;
+   struct wl_callback*throttle_callback;
int   format;
 #endif
 
@@ -221,6 +222,12 @@ struct dri2_egl_image
__DRIimage *dri_image;
 };
 
+/* From xmlpool/options.h, user exposed so should be stable */
+#define DRI_CONF_VBLANK_NEVER 0
+#define DRI_CONF_VBLANK_DEF_INTERVAL_0 1
+#define DRI_CONF_VBLANK_DEF_INTERVAL_1 2
+#define DRI_CONF_VBLANK_ALWAYS_SYNC 3
+
 /* standard typecasts */
 _EGL_DRIVER_STANDARD_TYPECASTS(dri2_egl)
 _EGL_DRIVER_TYPECAST(dri2_egl_image, _EGLImage, obj)
diff --git a/src/egl/drivers/dri2/platform_wayland.c 
b/src/egl/drivers/dri2/platform_wayland.c
index ffc5959..1cb92ac 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -180,8 +180,16 @@ dri2_create_window_surface(_EGLDriver *drv, _EGLDisplay 
*disp,
   _EGLConfig *conf, EGLNativeWindowType window,
   const EGLint *attrib_list)
 {
-   return dri2_create_surface(drv, disp, EGL_WINDOW_BIT, conf,
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
+   _EGLSurface *surf;
+
+   surf = dri2_create_surface(drv, disp, EGL_WINDOW_BIT, conf,
  window, attrib_list);
+
+   if (surf != NULL)
+  drv-API.SwapInterval(drv, disp, surf, dri2_dpy-default_swap_interval);
+
+   return surf;
 }
 
 /**
@@ -216,6 +224,8 @@ dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLSurface *surf)
 
if (dri2_surf-frame_callback)
   wl_callback_destroy(dri2_surf-frame_callback);
+   if (dri2_surf-throttle_callback)
+  wl_callback_destroy(dri2_surf-throttle_callback);
 
if (dri2_surf-base.Type == EGL_WINDOW_BIT) {
   dri2_surf-wl_win-private = NULL;
@@ -261,8 +271,20 @@ get_back_bo(struct dri2_egl_surface *dri2_surf, 
__DRIbuffer *buffer)
__DRIimage *image;
int i, name, pitch;
 
-   /* There might be a buffer release already queued that wasn't processed */
-   wl_display_dispatch_queue_pending(dri2_dpy-wl_dpy, dri2_dpy-wl_queue);
+   if (dri2_surf-throttle_callback == NULL) {
+  /* There might be a buffer release already queued that wasn't processed 
*/
+  wl_display_dispatch_queue_pending(dri2_dpy-wl_dpy, dri2_dpy-wl_queue);
+   } else {
+  /* If we aren't throttling to the frame callbacks then the compositor
+   * may have sent a release event after the last attach so we'll wait
+   * until the sync for the commit request completes in order to have the
+   * best chance of reusing a buffer */
+  do {
+ if (wl_display_dispatch_queue(dri2_dpy-wl_dpy,
+   dri2_dpy-wl_queue) == -1)
+return EGL_FALSE;
+  } while (dri2_surf-throttle_callback != NULL);
+   }
 
if (dri2_surf-back == NULL) {
   for (i = 0; i